13 results on '"Sumithra, Velupillai"'
Search Results
2. Development of a Corpus Annotated With Mentions of Pain in Mental Health Records: Natural Language Processing Approach
- Author
-
Jaya Chaturvedi, Natalia Chance, Luwaiza Mirza, Veshalee Vernugopan, Sumithra Velupillai, Robert Stewart, and Angus Roberts
- Subjects
Medicine
- Abstract
Background: Pain is a widespread issue, with 20% of adults (1 in 5) experiencing it globally. A strong association has been demonstrated between pain and mental health conditions, and this association is known to exacerbate disability and impairment. Pain is also known to be strongly related to emotions, which can lead to damaging consequences. As pain is a common reason for people to access health care facilities, electronic health records (EHRs) are a potential source of information on this pain. Mental health EHRs could be particularly beneficial since they can show the overlap of pain with mental health. Most mental health EHRs contain the majority of their information within the free-text sections of the records. However, it is challenging to extract information from free text. Natural language processing (NLP) methods are therefore required to extract this information from the text. Objective: This research describes the development of a corpus of manually labeled mentions of pain and pain-related entities from the documents of a mental health EHR database, for use in the development and evaluation of future NLP methods. Methods: The EHR database used, Clinical Record Interactive Search, consists of anonymized patient records from The South London and Maudsley National Health Service Foundation Trust in the United Kingdom. The corpus was developed through a process of manual annotation where pain mentions were marked as relevant (ie, referring to physical pain afflicting the patient), negated (ie, indicating absence of pain), or not relevant (ie, referring to pain affecting someone other than the patient, or metaphorical and hypothetical mentions). Relevant mentions were also annotated with additional attributes such as anatomical location affected by pain, pain character, and pain management measures, if mentioned. Results: A total of 5644 annotations were collected from 1985 documents (723 patients). Over 70% (n=4028) of the mentions found within the documents were annotated as relevant, and about half of these mentions also included the anatomical location affected by the pain. The most common pain character was chronic pain, and the most commonly mentioned anatomical location was the chest. Most annotations (n=1857, 33%) were from patients who had a primary diagnosis of mood disorders (International Classification of Diseases, 10th edition, chapter F30-39). Conclusions: This research has helped better understand how pain is mentioned within the context of mental health EHRs and provided insight into the kind of information that is typically mentioned around pain in such a data source. In future work, the extracted information will be used to develop and evaluate a machine learning–based NLP application to automatically extract relevant pain information from EHR databases.
- Published
- 2023
- Full Text
- View/download PDF
3. Development of a Knowledge Graph Embeddings Model for Pain.
- Author
-
Jaya Chaturvedi, Tao Wang, Sumithra Velupillai, Robert Stewart, and Angus Roberts
- Published
- 2023
- Full Text
- View/download PDF
4. Sample Size in Natural Language Processing within Healthcare Research.
- Author
-
Jaya Chaturvedi, Diana Shamsutdinova, Felix Zimmer, Sumithra Velupillai, Daniel Stahl, Robert Stewart, and Angus Roberts
- Published
- 2023
- Full Text
- View/download PDF
5. Autism spectrum disorders as a risk factor for adolescent self-harm: a retrospective cohort study of 113,286 young people in the UK
- Author
-
Emily Widnall, Sophie Epstein, Catherine Polling, Sumithra Velupillai, Amelia Jewell, Rina Dutta, Emily Simonoff, Robert Stewart, Ruth Gilbert, Tamsin Ford, Matthew Hotopf, Richard D. Hayes, and Johnny Downs
- Subjects
Child and adolescent mental health, Epidemiology, Autism spectrum disorders, Education, Data linkage, Medicine
- Abstract
Background: Individuals with autism spectrum disorder (ASD) are at particularly high risk of suicide and suicide attempts. Presentation to a hospital with self-harm is one of the strongest risk factors for later suicide. We describe the use of a novel data linkage between routinely collected education data and child and adolescent mental health data to examine whether adolescents with ASD are at higher risk than the general population of presenting to emergency care with self-harm. Methods: A retrospective cohort study was conducted on the population aged 11–17 resident in four South London boroughs between January 2009 and March 2013, attending state secondary schools, identified in the National Pupil Database (NPD). Exposure data on ASD status were derived from the NPD. We used Cox regression to model time to first self-harm presentation to the Emergency Department (ED). Results: One thousand twenty adolescents presented to the ED with self-harm, and 763 matched to the NPD. The sample for analysis included 113,286 adolescents (2.2% with ASD). For boys only, there was an increased risk of self-harm associated with ASD (adjusted hazard ratio 2·79, 95% CI 1·40–5·57, P
- Published
- 2022
- Full Text
- View/download PDF
6. Assessing machine learning for fair prediction of ADHD in school pupils using a retrospective cohort study of linked education and healthcare data
- Author
-
Johnny Downs, Robert Stewart, Alice Wickersham, Sumithra Velupillai, Lucile Ter-Minassian, Natalia Viani, and Lauren Cross
- Subjects
Medicine
- Abstract
Objectives: Attention deficit hyperactivity disorder (ADHD) is a prevalent childhood disorder, but often goes unrecognised and untreated. To improve access to services, accurate predictions of populations at high risk of ADHD are needed for effective resource allocation. Using a unique linked health and education data resource, we examined how machine learning (ML) approaches can predict risk of ADHD. Design: Retrospective population cohort study. Setting: South London (2007–2013). Participants: n=56 258 pupils with linked education and health data. Primary outcome measures: Using area under the curve (AUC), we compared the predictive accuracy of four ML models and one neural network for ADHD diagnosis. Ethnic group and language biases were weighted using a fair pre-processing algorithm. Results: Random forest and logistic regression prediction models provided the highest predictive accuracy for ADHD in population samples (AUC 0.86 and 0.86, respectively) and clinical samples (AUC 0.72 and 0.70). Precision-recall curve analyses were less favourable. Sociodemographic biases were effectively reduced by a fair pre-processing algorithm without loss of accuracy. Conclusions: ML approaches using linked routinely collected education and health data offer accurate, low-cost and scalable prediction models of ADHD. These approaches could help identify areas of need and inform resource allocation. Introducing ‘fairness weighting’ attenuates some sociodemographic biases which would otherwise underestimate ADHD risk within minority groups.
- Published
- 2022
- Full Text
- View/download PDF
7. Portability of natural language processing methods to detect suicidality from clinical text in US and UK electronic health records
- Author
-
Marika Cusick, Sumithra Velupillai, Johnny Downs, Thomas R. Campion, Jr., Evan T. Sholle, Rina Dutta, and Jyotishman Pathak
- Subjects
Suicide, Natural language processing, Electronic health records, Portability, Mental healing, RZ400-408
- Abstract
Background: In the global effort to prevent death by suicide, many academic medical institutions are implementing natural language processing (NLP) approaches to detect suicidality from unstructured clinical text in electronic health records (EHRs), with the hope of targeting timely, preventative interventions to individuals most at risk of suicide. Despite the international need, the development of these NLP approaches in EHRs has been largely local and not shared across healthcare systems. Methods: In this study, we developed a process to share NLP approaches that were individually developed at King's College London (KCL), UK and Weill Cornell Medicine (WCM), US - two academic medical centers based in different countries with vastly different healthcare systems. We tested and compared the algorithms’ performance on manually annotated clinical notes (KCL: n = 4,911 and WCM: n = 837). Results: After a successful technical porting of the NLP approaches, our quantitative evaluation determined that independently developed NLP approaches can detect suicidality at another healthcare organization with a different EHR system, clinical documentation processes, and culture, yet do not achieve the same level of success as at the institution where the NLP algorithm was developed (KCL approach: F1-score 0.85 vs. 0.68, WCM approach: F1-score 0.87 vs. 0.72). Limitations: Independent NLP algorithm development and patient cohort selection at the two institutions compromised direct comparability. Conclusions: Shared use of these NLP approaches is a critical step forward towards improving data-driven algorithms for early suicide risk identification and timely prevention.
- Published
- 2022
- Full Text
- View/download PDF
8. Evaluating physical urban features in several mental illnesses using electronic health record data
- Author
-
Zahra Mahabadi, Maryam Mahabadi, Sumithra Velupillai, Angus Roberts, Philip McGuire, Zina Ibrahim, and Rashmi Patel
- Subjects
geospatial informatics, schizophrenia, bipolar disorder, psychosis, machine learning, EHR (electronic health record), Medicine, Public aspects of medicine, RA1-1270, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Objectives: Understanding the potential impact of physical characteristics of the urban environment on clinical outcomes on several mental illnesses. Materials and Methods: Physical features of the urban environment were examined as predictors for affective and non-affective several mental illnesses (SMI), the number and length of psychiatric hospital admissions, and the number of short and long-acting injectable antipsychotic prescriptions. In addition, the urban features with the greatest weight in the predicted model were determined. The data included 28 urban features and 6 clinical variables obtained from 30,210 people with SMI receiving care from the South London and Maudsley NHS Foundation Trust (SLaM) using the Clinical Record Interactive Search (CRIS) tool. Five machine learning regression models were evaluated for the highest prediction accuracy followed by the Self-Organising Map (SOM) to represent the results visually. Results: The prevalence of SMI, number and duration of psychiatric hospital admission, and antipsychotic prescribing were greater in urban areas. However, machine learning analysis was unable to accurately predict clinical outcomes using urban environmental data. Discussion: The urban environment is associated with an increased prevalence of SMI. However, urban features alone cannot explain the variation observed in psychotic disorder prevalence or clinical outcomes measured through psychiatric hospitalisation or exposure to antipsychotic treatments. Conclusion: Urban areas are associated with a greater prevalence of SMI but clinical outcomes are likely to depend on a combination of urban and individual patient-level factors. Future mental healthcare service planning should focus on providing appropriate resources to people with SMI in urban environments.
- Published
- 2022
- Full Text
- View/download PDF
9. Can natural language processing models extract and classify instances of interpersonal violence in mental healthcare electronic records: an applied evaluative study
- Author
-
Robert Stewart, Angus Roberts, Vishal Bhavsar, Sumithra Velupillai, Aurelie Mascio, Riley Botelle, Giouliana Kadra-Scalzo, and Marcus V Williams
- Subjects
Medicine
- Abstract
Objective: This paper evaluates the application of a natural language processing (NLP) model for extracting clinical text referring to interpersonal violence using electronic health records (EHRs) from a large mental healthcare provider. Design: A multidisciplinary team iteratively developed guidelines for annotating clinical text referring to violence. Keywords were used to generate a dataset which was annotated (ie, classified as affirmed, negated or irrelevant) for: presence of violence, patient status (ie, as perpetrator, witness and/or victim of violence) and violence type (domestic, physical and/or sexual). An NLP approach using a pretrained transformer model, BioBERT (Bidirectional Encoder Representations from Transformers for Biomedical Text Mining) was fine-tuned on the annotated dataset and evaluated using 10-fold cross-validation. Setting: We used the Clinical Records Interactive Search (CRIS) database, comprising over 500 000 de-identified EHRs of patients within the South London and Maudsley NHS Foundation Trust, a specialist mental healthcare provider serving an urban catchment area. Participants: Searches of CRIS were carried out based on 17 predefined keywords. Randomly selected text fragments were taken from the results for each keyword, amounting to 3771 text fragments from the records of 2832 patients. Outcome measures: We estimated precision, recall and F1 score for each NLP model. We examined sociodemographic and clinical variables in patients giving rise to the text data, and frequencies for each annotated violence characteristic. Results: Binary classification models were developed for six labels (violence presence, perpetrator, victim, domestic, physical and sexual). Among annotations affirmed for the presence of any violence, 78% (1724) referred to physical violence, 61% (1350) referred to patients as perpetrator and 33% (731) to domestic violence. NLP models’ precision ranged from 89% (perpetrator) to 98% (sexual); recall ranged from 89% (victim, perpetrator) to 97% (sexual). Conclusions: State of the art NLP models can extract and classify clinical text on violence from EHRs at acceptable levels of scale, efficiency and accuracy.
- Published
- 2022
- Full Text
- View/download PDF
10. Development of a Corpus Annotated with Mentions of Pain in Mental Health Records (Preprint)
- Author
-
Jaya Chaturvedi, Natalia Chance, Luwaiza Mirza, Veshalee Vernugopan, Sumithra Velupillai, Robert Stewart, and Angus Roberts
- Abstract
Pain is a widespread issue, with 20% of adults suffering globally. A strong association has been demonstrated between pain and mental health conditions, and this association is known to exacerbate disability and impairment. Pain is also known to be strongly related to emotions, which can lead to damaging consequences. As pain is a common reason for people to access healthcare facilities, electronic health records (EHRs) are a potential source of information on this pain. Mental health EHRs could be particularly beneficial since they can show the overlap of pain with mental health. Most mental health EHRs contain the majority of their information within the free-text sections of the records. However, it is challenging to extract information from free text. Natural language processing (NLP) methods are therefore required to extract this information from the text. This research describes the development of a corpus of manually labelled mentions of pain and pain-related entities from the documents of a mental health EHR database, for use in the development and evaluation of future NLP methods. The EHR database used, CRIS (Clinical Record Interactive Search), consists of anonymised patient records from The South London and Maudsley (SLaM) NHS Foundation Trust in the UK. The corpus was developed through a process of manual annotation where pain mentions were marked as relevant (i.e., referring to physical pain afflicting the patient), negated (i.e., indicating absence of pain) or not relevant (i.e., referring to pain affecting someone other than the patient, or metaphorical and hypothetical mentions). Relevant mentions were also annotated with additional attributes such as anatomical location affected by pain, pain character, and pain management measures, if mentioned. Over 70% of the mentions found within the documents were annotated as relevant, and about half of these mentions also included the anatomical location affected by the pain. In future work, the extracted information will be used to develop and evaluate a machine learning based NLP application to automatically extract relevant pain information from EHR databases.
- Published
- 2023
- Full Text
- View/download PDF
11. Using natural language processing to extract self-harm and suicidality data from a clinical sample of patients with eating disorders: a retrospective cohort study
- Author
-
Charlotte Cliffe, Aida Seyedsalehi, Katerina Vardavoulia, André Bittar, Sumithra Velupillai, Hitesh Shetty, Ulrike Schmidt, and Rina Dutta
- Subjects
suicide & self-harm, Reproducibility of Results, General Medicine, eating disorders, State Medicine, Suicidal Ideation, Feeding and Eating Disorders, Suicide, Mental Health, Humans, epidemiology, biotechnology & bioinformatics, Self-Injurious Behavior, Natural Language Processing, Retrospective Studies
- Abstract
Objectives: The objective of this study was to determine risk factors for those diagnosed with eating disorders who report self-harm and suicidality. Design and setting: This study was a retrospective cohort study within a secondary mental health service, South London and Maudsley National Health Service Trust. Participants: All diagnosed with an F50 diagnosis of eating disorder from January 2009 to September 2019 were included. Intervention and measures: Electronic health records (EHRs) for these patients were extracted and two natural language processing tools were used to determine documentation of self-harm and suicidality in their clinical notes. These tools were validated manually for attribute agreement scores within this study. Results: The attribute agreements for precision of positive mentions of self-harm were 0.96 and for suicidality were 0.80; this demonstrates a ‘near perfect’ and ‘strong’ agreement and highlights the reliability of the tools in identifying the EHRs reporting self-harm or suicidality. There were 7434 patients with EHRs available and diagnosed with eating disorders included in the study from the dates January 2007 to September 2019. Of these, 4591 (61.8%) had a mention of self-harm within their records and 4764 (64.0%) had a mention of suicidality; 3899 (52.4%) had mentions of both. Patients reporting either self-harm or suicidality were more likely to have a diagnosis of anorexia nervosa (AN) (self-harm, AN OR=3.44, 95% CI 1.05 to 11.3, p=0.04; suicidality, AN OR=8.20, 95% CI 2.17 to 30.1; p=0.002). They were also more likely to have a diagnosis of borderline personality disorder (p≤0.001), bipolar disorder (p [...] Conclusion: A high percentage of patients (>60%) diagnosed with eating disorders report either self-harm or suicidal thoughts. Relative to other eating disorders, those diagnosed with AN were more likely to report either self-harm or suicidal thoughts. Psychiatric comorbidity, in particular borderline personality disorder and substance misuse, was also associated with an increased risk of self-harm and suicidality. Therefore, risk assessment among patients diagnosed with eating disorders is crucial.
- Published
- 2021
12. Assessing machine learning for fair prediction of ADHD in school pupils using a retrospective cohort study of linked education and healthcare data
- Author
-
Lucile Ter-Minassian, Natalia Viani, Alice Wickersham (ORCID 0000-0002-7402-7690), Lauren Cross, Robert Stewart, Sumithra Velupillai, and Johnny Downs (ORCID 0000-0002-8061-295X); Apollo - University of Cambridge Repository
- Subjects
Cohort Studies, Machine Learning, Schools, MENTAL HEALTH, Child & adolescent psychiatry, Attention Deficit Disorder with Hyperactivity, Humans, EPIDEMIOLOGY, General Medicine, Child, Delivery of Health Care, Retrospective Studies
- Abstract
Objectives: Attention deficit hyperactivity disorder (ADHD) is a prevalent childhood disorder, but often goes unrecognised and untreated. To improve access to services, accurate predictions of populations at high risk of ADHD are needed for effective resource allocation. Using a unique linked health and education data resource, we examined how machine learning (ML) approaches can predict risk of ADHD. Design: Retrospective population cohort study. Setting: South London (2007–2013). Participants: n=56 258 pupils with linked education and health data. Primary outcome measures: Using area under the curve (AUC), we compared the predictive accuracy of four ML models and one neural network for ADHD diagnosis. Ethnic group and language biases were weighted using a fair pre-processing algorithm. Results: Random forest and logistic regression prediction models provided the highest predictive accuracy for ADHD in population samples (AUC 0.86 and 0.86, respectively) and clinical samples (AUC 0.72 and 0.70). Precision-recall curve analyses were less favourable. Sociodemographic biases were effectively reduced by a fair pre-processing algorithm without loss of accuracy. Conclusions: ML approaches using linked routinely collected education and health data offer accurate, low-cost and scalable prediction models of ADHD. These approaches could help identify areas of need and inform resource allocation. Introducing ‘fairness weighting’ attenuates some sociodemographic biases which would otherwise underestimate ADHD risk within minority groups.
- Published
- 2022
- Full Text
- View/download PDF
13. Quoted text in the mental healthcare electronic record: an analysis of the distribution and content of single-word quotations
- Author
-
Lasantha Jayasinghe, Sumithra Velupillai, and Robert Stewart
- Subjects
Health Informatics, General Medicine, mental health, psychiatry
- Abstract
Objective: To investigate the distribution and content of quoted text within the electronic health records (EHRs) using a previously developed natural language processing tool to generate a database of quotations. Design: χ2 and logistic regression were used to assess the profile of patients receiving mental healthcare for whom quotations exist. K-means clustering using pre-trained word embeddings developed on general discharge summaries and psychosis specific mental health records were used to group one-word quotations into semantically similar groups and labelled by human subjective judgement. Setting: EHRs from a large mental healthcare provider serving a geographic catchment area of 1.3 million residents in South London. Participants: For analysis of distribution, 33 499 individuals receiving mental healthcare on 30 June 2019 in South London and Maudsley. For analysis of content, 1587 unique lemmatised words, appearing a minimum of 20 times on the database of quotations created on 16 January 2020. Results: The strongest individual indicator of quoted text is inpatient care in the preceding 12 months (OR 9.79, 95% CI 7.84 to 12.23). Next highest indicator is ethnicity with those with a black background more likely to have quoted text in comparison to white background (OR 2.20, 95% CI 2.08 to 2.33). Both are attenuated slightly in the adjusted model. Early psychosis intervention word embeddings subjectively produced categories pertaining to: mental illness, verbs, negative sentiment, people/relationships, mixed sentiment, aggression/violence and negative connotation. Conclusions: The findings that inpatients and those from a black ethnic background more commonly have quoted text raise important questions around where clinical attention is focused and whether this may point to any systematic bias. Our study also shows that word embeddings trained on early psychosis intervention records are useful in categorising even small subsets of the clinical records represented by one-word quotations.
- Published
- 2021
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library