Descriptor: "clinical text" / Journal: journal of biomedical informatics - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"clinical text"' showing total 12 results

1. Discontinuous named entities in clinical text: A systematic literature review

Author: Alhassan, Areej, Schlegel, Viktor, Aloud, Monira, Batista-Navarro, Riza, and Nenadic, Goran
Published: 2025
Full Text: View/download PDF

2. On the creation of a clinical gold standard corpus in Spanish: Mining adverse drug reactions.

Author: Oronoz, Maite, Gojenola, Koldo, Pérez, Alicia, de Ilarraza, Arantza Díaz, and Casillas, Arantza
Abstract: The advances achieved in Natural Language Processing make it possible to automatically mine information from electronically created documents. Many Natural Language Processing methods that extract information from texts make use of annotated corpora, but these are scarce in the clinical domain due to legal and ethical issues. In this paper we present the creation of the IxaMed-GS gold standard composed of real electronic health records written in Spanish and manually annotated by experts in pharmacology and pharmacovigilance. The experts mainly annotated entities related to diseases and drugs, but also relationships between entities indicating adverse drug reaction events. To help the experts in the annotation task, we adapted a general corpus linguistic analyzer to the medical domain. The quality of the annotation process in the IxaMed-GS corpus has been assessed by measuring the inter-annotator agreement, which was 90.53% for entities and 82.86% for events. In addition, the corpus has been used for the automatic extraction of adverse drug reaction events using machine learning. [ABSTRACT FROM AUTHOR]
Published: 2015
Full Text: View/download PDF

3. Using large clinical corpora for query expansion in text-based cohort identification.

Author: Zhu, Dongqing, Wu, Stephen, Carterette, Ben, and Liu, Hongfang
Abstract: Highlights: [•] Demonstrated utility of an in-domain collection (clinical text) for query expansion. [•] Analyzed effect of external collection size on a mixture of relevance models. [•] Any existing query expansion configuration can benefit from an in-domain collection. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

4. A controlled greedy supervised approach for co-reference resolution on clinical text.

Author: Chowdhury, Md. Faisal Mahbub and Zweigenbaum, Pierre
Abstract: Identification of co-referent entity mentions inside text has significant importance for other natural language processing (NLP) tasks (e.g. event linking). However, this task, known as co-reference resolution, remains a complex problem, partly because of the confusion over different evaluation metrics and partly because the well-researched existing methodologies do not perform well on new domains such as clinical records. This paper presents a variant of the influential mention-pair model for co-reference resolution. Using a series of linguistically and semantically motivated constraints, the proposed approach controls generation of less-informative/sub-optimal training and test instances. Additionally, the approach also introduces some aggressive greedy strategies in chain clustering. The proposed approach has been tested on the official test corpus of the recently held i2b2/VA 2011 challenge. It achieves an unweighted average F1 score of 0.895, calculated from multiple evaluation metrics (MUC, B³ and CEAF scores). These results are comparable to the best systems of the challenge. What makes our proposed system distinct is that it also achieves high average F1 scores for each individual chain type (Test: 0.897, Person: 0.852, Problem: 0.855, Treatment: 0.884). Unlike other works, it obtains good scores for each of the individual metrics rather than being biased towards a particular metric. [Copyright Elsevier]
Published: 2013
Full Text: View/download PDF

5. UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text.

Author: Demner-Fushman, Dina, Mork, James G., Shooshan, Sonya E., and Aronson, Alan R.
Abstract: Identification of medical terms in free text is a first step in such Natural Language Processing (NLP) tasks as automatic indexing of biomedical literature and extraction of patients’ problem lists from the text of clinical notes. Many tools developed to perform these tasks use biomedical knowledge encoded in the Unified Medical Language System (UMLS) Metathesaurus. We continue our exploration of automatic approaches to creation of subsets (UMLS content views) which can support NLP processing of either the biomedical literature or clinical text. We found that suppression of highly ambiguous terms in the conservative AutoFilter content view can partially replace manual filtering for literature applications, and suppression of two character mappings in the same content view achieves 89.5% precision at 78.6% recall for clinical applications. [Copyright Elsevier]
Published: 2010
Full Text: View/download PDF

6. Attention-based bidirectional long short-term memory networks for extracting temporal relationships from clinical discharge summaries

Author: Ghada Alfattni, Niels Peek, and Goran Nenadic
Subjects: Relation (database), Computer science, Health Informatics, Context (language use), Discharge summaries, computer.software_genre, NLP, Field (computer science), TLINKs, Humans, Language, Natural Language Processing, business.industry, Deep learning, Clinical text, Relationship extraction, Patient Discharge, Semantics, Computer Science Applications, Memory, Short-Term, Artificial intelligence, Source text, business, computer, Natural language processing, Word (computer architecture), Sentence
Abstract: Temporal relation extraction between health-related events is a widely studied task in clinical Natural Language Processing (NLP). The current state-of-the-art methods mostly rely on engineered features (i.e., rule-based modelling) and sequence modelling, which often encodes a source sentence into a single fixed-length context. An obvious disadvantage of this fixed-length context design is its incapability to model longer sentences, as important temporal information in the clinical text may appear at different positions. To address this issue, we propose an Attention-based Bidirectional Long Short-Term Memory (Att-BiLSTM) model to enable learning the important semantic information in long source text segments and to better determine which parts of the text are most important. We experimented with two embeddings and compared the performances to traditional state-of-the-art methods that require elaborate linguistic pre-processing and hand-engineered features. The experimental results on the i2b2 2012 temporal relation test corpus show that the proposed method achieves a significant improvement with an F-score of 0.811, which is at least 10% better than state-of-the-art in the field. We show that the model can be remarkably effective at classifying temporal relations when provided with word embeddings trained on corpora in a general domain. Finally, we perform an error analysis to gain insight into the common errors made by the model.
Published: 2021
Full Text: View/download PDF

7. Building a semantically annotated corpus of clinical texts.

Author: Roberts, Angus, Gaizauskas, Robert, Hepple, Mark, Demetriou, George, Guo, Yikun, Roberts, Ian, and Setzer, Andrea
Abstract: In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, whose value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains. [Copyright Elsevier]
Published: 2009
Full Text: View/download PDF

8. Attention-based bidirectional long short-term memory networks for extracting temporal relationships from clinical discharge summaries.

Author: Alfattni, Ghada, Peek, Niels, and Nenadic, Goran
Abstract: Temporal relation extraction between health-related events is a widely studied task in clinical Natural Language Processing (NLP). The current state-of-the-art methods mostly rely on engineered features (i.e., rule-based modelling) and sequence modelling, which often encodes a source sentence into a single fixed-length context. An obvious disadvantage of this fixed-length context design is its incapability to model longer sentences, as important temporal information in the clinical text may appear at different positions. To address this issue, we propose an Attention-based Bidirectional Long Short-Term Memory (Att-BiLSTM) model to enable learning the important semantic information in long source text segments and to better determine which parts of the text are most important. We experimented with two embeddings and compared the performances to traditional state-of-the-art methods that require elaborate linguistic pre-processing and hand-engineered features. The experimental results on the i2b2 2012 temporal relation test corpus show that the proposed method achieves a significant improvement with an F-score of 0.811, which is at least 10% better than state-of-the-art in the field. We show that the model can be remarkably effective at classifying temporal relations when provided with word embeddings trained on corpora in a general domain. Finally, we perform an error analysis to gain insight into the common errors made by the model. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

9. Extracting and classifying diagnosis dates from clinical notes: A case study.

Author: Fu, Julia T., Sholle, Evan, Krichevsky, Spencer, Scandura, Joseph, and Campion, Thomas R.
Abstract: Myeloproliferative neoplasms (MPNs) are chronic hematologic malignancies that may progress over long disease courses. The original date of diagnosis is an important piece of information for patient care and research, but is not consistently documented. We describe an attempt to build a pipeline for extracting dates with natural language processing (NLP) tools and techniques and classifying them as relevant diagnoses or not. Inaccurate and incomplete date extraction and interpretation impacted the performance of the overall pipeline. Existing lightweight Python packages tended to have low specificity for identifying and interpreting partial and relative dates in clinical text. A rules-based regular expression (regex) approach achieved recall of 83.0% on dates manually annotated as diagnosis dates, and 77.4% on all annotated dates. With only 3.8% of annotated dates representing initial MPN diagnoses, additional methods of targeting candidate date instances may alleviate noise and class imbalance. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

10. An enhanced CRFs-based system for information extraction from radiology reports

Author: Andrea Esuli, Diego Marcheggiani, and Fabrizio Sebastiani
Subjects: Conditional random field, medicine.medical_specialty, Information extraction, Computer science, Health Informatics, Conditional random fields, computer.software_genre, Machine learning, Domain (software engineering), Task (project management), Annotation, medicine, Data Mining, Computer Simulation, CRFS, business.industry, Supervised learning, Clinical text, Computer Science Applications, Feature (computer vision), Artificial intelligence, Radiology, business, computer, Natural language processing
Abstract: We discuss the problem of performing information extraction from free-text radiology reports via supervised learning. In this task, segments of text (not necessarily coinciding with entire sentences, and possibly crossing sentence boundaries) need to be annotated with tags representing concepts of interest in the radiological domain. In this paper we present two novel approaches to IE for radiology reports: (i) a cascaded, two-stage method based on pipelining two taggers generated via the well known linear-chain conditional random fields (LC-CRFs) learner and (ii) a confidence-weighted ensemble method that combines standard LC-CRFs and the proposed two-stage method. We also report on the use of “positional features”, a novel type of feature intended to aid in the automatic annotation of texts in which the instances of a given concept may be hypothesized to systematically occur in specific areas of the text. We present experiments on a dataset of mammography reports in which the proposed ensemble is shown to outperform a traditional, single-stage CRFs system in two different, applicatively interesting scenarios.
Published: 2012
Full Text: View/download PDF

11. Building a semantically annotated corpus of clinical texts

Author: Angus Roberts, Mark Hepple, Yikun Guo, Robert Gaizauskas, George Demetriou, Ian Roberts, and Andrea Setzer
Subjects: Scheme (programming language), Text corpus, Biomedical Research, Information extraction, Text mining, Computer science, Abstracting and Indexing, Information Storage and Retrieval, Guidelines as Topic, Health Informatics, Temporal annotation, computer.software_genre, Semantics, Medical Records, Annotation, User-Computer Interface, Text processing, Corpora, Component (UML), Neoplasms, Terminology as Topic, Humans, Evaluation, computer.programming_language, Internet, Information retrieval, Models, Statistical, Semantic annotation, business.industry, Natural language processing, Clinical text, Computer Science Applications, Artificial intelligence, business, computer, Gold standards, Annotation guidelines
Abstract: In this paper, we describe the construction of a semantically annotated corpus of clinical texts for use in the development and evaluation of systems for automatically extracting clinically significant information from the textual component of patient records. The paper details the sampling of textual material from a collection of 20,000 cancer patient records, the development of a semantic annotation scheme, the annotation methodology, the distribution of annotations in the final corpus, and the use of the corpus for development of an adaptive information extraction system. The resulting corpus is the most richly semantically annotated resource for clinical text processing built to date, whose value has been demonstrated through its use in developing an effective information extraction system. The detailed presentation of our corpus construction and annotation methodology will be of value to others seeking to build high-quality semantically annotated corpora in biomedical domains.
Full Text: View/download PDF

12. UMLS content views appropriate for NLP processing of the biomedical literature vs. clinical text

Author: Alan R. Aronson, Sonya E. Shooshan, James G. Mork, and Dina Demner-Fushman
Subjects: Biomedical knowledge, Metathesaurus, Computer science, Content views, Information Storage and Retrieval, Health Informatics, computer.software_genre, Article, UMLS, Text messaging, Natural Language Processing, Information retrieval, Recall, Character (computing), business.industry, Unified Medical Language System, Search engine indexing, Publications, Clinical text, Computer Science Applications, Identification (information), Automatic indexing, Indexing, Artificial intelligence, business, computer, Natural language processing
Abstract: Identification of medical terms in free text is a first step in such Natural Language Processing (NLP) tasks as automatic indexing of biomedical literature and extraction of patients’ problem lists from the text of clinical notes. Many tools developed to perform these tasks use biomedical knowledge encoded in the Unified Medical Language System (UMLS) Metathesaurus. We continue our exploration of automatic approaches to creation of subsets (UMLS content views) which can support NLP processing of either the biomedical literature or clinical text. We found that suppression of highly ambiguous terms in the conservative AutoFilter content view can partially replace manual filtering for literature applications, and suppression of two character mappings in the same content view achieves 89.5% precision at 78.6% recall for clinical applications.
Full Text: View/download PDF
