Author: "Schaefferkoetter N" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Schaefferkoetter N"' showing total 6 results

1. Deep learning uncertainty quantification for clinical text classification.

Author: Peluso A, Danciu I, Yoon HJ, Yusof JM, Bhattacharya T, Spannaus A, Schaefferkoetter N, Durbin EB, Wu XC, Stroup A, Doherty J, Schwartz S, Wiggins C, Coyle L, Penberthy L, Tourassi GD, and Gao S
Subjects: Humans; Uncertainty; Neural Networks, Computer; Algorithms; Machine Learning; Deep Learning
Abstract: Introduction: Machine learning algorithms are expected to work side-by-side with humans in decision-making pipelines. Thus, the ability of classifiers to make reliable decisions is of paramount importance. Deep neural networks (DNNs) represent the state-of-the-art models to address real-world classification. Although the strength of activation in DNNs is often correlated with the network's confidence, in-depth analyses are needed to establish whether they are well calibrated. Method: In this paper, we demonstrate the use of DNN-based classification tools to benefit cancer registries by automating information extraction of disease at diagnosis and at surgery from electronic text pathology reports from the US National Cancer Institute (NCI) Surveillance, Epidemiology, and End Results (SEER) population-based cancer registries. In particular, we introduce multiple methods for selective classification to achieve a target level of accuracy on multiple classification tasks while minimizing the rejection amount, that is, the number of electronic pathology reports for which the model's predictions are unreliable. We evaluate the proposed methods by comparing our approach with the current in-house deep learning-based abstaining classifier. Results: Overall, all the proposed selective classification methods effectively allow for achieving the targeted level of accuracy or higher in a trade-off analysis aimed to minimize the rejection rate. On in-distribution validation and holdout test data, with all the proposed methods, we achieve on all tasks the required target level of accuracy with a lower rejection rate than the deep abstaining classifier (DAC). Interpreting the results for the out-of-distribution test data is more complex; nevertheless, in this case as well, the rejection rate from the best among the proposed methods achieving 97% accuracy or higher is lower than the rejection rate based on the DAC. Conclusions: We show that although both approaches can flag those samples that should be manually reviewed and labeled by human annotators, the newly proposed methods retain a larger fraction and do so without retraining, thus offering a reduced computational cost compared with the in-house deep learning-based abstaining classifier. Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (Published by Elsevier Inc.)
Published: 2024
Full Text: View/download PDF
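
Note on result 1: the abstract describes selective classification as reaching a target accuracy while rejecting as few reports as possible. The abstract does not say how the proposed methods score confidence, so the following is only a minimal sketch of the general idea, assuming each prediction comes with a single confidence value (for example, a maximum softmax probability). The function name `pick_threshold` and its parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def pick_threshold(confidences: List[float], correct: List[bool],
                   target_accuracy: float) -> Optional[Tuple[float, float]]:
    """Return (threshold, rejection_rate) for the lowest confidence threshold
    whose retained predictions reach target_accuracy, or None if none does.

    Hypothetical helper: a prediction is retained when its confidence is
    greater than or equal to the threshold, and rejected otherwise.
    """
    # Rank predictions from most to least confident.
    ranked = sorted(zip(confidences, correct), key=lambda pair: -pair[0])
    best = None
    hits = 0
    for kept, (conf, ok) in enumerate(ranked, start=1):
        hits += ok
        # Evaluate only at the end of a run of tied confidences, so a
        # threshold keeps or rejects all tied predictions together.
        if kept < len(ranked) and ranked[kept][0] == conf:
            continue
        if hits / kept >= target_accuracy:
            # A lower threshold that still meets the target rejects fewer items.
            best = (conf, 1 - kept / len(ranked))
    return best
```

In practice the threshold would be chosen on validation data and then applied unchanged to new reports; rejected reports are the ones routed to manual review.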

2. Limitations of Transformers on Clinical Text Classification.

Author: Gao S, Alawad M, Young MT, Gounley J, Schaefferkoetter N, Yoon HJ, Wu XC, Durbin EB, Doherty J, Stroup A, Coyle L, and Tourassi G
Subjects: Humans; Natural Language Processing; Neural Networks, Computer
Abstract: Bidirectional Encoder Representations from Transformers (BERT) and BERT-based approaches are the current state-of-the-art in many natural language processing (NLP) tasks; however, their application to document classification on long clinical texts is limited. In this work, we introduce four methods to scale BERT, which by default can only handle input sequences up to approximately 400 words long, to perform document classification on clinical texts several thousand words long. We compare these methods against two much simpler architectures - a word-level convolutional neural network and a hierarchical self-attention network - and show that BERT often cannot beat these simpler baselines when classifying MIMIC-III discharge summaries and SEER cancer pathology reports. In our analysis, we show that two key components of BERT - pretraining and WordPiece tokenization - may actually be inhibiting BERT's performance on clinical text classification tasks where the input document is several thousand words long and where correctly identifying labels may depend more on identifying a few key words or phrases rather than understanding the contextual meaning of sequences of text.
Published: 2021
Full Text: View/download PDF

3. Deep active learning for classifying cancer pathology reports.

Author: De Angeli K, Gao S, Alawad M, Yoon HJ, Schaefferkoetter N, Wu XC, Durbin EB, Doherty J, Stroup A, Coyle L, Penberthy L, and Tourassi G
Subjects: Algorithms; Humans; Neural Networks, Computer; Machine Learning; Neoplasms/genetics; Neoplasms/pathology
Abstract: Background: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model. Results: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes. Conclusions: Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling.
Published: 2021
Full Text: View/download PDF
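
Note on result 3: the abstract names marginal and ratio uncertainty sampling without defining them. The sketch below uses the definitions commonly given for these strategies (top-two probability gap and top-two probability ratio), which is an assumption about what the paper means. All function names are hypothetical, and the class probabilities would come from whatever classifier is being trained.

```python
from typing import Callable, List


def margin_score(probs: List[float]) -> float:
    """Uncertainty as 1 - (top probability - runner-up probability)."""
    top, second = sorted(probs, reverse=True)[:2]
    return 1.0 - (top - second)


def ratio_score(probs: List[float]) -> float:
    """Uncertainty as runner-up probability divided by top probability."""
    top, second = sorted(probs, reverse=True)[:2]
    return second / top


def select_for_labeling(pool: List[List[float]],
                        score: Callable[[List[float]], float],
                        k: int) -> List[int]:
    """Return indices of the k unlabelled items with the highest uncertainty."""
    order = sorted(range(len(pool)), key=lambda i: score(pool[i]), reverse=True)
    return order[:k]


# Predicted class probabilities for three unlabelled reports (made-up values).
pool = [[0.90, 0.05, 0.05], [0.40, 0.35, 0.25], [0.50, 0.10, 0.40]]
chosen = select_for_labeling(pool, margin_score, k=2)
```

Both scores grow as the two most likely classes get closer, so the selected reports are the ones where the model is least able to separate its top two guesses.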

4. Using case-level context to classify cancer pathology reports.

Author: Gao S, Alawad M, Schaefferkoetter N, Penberthy L, Wu XC, Durbin EB, Coyle L, Ramanathan A, and Tourassi G
Subjects: Histological Techniques; Humans; Natural Language Processing; SEER Program; Electronic Health Records/classification; Neoplasms/pathology
Abstract: Individual electronic health records (EHRs) and clinical reports are often part of a larger sequence; for example, a single patient may generate multiple reports over the trajectory of a disease. In applications such as cancer pathology reports, it is necessary not only to extract information from individual reports, but also to capture aggregate information regarding the entire cancer case based on case-level context from all reports in the sequence. In this paper, we introduce a simple modular add-on for capturing case-level context that is designed to be compatible with most existing deep learning architectures for text classification on individual reports. We test our approach on a corpus of 431,433 cancer pathology reports, and we show that incorporating case-level context significantly boosts classification accuracy across six classification tasks: site, subsite, laterality, histology, behavior, and grade. We expect that with minimal modifications, our add-on can be applied towards a wide range of other clinical text-based tasks. Competing Interests: Author LC is employed by the commercial company Information Management Services Inc (IMS). This does not alter our adherence to PLOS ONE policies on sharing data and materials.
Published: 2020
Full Text: View/download PDF

5. Classifying cancer pathology reports with hierarchical self-attention networks.

Author: Gao S, Qiu JX, Alawad M, Hinkle JD, Schaefferkoetter N, Yoon HJ, Christian B, Fearn PA, Penberthy L, Wu XC, Coyle L, Tourassi G, and Ramanathan A
Subjects: Deep Learning; Humans; Natural Language Processing; Neoplasms/classification; Neural Networks, Computer; Neoplasms/pathology
Abstract: We introduce a deep learning architecture, hierarchical self-attention networks (HiSANs), designed for classifying pathology reports and show how its unique architecture leads to a new state-of-the-art in accuracy, faster training, and clear interpretability. We evaluate performance on a corpus of 374,899 pathology reports obtained from the National Cancer Institute's (NCI) Surveillance, Epidemiology, and End Results (SEER) program. Each pathology report is associated with five clinical classification tasks - site, laterality, behavior, histology, and grade. We compare the performance of the HiSAN against other machine learning and deep learning approaches commonly used on medical text data - Naive Bayes, logistic regression, convolutional neural networks, and hierarchical attention networks (the previous state-of-the-art). We show that HiSANs are superior to other machine learning and deep learning text classifiers in both accuracy and macro F-score across all five classification tasks. Compared to the previous state-of-the-art, hierarchical attention networks, HiSANs not only are an order of magnitude faster to train, but also achieve about 1% better relative accuracy and 5% better relative macro F-score. (Copyright © 2019 The Authors. Published by Elsevier B.V. All rights reserved.)
Published: 2019
Full Text: View/download PDF

6. Deep Transfer Learning Across Cancer Registries for Information Extraction from Pathology Reports.

Author: Alawad M, Gao S, Qiu J, Schaefferkoetter N, Hinkle JD, Yoon HJ, Christian JB, Wu XC, Durbin EB, Jeong JC, Hands I, Rust D, and Tourassi G
Abstract: Automated text information extraction from cancer pathology reports is an active area of research to support national cancer surveillance. A well-known challenge is how to develop information extraction tools with robust performance across cancer registries. In this study we investigated whether transfer learning (TL) with a convolutional neural network (CNN) can facilitate cross-registry knowledge sharing. Specifically, we performed a series of experiments to determine whether a CNN trained with single-registry data is capable of transferring knowledge to another registry or whether developing a cross-registry knowledge database produces a more effective and generalizable model. Using data from two cancer registries and primary tumor site and topography as the information extraction task of interest, our study showed that TL results in 6.90% and 17.22% improvement of classification macro F-score over the baseline single-registry models. Detailed analysis illustrated that the observed improvement is evident in the low prevalence classes.
Published: 2019
Full Text: View/download PDF
