1. Detecting abnormal electroencephalograms using deep convolutional networks
- Author
- Aaron F. Struck, M.J.A.M. van Putten, M.B. Westover, Haoqi Sun, K.G. van Leeuwen, Mohammad Tabaeizadeh, and Clinical Neurophysiology
- Subjects
Adult; Male; medicine.medical_specialty; Adolescent; Databases, Factual; Electroencephalography; Audiology; Clinical neurophysiology; Convolutional neural network; Article; 050105 experimental psychology; Machine Learning; Young Adult; 03 medical and health sciences; 0302 clinical medicine; Physiology (medical); medicine; Humans; 0501 psychology and cognitive sciences; Convolutional neural networks (CNN); Set (psychology); Electroencephalograms (EEG); Retrospective Studies; Sleep Stages; Epilepsy; medicine.diagnostic_test; Receiver operating characteristic; business.industry; Deep learning; 05 social sciences; Middle Aged; 22/4 OA procedure; Computer aided diagnosis (CAD); Sensory Systems; Neurology; Test set; Female; Neural Networks, Computer; Neurology (clinical); Artificial intelligence; Psychology; business; 030217 neurology & neurosurgery
- Abstract
Objectives: Electroencephalography (EEG) is a central part of the medical evaluation for patients with neurological disorders. Training an algorithm to label an EEG as normal or abnormal is challenging because of EEG heterogeneity and dependence on contextual factors, including age and sleep stage. Our objectives were to validate, on an independent data set, prior work suggesting that deep learning methods can discriminate between normal and abnormal EEGs; to understand whether age and sleep stage information can improve discrimination; and to understand what factors lead to errors.
Methods: We train a deep convolutional neural network on a heterogeneous set of 8522 routine EEGs from the Massachusetts General Hospital. We explore several strategies for optimizing model performance, including accounting for age and sleep stage.
Results: The area under the receiver operating characteristic curve (AUC) on an independent test set (n = 851) is 0.917. Including age (AUC = 0.924), or both age and sleep stage (AUC = 0.925), improves the AUC marginally, but the improvement is not statistically significant.
Conclusions: The model architecture generalizes well to an independent dataset. Adding age and sleep stage to the model does not significantly improve performance.
Significance: Insights learned from misclassified examples, and the minimal improvement from adding sleep stage and age, suggest fruitful directions for further research.
- Published
- 2019