Evaluating resampling methods and structured features to improve fall incident report identification by the severity level
- Source :
- J Am Med Inform Assoc
- Publication Year :
- 2021
- Publisher :
- Oxford University Press (OUP), 2021.
Abstract
- Objective: This study aims to improve the classification of fall incident severity levels by addressing data imbalance and incorporating structured features through machine learning. Materials and Methods: We present an incident report classification (IRC) framework that classifies the severity level of in-hospital fall incidents by addressing the imbalanced class problem and incorporating structured attributes. After text preprocessing, bag-of-words features, structured text features, and structured clinical features were extracted from the reports. Resampling techniques were then incorporated into the training process, and machine learning algorithms were used to build classification models. IRC systems were trained, validated, and tested using repeated, randomly stratified shuffle-split cross-validation. Finally, we evaluated system performance using the F1-measure, precision, and recall over 15 stratified test sets. Results: The system setting that considered both data imbalance and structured features outperformed the other settings (mean macro-averaged F1-measure of 0.733). With structured features and resampling techniques, this setting significantly improved the mean F1-measure for the rare class by 30.88% (P value …). Conclusions: Structured features provide essential information for categorizing the fall incident severity level. Resampling methods help rebalance the class distribution of the original incident report data, which improves the performance of machine learning models. The IRC framework presented in this study effectively automates the identification of fall incident reports by severity level.
- Subjects :
- Computer science
Health Informatics
02 engineering and technology
Research and Applications
Machine learning
03 medical and health sciences
0302 clinical medicine
Resampling
0202 electrical engineering, electronic engineering, information engineering
Preprocessor
030212 general & internal medicine
p-value
Risk Management
Class (biology)
Random forest
Identification (information)
Structured text
government
020201 artificial intelligence & image processing
Artificial intelligence
business
computer
Algorithms
Incident report
Details
- ISSN :
- 1527-974X
- Volume :
- 28
- Database :
- OpenAIRE
- Journal :
- Journal of the American Medical Informatics Association
- Accession number :
- edsair.doi.dedup.....b036018ff332ea08c3c7c6362407f52e
- Full Text :
- https://doi.org/10.1093/jamia/ocab048
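
Illustrative sketch
The abstract refers to resampling the training data and to scoring with the macro-averaged F1-measure. The snippet below is a minimal sketch of those two ideas only, not the authors' implementation: the abstract does not say which resampling methods were used, so plain random oversampling is assumed here, and the function names and the "minor"/"severe" labels are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_x, out_y = list(samples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced training split: 8 common-class reports, 2 rare-class reports.
train_x = [f"report {i}" for i in range(10)]
train_y = ["minor"] * 8 + ["severe"] * 2
balanced_x, balanced_y = random_oversample(train_x, train_y)
balanced_counts = Counter(balanced_y)

# Toy predictions on a held-out split, used only to exercise the metric.
example_score = macro_f1(["minor", "minor", "severe", "severe"],
                         ["minor", "minor", "minor", "severe"])
```

In a setup like this, resampling would be applied to the training split only, so that the stratified test sets keep the original class distribution; macro-averaging weights the rare class equally with the common one, which is why it is sensitive to rare-class performance.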