Using Random Forest Models to Identify Correlates of a Diabetic Peripheral Neuropathy Diagnosis from Electronic Health Record Data.
- Source :
- Pain Medicine; Jan2017, Vol. 18 Issue 1, p107-115, 9p, 2 Charts, 4 Graphs
- Publication Year :
- 2017
- Abstract :
- Objective. To identify variables correlated with a diagnosis of diabetic peripheral neuropathy (DPN) using random forest modeling applied to electronic health records. Design. Retrospective analysis. Setting. Humedica de-identified electronic health records database. Subjects. Subjects ≥18 years old with type 2 diabetes from January 1, 2008-September 30, 2013 having continuous data for 1 year pre- and post-index with DPN (n = 35,050) and without DPN (n = 288,328) were identified. Methods. Demographic, clinical, and health care resource utilization variables (e.g., inpatient and outpatient encounters, medications, and procedures) were input into a random forest model to identify the most important correlates of a DPN diagnosis. Random forest modeling is a computationally extensive, robust data mining technique that accommodates large sets of variables to identify associated factors using an ensemble of classification trees. Accuracy of the model was evaluated using receiver operating characteristic (ROC) curves. Results. The final random forest model consisted of the following variables (importance) associated with a DPN diagnosis: Charlson Comorbidity Index score (100%), age (37.1%), number of pre-index procedures and services (29.7%), number of pre-index outpatient prescriptions (24.2%), number of pre-index outpatient visits (18.3%), number of pre-index laboratory visits (16.9%), number of pre-index outpatient office visits (12.1%), number of inpatient prescriptions (5.9%), and number of pain-related medication prescriptions (4.4%). ROC analysis confirmed model performance, with an area under the curve of 0.824 and accuracy of 89.6% (95% confidence interval 89.4%, 89.8%). Conclusions. Random forest modeling can determine likelihood of a DPN diagnosis. Further validation of the random forest model may help facilitate earlier diagnosis and enhance management strategies. [ABSTRACT FROM AUTHOR]
- Subjects :
- DIAGNOSIS of diabetic neuropathies
DIABETIC neuropathies
CONFIDENCE intervals
DATABASES
DECISION trees
PEOPLE with diabetes
ETHNIC groups
INCOME
MEDICAL information storage & retrieval systems
MEDICAL care use
MEDICAL prescriptions
TYPE 2 diabetes
NOSOLOGY
RESEARCH evaluation
RESEARCH funding
DATA mining
COMORBIDITY
BODY mass index
RETROSPECTIVE studies
SEVERITY of illness index
RECEIVER operating characteristic curves
ELECTRONIC health records
DISEASE risk factors
- Language :
- English
- ISSN :
- 15262375
- Volume :
- 18
- Issue :
- 1
- Database :
- Complementary Index
- Journal :
- Pain Medicine
- Publication Type :
- Academic Journal
- Accession number :
- 121261599
- Full Text :
- https://doi.org/10.1093/pm/pnw096