Using machine learning to extract information and predict outcomes from reports of randomised trials of smoking cessation interventions in the Human Behaviour-Change Project [version 2; peer review: 2 approved, 1 approved with reservations]

Authors :
Pol Mac Aonghusa
Alison J. Wright
Robert West
Janna Hastings
Yufang Hou
Alison O'Mara-Eves
Francesca Bonin
Martin Gleize
Susan Michie
Marie Johnston
James Thomas
Source :
Wellcome Open Research, Vol 8 (2024)
Publication Year :
2024
Publisher :
Wellcome, 2024.

Abstract

Background: Using reports of randomised trials of smoking cessation interventions as a test case, this study aimed to develop and evaluate machine learning (ML) algorithms for extracting information from study reports and predicting outcomes as part of the Human Behaviour-Change Project. It is the first of two linked papers, with the second paper reporting on further development of a prediction system.

Methods: Researchers manually annotated 70 items of information (‘entities’) in 512 reports of randomised trials of smoking cessation interventions covering intervention content and delivery, population, setting, outcome and study methodology using the Behaviour Change Intervention Ontology. These entities were used to train ML algorithms to extract the information automatically. The information extraction ML algorithm involved a named-entity recognition system using the ‘FLAIR’ framework. The manually annotated intervention, population, setting and study entities were used to develop a deep-learning algorithm using multiple layers of long-short-term-memory (LSTM) components to predict smoking cessation outcomes.

Results: The F1 evaluation score, derived from the false positive and false negative rates (range 0–1), for the information extraction algorithm averaged 0.42 across different types of entity (SD=0.22, range 0.05–0.88) compared with an average human annotator’s score of 0.75 (SD=0.15, range 0.38–1.00). The algorithm for assigning entities to study arms (e.g., intervention or control) was not successful. This initial ML outcome prediction algorithm did not outperform prediction based just on the mean outcome value or a linear regression model.

Conclusions: While some success was achieved in using ML to extract information from reports of randomised trials of smoking cessation interventions, we identified major challenges that could be addressed by greater standardisation in the way that studies are reported. Outcome prediction from smoking cessation studies may benefit from development of novel algorithms, e.g., using ontological information to inform ML (as reported in the linked paper 1).
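
For readers unfamiliar with the F1 score quoted in the abstract, the following is a minimal sketch of the standard definition, F1 = 2TP / (2TP + FP + FN), where TP, FP and FN are counts of true positives, false positives and false negatives. The function name and the example counts are hypothetical illustrations and are not taken from the study, whose exact evaluation procedure is not described in this record.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall.

    Written in its count form, 2*TP / (2*TP + FP + FN).
    Returns 0.0 when there are no true positives, false positives
    or false negatives at all (an assumed convention).
    """
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2 * tp / denom


# Hypothetical counts for a single entity type (not data from the paper):
# 8 entities extracted correctly, 2 spurious extractions, 2 missed entities.
example = f1_score(tp=8, fp=2, fn=2)
print(f"F1 = {example:.2f}")
```

A score of 1 means every extracted entity was correct and none were missed; a score near 0 means the extractions were mostly wrong or mostly missing.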

Details

Language :
English
ISSN :
2398502X
Volume :
8
Database :
Directory of Open Access Journals
Journal :
Wellcome Open Research
Publication Type :
Academic Journal
Accession number :
edsdoj.93d5a1a5b6d45d3974e8ee6b9e2a50d
Document Type :
article
Full Text :
https://doi.org/10.12688/wellcomeopenres.20000.2