Data cleaning and harmonization of clinical trial data: Medication-assisted treatment for opioid use disorder.

Authors :
Raymond R Balise
Mei-Chen Hu
Anna R Calderon
Gabriel J Odom
Laura Brandt
Sean X Luo
Daniel J Feaster
Source :
PLoS ONE, Vol 19, Iss 11, p e0312695 (2024)
Publication Year :
2024
Publisher :
Public Library of Science (PLoS), 2024.

Abstract

Several large-scale, pragmatic clinical trials on opioid use disorder (OUD) have been completed in the National Drug Abuse Treatment Clinical Trials Network (CTN). However, the resulting data have not been harmonized across the studies to allow comparison of patient characteristics. This paper provides lessons learned from a large-scale harmonization process that are critical for all biomedical researchers collecting new data and for those tasked with combining datasets. We harmonized data from multiple domains from three trials: CTN-0027 (N = 1269), which compared methadone and buprenorphine at federally licensed methadone treatment programs; CTN-0030 (N = 653), which recruited patients who used predominantly prescription opioids and were treated with buprenorphine; and CTN-0051 (N = 570), which compared buprenorphine and extended-release naltrexone (XR-NTX) and recruited from inpatient treatment facilities. We harmonized patient-level data and created 23 meticulously documented database tables covering more than 110 variables, along with three tables of metadata about the study design and treatment arms. Domains included social and demographic characteristics, medical and psychiatric history, self-reported drug use details and urine drug screening results, withdrawal, and treatment drug details. Here, we summarize the numerous issues with the organization and fidelity of the publicly available data that were noted and resolved, and we present results on patient characteristics across the three trials and the harmonized domains. A systematic harmonization of OUD clinical trial data can be accomplished, despite heterogeneous data coding and classification procedures, by standardizing commonly assessed characteristics. Similar methods, embracing database normalization and/or "tidy" data, should be used for future datasets in other substance use disorder clinical trials.

Subjects

Subjects :
Medicine
Science

Details

Language :
English
ISSN :
1932-6203
Volume :
19
Issue :
11
Database :
Directory of Open Access Journals
Journal :
PLoS ONE
Publication Type :
Academic Journal
Accession number :
edsdoj.2dde77f1a9714e1e8d5fb68938ca2e9a
Document Type :
article
Full Text :
https://doi.org/10.1371/journal.pone.0312695