1. Multiple Imputation of Race and Hispanic Ethnicity in National Surveillance Data for Chlamydia, Gonorrhea, and Syphilis.
- Author
- Pondo T, Torrone E, and Pagaoa M
- Subjects
- Humans, United States epidemiology, Female, Male, Adult, Population Surveillance, Ethnicity statistics & numerical data, Adolescent, Young Adult, Racial Groups statistics & numerical data, Gonorrhea ethnology, Gonorrhea epidemiology, Syphilis epidemiology, Syphilis ethnology, Chlamydia Infections epidemiology, Chlamydia Infections ethnology, Hispanic or Latino statistics & numerical data
- Abstract
Background: Disease burden of sexually transmitted infections such as chlamydia, gonorrhea, and syphilis is often compared across age categories, sex categories, and race and ethnicity categories. Missing data may prevent researchers from accurately characterizing health disparities between populations. This article describes the methods used to impute race and Hispanic ethnicity in a large national surveillance data set.
Methods: All US cases of chlamydia, gonorrhea, and syphilis (excluding congenital syphilis) reported through the National Notifiable Diseases Surveillance System from the year 2019 were included in the analyses. We used fully conditional specification to impute missing race and Hispanic ethnicity data. After imputation, reported case rates were calculated, by disease, for each race and Hispanic ethnicity category using Vintage 2019 Population and Housing Unit Estimates from the US Census. We then used case counts from subsets that contained only complete race and Hispanic ethnicity information to investigate whether the confidence intervals from the multiply imputed data included the observed number of cases in each race and Hispanic ethnicity category.
Results: Among the 2,553,038 cases reported in 2019, race and Hispanic ethnicity were multiply imputed for 9% of syphilis cases, 22% of gonorrhea cases, and 33% of chlamydia cases. In the subset analyses, every nonzero rate of reported cases was contained within the confidence intervals that were calculated from multiply imputed data.
Conclusions: Confidence intervals that account for the uncertainty of the predictions are an advantage of multiple imputation over complete-case analysis because a realistic variance estimate allows for valid hypothesis testing results.
Competing Interests: Conflict of interest and sources of funding: None declared.
(Copyright © 2024 American Sexually Transmitted Diseases Association. All rights reserved.)
- Published
- 2024
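The abstract reports confidence intervals computed from multiply imputed data but does not say how estimates were combined across imputed data sets. As a minimal sketch, assuming the standard Rubin's rules combination, the pooling step could look like the following. The function name `pool_rubin`, the case counts, and the Poisson variance assumption are all hypothetical and are not taken from the article. A normal approximation is used here in place of the t reference distribution that Rubin's rules normally call for.

```python
from statistics import NormalDist, mean, variance


def pool_rubin(estimates, variances, level=0.95):
    """Pool per-imputation estimates with Rubin's rules.

    estimates: one point estimate per imputed data set
    variances: the matching within-imputation variance of each estimate
    Returns (pooled estimate, total variance, (lower, upper)).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")

    pooled = mean(estimates)       # average of the m point estimates
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # sample variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between

    # Simplification: normal quantile instead of the t quantile with
    # Rubin's degrees of freedom, so intervals are slightly too narrow for small m.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * total ** 0.5
    return pooled, total, (pooled - half_width, pooled + half_width)


# Hypothetical case counts for one race/ethnicity category across 5 imputations.
counts = [100, 110, 90, 105, 95]
# Assumption for illustration only: Poisson counts, so variance equals the count.
pooled, total, (lower, upper) = pool_rubin(counts, counts)
print(pooled, total, lower, upper)
```

The between-imputation term is what a complete-case or single-imputation analysis omits; it widens the interval to reflect uncertainty in the imputed race and Hispanic ethnicity values. Dividing the pooled count and interval bounds by a population denominator would give a rate on the same footing.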