93 results on '"Mwambi H"'
Search Results
2. Statistical Methods of Handling Ordinal Longitudinal Responses with Intermittent Missing Data
- Author
-
Aluko O. and Mwambi H.
- Subjects
- Health (social science), Health Policy, Public Health, Environmental and Occupational Health, Medicine (miscellaneous), Health Professions (miscellaneous)
- Published
- 2022
- Full Text
- View/download PDF
3. Diagnostics for a two-stage joint survival model.
- Author
-
Singini, I. L., Mwambi, H. G., and Gumedze, F. N.
- Subjects
- *SURVIVAL analysis (Biometry), *CLINICAL trials
- Abstract
A two-stage joint survival model is used to analyze time to event outcomes that could be associated with biomarkers that are repeatedly collected over time. A two-stage joint survival model has limited model checking tools and is usually assessed using standard diagnostic tools for survival models. The diagnostic tools can be improved and implemented. Time-varying covariates in a two-stage joint survival model might contain outlying observations or subjects. In this study we used the variance shift outlier model (VSOM) to detect and down-weight outliers in the first stage of the two-stage joint survival model. This entails fitting a VSOM at the observation level and a VSOM at the subject level, and then fitting a combined VSOM for the identified outliers. The fitted values were then extracted from the combined VSOM and used as a time-varying covariate in the extended Cox model. We illustrate this methodology on a dataset from a multi-center randomized clinical trial. A multi-center trial showed that a combined VSOM fits the data better than an extended Cox model. We noted that implementing a combined VSOM, when desired, has a better fit based on the fact that outliers are down-weighted. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
4. Different methods for handling incomplete longitudinal binary outcome due to missing at random dropout
- Author
-
Satty, A., Mwambi, H., and Molenberghs, G.
- Published
- 2015
- Full Text
- View/download PDF
5. Statistical Methodologies for Dealing with Incomplete Longitudinal Outcomes Due to Dropout Missing at Random
- Author
-
Satty, A., primary, Mwambi, H., additional, and Molenberghs, G., additional
- Published
- 2017
- Full Text
- View/download PDF
6. Analysis of Missing Data in Progressed Learners: The Use of Multiple Imputation Methods.
- Author
-
Mabungane, S., Ramroop, S., and Mwambi, H.
- Subjects
- MISSING data (Statistics), MULTIPLE imputation (Statistics), DATA analysis, STATISTICS, PANEL analysis, EDUCATION research
- Abstract
The issue of missing data raises concerns in all statistical and educational research. In this study, we focus on missing data in school-based assessment data generated by progressed high school learners (those who did not meet the promotional requirements for their current grades but were allowed to move to the next grade because of policy stipulations). There are a number of approaches available for handling missing data in statistical literature. Multiple imputation is an approach statisticians often recommend for dealing with missing data. We analysed a longitudinal dataset composed of progressed high school learners with missing data points, imputing the missing values with multiple imputation by chained equations and Amelia II. The missing data points in our dataset were due to learners not submitting their tasks, which then impacted negatively on their results. We found a higher proportion of missing values in mathematics/mathematical literacy, science and accounting. Our results showed that progressed learners struggle more with these subjects where knowledge develops cumulatively, and that their gaps in prior knowledge probably hinder them in understanding new concepts. Thus, the policy on progression of learners brings challenges to the already strained educational system and requires a more specialised system of support. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
7. Diagnostics for a two-stage joint survival model
- Author
-
Singini, I. L., primary, Mwambi, H. G., additional, and Gumedze, F. N., additional
- Published
- 2021
- Full Text
- View/download PDF
8. Modelling and Analysis of the Intrinsic Dynamics of Cholera
- Author
-
Mukandavire, Z., Tripathi, A., Chiyaka, C., Musuka, G., Nyabadza, F., and Mwambi, H. G.
- Published
- 2011
- Full Text
- View/download PDF
9. A nonlinear mixed-effects model for multivariate longitudinal data with partially observed outcomes with application to HIV disease dynamics
- Author
-
Luwanda, A. G., primary and Mwambi, H. G., additional
- Published
- 2016
- Full Text
- View/download PDF
10. Multiple imputation for ordinal longitudinal data with monotone missing data patterns
- Author
-
Kombo, A.Y., primary, Mwambi, H., additional, and Molenberghs, G., additional
- Published
- 2016
- Full Text
- View/download PDF
11. Selection and pattern mixture models for modelling longitudinal data with dropout: An application study
- Author
-
Mwambi, H. and Satty, Ali
- Subjects
- 97K80 46N30, under-identification, 46 Associative rings and algebras::46N Miscellaneous applications of functional analysis [AMS Classification], sensitivity analysis, selection models, Mathematics and statistics::Mathematical statistics [UPC subject areas], identifying restrictions, pattern mixture models, 97 Mathematics Education [AMS Classification]
- Abstract
Incomplete data are unavoidable in studies that involve data measured or observed longitudinally on individuals, regardless of how well they are designed. Dropout can potentially cause serious bias problems in the analysis of longitudinal data. In the presence of dropout, an appropriate strategy for analyzing such data would require the definition of a joint model for dropout and measurement processes. This paper is primarily concerned with selection and pattern mixture models as modelling frameworks that could be used for sensitivity analysis to jointly model the distribution for the dropout process and the longitudinal measurement process. We demonstrate the application of these models for handling dropout in longitudinal data where the dependent variable is missing across time. We restrict attention to the situation in which outcomes are continuous. The primary objectives are to investigate the potential influence that dropout might have or exert on the dependent measurement process based on the considered data as well as to deal with incomplete sequences. We apply the methods to a data set arising from a serum cholesterol study. The results obtained from these methods are then compared to help gain additional insight into the serum cholesterol data and assess sensitivity of the assumptions made. Results showed that additional confidence in the findings was gained as both models led to similar results when assessing significant effects, such as marginal treatment effects.
- Published
- 2013
12. A frequentist approach to estimating the force of infection and the recovery rate for a respiratory disease among infants in coastal Kenya
- Author
-
Mwambi, H., Ramroop, S., White, L.J., Okiro, E.A., Nokes, D.J., Shkedy, Ziv, and Molenberghs, Geert
- Abstract
This paper aims to develop a probability-based model involving the use of direct likelihood formulation and generalised linear modelling in order to estimate important disease parameters from real data. The force of infection and the recovery rate or per capita loss of infection are the parameters of interest. The problem of dealing with time-varying disease parameters is also addressed in the paper by fitting piecewise constant parameters over time. The findings of the current paper are comparable and similar to estimates from an independent approach suggested by White et al. [21] that employed Bayesian MCMC modelling via WinBUGS. The authors gratefully acknowledge the financial support from The Wellcome Trust (Grant No. 061584), and the IUAP research network Nr. P5/24 of the Belgian Government (Belgian Science Policy). Shaun Ramroop would like to thank the NRF of South Africa for funding his PhD work (THUTHUKA-Researchers in training Ref. No: TTK2005081700004). Mahidol-Oxford Tropical Medicine Research Unit is funded by the Wellcome Trust of Great Britain. This article is published with the permission of the Director of KEMRI.
- Published
- 2011
13. Multiple correspondence analysis as a tool for analysis of large health surveys in African settings
- Author
-
Ayele, D, primary, Zewotir, T, additional, and Mwambi, H, additional
- Published
- 2015
- Full Text
- View/download PDF
14. Selection and pattern mixture models for modelling longitudinal data with dropout: An application study
- Author
-
Satty, Ali and Mwambi, H.
- Abstract
Incomplete data are unavoidable in studies that involve data measured or observed longitudinally on individuals, regardless of how well they are designed. Dropout can potentially cause serious bias problems in the analysis of longitudinal data. In the presence of dropout, an appropriate strategy for analyzing such data would require the definition of a joint model for dropout and measurement processes. This paper is primarily concerned with selection and pattern mixture models as modelling frameworks that could be used for sensitivity analysis to jointly model the distribution for the dropout process and the longitudinal measurement process. We demonstrate the application of these models for handling dropout in longitudinal data where the dependent variable is missing across time. We restrict attention to the situation in which outcomes are continuous. The primary objectives are to investigate the potential influence that dropout might have or exert on the dependent measurement process based on the considered data as well as to deal with incomplete sequences. We apply the methods to a data set arising from a serum cholesterol study. The results obtained from these methods are then compared to help gain additional insight into the serum cholesterol data and assess sensitivity of the assumptions made. Results showed that additional confidence in the findings was gained as both models led to similar results when assessing significant effects, such as marginal treatment effects. Peer reviewed.
- Published
- 2014
15. A nonlinear mixed-effects model for multivariate longitudinal data with partially observed outcomes with application to HIV disease dynamics.
- Author
-
Luwanda, A. G. and Mwambi, H. G.
- Subjects
- *HIV, *NONLINEAR statistical models, *STOCHASTIC approximation, *EXPECTATION-maximization algorithms, *MULTIVARIATE analysis, *HEALTH outcome assessment
- Abstract
The measurable multiple bio-markers for a disease are used as indicators for studying the response variable of interest in order to monitor and model disease progression. However, it is common for subjects to drop out of the studies prematurely resulting in unbalanced data and hence complicating the inferences involving such data. In this paper we consider a case where data are unbalanced among subjects and also within a subject because for some reason only a subset of the multiple outcomes of the response variable are observed at any one occasion. We propose a nonlinear mixed-effects model for the multivariate response variable data and derive a joint likelihood function that takes into account the partial dropout of the outcomes of the response variable. We further show how the methodology can be used in the estimation of the parameters that characterise HIV disease dynamics. An approximation technique of the parameters is also given and illustrated using a routine observational HIV dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
16. Multiple imputation for ordinal longitudinal data with monotone missing data patterns.
- Author
-
Kombo, A.Y., Mwambi, H., and Molenberghs, G.
- Subjects
- *MULTIPLE imputation (Statistics), *DATA analysis, *MONOTONE operators, *SIMULATION methods & models, *SCIENCE databases
- Abstract
Missing data often complicate the analysis of scientific data. Multiple imputation is a general purpose technique for analysis of datasets with missing values. The approach is applicable to a variety of missing data patterns but often complicated by some restrictions like the type of variables to be imputed and the mechanism underlying the missing data. In this paper, the authors compare the performance of two multiple imputation methods, namely fully conditional specification and multivariate normal imputation in the presence of ordinal outcomes with monotone missing data patterns. Through a simulation study and an empirical example, the authors show that the two methods are indeed comparable meaning any of the two may be used when faced with scenarios, at least, as the ones presented here. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
17. A simulation model of African Anopheles ecology and population dynamics for the analysis of malaria transmission
- Author
-
Depinay, J.M.O., Mbogo, C.M., Killeen, G., Knols, B.G.J., Beier, J., Carlson, J., Dushoff, J., Billingsley, P., Mwambi, H., Githure, J., Toure, A.M., and McKenzie, F.E.
- Subjects
- density, body-size, western kenya, temperature, arabiensis, PE&RC, culicidae, habitats, Laboratory of Entomology, larval survival, sensu-stricto diptera, gambiae complex
- Abstract
Background: Malaria is one of the oldest and deadliest infectious diseases in humans. Many mathematical models of malaria have been developed during the past century, and applied to potential interventions. However, malaria remains uncontrolled and is increasing in many areas, as are vector and parasite resistance to insecticides and drugs. Methods: This study presents a simulation model of African malaria vectors. This individual-based model incorporates current knowledge of the mechanisms underlying Anopheles population dynamics and their relations to the environment. One of its main strengths is that it is based on both biological and environmental variables. Results: The model made it possible to structure existing knowledge, assembled in a comprehensive review of the literature, and also pointed out important aspects of basic Anopheles biology about which knowledge is lacking. One simulation showed several patterns similar to those seen in the field, and made it possible to examine different analyses and hypotheses for these patterns; sensitivity analyses on temperature, moisture, predation and preliminary investigations of nutrient competition were also conducted. Conclusions: Although based on some mathematical formulae and parameters, this new tool has been developed in order to be as explicit as possible, transparent in use, close to reality and amenable to direct use by field workers. It allows a better understanding of the mechanisms underlying Anopheles population dynamics in general and also a better understanding of the dynamics in specific local geographic environments. It points out many important areas for new investigations that will be critical to effective, efficient, sustainable interventions.
- Published
- 2004
18. A Complex Survey Data Analysis of Tb Mortality in South Africa
- Author
-
Murorunkwere, JL, additional and Mwambi, H, additional
- Published
- 2013
- Full Text
- View/download PDF
19. Ticks and tick-borne diseases: a vector-host interaction model for the brown ear tick (Rhipicephalus appendiculatus).
- Author
-
Mwambi, H. G., Baumgärtner, J., and Hadeler, K. P.
- Subjects
- *MATHEMATICAL models, *TICK control, *RHIPICEPHALUS appendiculatus
- Abstract
An analytical model is derived for the interaction of the brown ear tick (Rhipicephalus appendiculatus) with its hosts. Such models are rare due to the complexity and lack of information on all stages of the tick's life cycle. Most models are simulations rather than analytical. The vector is categorized into a discrete number of compartments according to its life cycle. The starting model in this article consists of a system of differential equations with constant coefficients. A general model on a stage-structured population with unlimited host density is developed. From the characteristic polynomial of the system a sensitivity analysis for the population parameters is carried out in detail. The model is then improved by incorporating host abundance and availability. This is done on the basis of a demand-driven and ratio-dependent functional response model. The improved model adequately represents the dynamics of a stage-structured vector population under conditions of varying host density. The model allows the qualitative evaluation of several management strategies and is expected to guide future research work. [ABSTRACT FROM AUTHOR]
- Published
- 2000
- Full Text
- View/download PDF
20. A frequentist approach to estimating the force of infection for a respiratory disease using repeated measurement data from a birth cohort
- Author
-
Mwambi, H, primary, Ramroop, S, additional, White, LJ, additional, Okiro, EA, additional, Nokes, DJ, additional, Shkedy, Z, additional, and Molenberghs, G, additional
- Published
- 2011
- Full Text
- View/download PDF
21. Ticks and tick-borne diseases in Africa: a disease transmission model
- Author
-
Mwambi, H. G., primary
- Published
- 2002
- Full Text
- View/download PDF
22. Incorporating frailty effects in the cox proportional hazards model using two independent methods in two independent data sets
- Author
-
Belinda Phipson and Mwambi, H.
23. Prevalence and risk factors of malaria in Ethiopia
- Author
-
Ayele Dawit G, Zewotir Temesgen T, and Mwambi Henry G
- Subjects
- Generalized linear model, Odds ratio, Rapid diagnosis test, Risk factors, Survey design, Arctic medicine. Tropical medicine, RC955-962, Infectious and parasitic diseases, RC109-216
- Abstract
Background: More than 75% of the total area of Ethiopia is malarious, making malaria the leading public health problem in Ethiopia. The aim of this study was to investigate the prevalence rate and the associated socio-economic, geographic and demographic factors of malaria based on the rapid diagnosis test (RDT) survey results. Methods: From December 2006 to January 2007, a baseline malaria indicator survey in Amhara, Oromiya and Southern Nation Nationalities and People (SNNP) regions of Ethiopia was conducted by The Carter Center. This study uses these data. A generalized linear model was used to analyse the data, and the response variable was the presence or absence of malaria according to the RDT. Results: The analyses show that the RDT result was significantly associated with age and gender. Other significant covariates (confounding variables) were source of water, trip to obtain water, toilet facility, total number of rooms, material used for walls, and material used for roofing. The prevalence of malaria was found to be lower for households with clean water. Malaria rapid diagnosis was found to be higher for thatch and stick/mud roofs and earth/local dung plaster floors. Moreover, anti-malaria spraying of the house was found to be one means of reducing the risk of malaria. Furthermore, the housing condition, source of water and its distance, gender, and ages in the households were identified as having two-way interaction effects. Conclusion: Poor socio-economic conditions are positively associated with malaria infection. Improving the housing condition of the household is one of the means of reducing the risk of malaria. Children and female household members are the most vulnerable to the risk of malaria. Such information is essential to design improved strategic intervention for the reduction of the malaria epidemic in Ethiopia.
- Published
- 2012
- Full Text
- View/download PDF
24. A simulation model of African Anopheles ecology and population dynamics for the analysis of malaria transmission
- Author
-
Billingsley Peter, Dushoff Jonathan, Carlson John, Beier John, Knols Bart, Killeen Gerry, Mbogo Charles M, Depinay Jean-Marc O, Mwambi Henry, Githure John, Toure Abdoulaye M, and Ellis McKenzie F
- Subjects
- Arctic medicine. Tropical medicine, RC955-962, Infectious and parasitic diseases, RC109-216
- Abstract
Background: Malaria is one of the oldest and deadliest infectious diseases in humans. Many mathematical models of malaria have been developed during the past century, and applied to potential interventions. However, malaria remains uncontrolled and is increasing in many areas, as are vector and parasite resistance to insecticides and drugs. Methods: This study presents a simulation model of African malaria vectors. This individual-based model incorporates current knowledge of the mechanisms underlying Anopheles population dynamics and their relations to the environment. One of its main strengths is that it is based on both biological and environmental variables. Results: The model made it possible to structure existing knowledge, assembled in a comprehensive review of the literature, and also pointed out important aspects of basic Anopheles biology about which knowledge is lacking. One simulation showed several patterns similar to those seen in the field, and made it possible to examine different analyses and hypotheses for these patterns; sensitivity analyses on temperature, moisture, predation and preliminary investigations of nutrient competition were also conducted. Conclusions: Although based on some mathematical formulae and parameters, this new tool has been developed in order to be as explicit as possible, transparent in use, close to reality and amenable to direct use by field workers. It allows a better understanding of the mechanisms underlying Anopheles population dynamics in general and also a better understanding of the dynamics in specific local geographic environments. It points out many important areas for new investigations that will be critical to effective, efficient, sustainable interventions.
- Published
- 2004
- Full Text
- View/download PDF
25. Factors of acute respiratory infection among under-five children across sub-Saharan African countries using machine learning approaches.
- Author
-
Fenta HM, Zewotir TT, Naidoo S, Naidoo RN, and Mwambi H
- Subjects
- Humans, Child, Preschool, Africa South of the Sahara epidemiology, Infant, Female, Male, Particulate Matter analysis, Acute Disease, Air Pollution adverse effects, Infant, Newborn, Respiratory Tract Infections epidemiology, Machine Learning
- Abstract
Symptoms of acute respiratory infections (ARIs) among under-five children are a global health challenge. We aimed to train and evaluate ten machine learning (ML) classification approaches in predicting symptoms of ARIs reported by mothers among children younger than 5 years in sub-Saharan African (sSA) countries. We used the most recent (2012-2022) nationally representative Demographic and Health Surveys data of 33 sSA countries. The air pollution covariates such as global annual surface particulate matter (PM 2.5) and the nitrogen dioxide available in the form of raster images were obtained from the National Aeronautics and Space Administration (NASA). Machine learning algorithms (MLAs) were used for predicting the symptoms of ARIs among under-five children. We randomly split the dataset into two: 80% was used to train the model, and the remaining 20% was used to test the trained model. Model performance was evaluated using sensitivity, specificity, accuracy, and the area under the receiver operating characteristic curve. A total of 327,507 under-five children were included in the study. About 7.10, 4.19, 20.61, and 21.02% of children reported symptoms of ARI, severe ARI, cough, and fever, respectively, in the 2 weeks preceding the survey. The prevalence of ARI was highest in Mozambique (15.3%), Uganda (15.05%), Togo (14.27%), and Namibia (13.65%), whereas Uganda (40.10%), Burundi (38.18%), Zimbabwe (36.95%), and Namibia (31.2%) had the highest prevalence of cough. The results of the random forest plot revealed that spatial locations (longitude, latitude), particulate matter, land surface temperature, nitrogen dioxide, and the number of cattle in the houses are the most important features in predicting the diagnosis of symptoms of ARIs among under-five children in sSA. The random forest (RF) algorithm was selected as the best ML model (AUC = 0.77, Accuracy = 0.72) to predict the symptoms of ARIs among children under five.
The MLA performed well in predicting the symptoms of ARIs and associated predictors among under-five children across the sSA countries. Random forest MLA was identified as the best classifier to be employed for the prediction of the symptoms of ARI among under-five children. (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
26. The future of public health doctoral education in Africa: transforming higher education institutions to enhance research and practice.
- Author
-
Bukenya J, Kebede D, Mwambi H, Pate M, Adongo P, Berhane Y, Canavan CR, Chirwa T, Fawole OI, Guwatudde D, Jackson E, Madzorera I, Moshabela M, Oduola AMJ, Sunguya B, Sall A, Raji T, and Fawzi W
- Subjects
- Humans, Africa, Universities organization & administration, Education, Public Health Professional organization & administration, Education, Graduate organization & administration, Public Health education
- Abstract
The African Union and the Africa Centers for Disease Control and Prevention issued a Call to Action in 2022 for Africa's New Public Health Order that underscored the need for increased capacity in the public health workforce. Additional domestic and global investments in public health workforce development are central to achieving the aspirations of Agenda 2063 of the African Union, which aims to build and accelerate the implementation of continental frameworks for equitable, people-centred growth and development. Recognising the crucial role of higher education and research, we assessed the capabilities of public health doctoral training in schools and programmes of public health in Africa across three conceptual components: instructional, institutional, and external. Six inter-related and actionable recommendations were derived to advance doctoral training, research, and practice capacity within and between universities. These can be achieved through equitable partnerships between universities, research centres, and national, regional, and global public health institutions. Competing Interests: We declare no competing interests. (Copyright © 2024 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license. All rights reserved.)
- Published
- 2024
- Full Text
- View/download PDF
27. Investigating the effects of cytokine biomarkers on HIV incidence: a case study for individuals randomized to pre-exposure prophylaxis vs. control.
- Author
-
Ogutu S, Mohammed M, and Mwambi H
- Subjects
- Humans, Incidence, Male, Female, Adult, Proportional Hazards Models, Anti-HIV Agents therapeutic use, Anti-HIV Agents administration & dosage, HIV Infections prevention & control, HIV Infections epidemiology, Cytokines blood, Pre-Exposure Prophylaxis statistics & numerical data, Biomarkers blood
- Abstract
Introduction: Understanding and identifying the immunological markers and clinical information linked with HIV acquisition is crucial for effectively implementing Pre-Exposure Prophylaxis (PrEP) to prevent HIV acquisition. Prior analyses of HIV incidence outcomes have predominantly employed proportional hazards (PH) models, adjusting solely for baseline covariates. Therefore, models that integrate cytokine biomarkers, particularly as time-varying covariates, are sorely needed. Methods: We built a simple model using the Cox PH to investigate the impact of specific cytokine profiles in predicting the overall HIV incidence. Further, Kaplan-Meier curves were used to compare HIV incidence rates between the treatment and placebo groups while assessing the overall treatment effectiveness. Utilizing stepwise regression, we developed a series of Cox PH models to analyze 48 longitudinally measured cytokine profiles. We considered three kinds of effects in the cytokine profile measurements: average, difference, and time-dependent covariate. These effects were combined with baseline covariates to explore their influence on predictors of HIV incidence. Results: Comparing the predictive performance of the Cox PH models developed using the AIC metric, model 4 (Cox PH model with time-dependent cytokine) outperformed the others. The results indicated that the cytokines, interleukin (IL-2, IL-3, IL-5, IL-10, IL-16, IL-12P70, and IL-17 alpha), stem cell factor (SCF), beta nerve growth factor (B-NGF), tumor necrosis factor alpha (TNF-A), interferon (IFN) alpha-2, serum stem cell growth factor (SCG)-beta, platelet-derived growth factor (PDGF)-BB, granulocyte macrophage colony-stimulating factor (GM-CSF), tumor necrosis factor-related apoptosis-inducing ligand (TRAIL), and cutaneous T-cell-attracting chemokine (CTACK) were significantly associated with HIV incidence.
Baseline predictors significantly associated with HIV incidence when considering cytokine effects included: age of oldest sex partner, age at enrollment, salary, years with a stable partner, sex partner having any other sex partner, husband's income, other income source, age at debut, years lived in Durban, and sex in the last 30 days. Discussion: Overall, the inclusion of cytokine effects enhanced the predictive performance of the models, and the PrEP group exhibited reduced HIV incidence compared to the placebo group. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2024 Ogutu, Mohammed and Mwambi.)
- Published
- 2024
- Full Text
- View/download PDF
28. Mapping regional variability of exclusive breastfeeding and its determinants at different infant's age in Tanzania.
- Author
-
Jahanpour OF, Okango EL, Todd J, Mwambi H, and Mahande MJ
- Subjects
- Infant, Humans, Female, Tanzania, Cross-Sectional Studies, Literacy, Breast Feeding, Mothers
- Abstract
Introduction: Despite its numerous benefits, exclusive breastfeeding (EBF) remains an underutilized practice. Enhancing EBF uptake necessitates a focused approach targeting regions where its adoption is suboptimal. This study aimed to investigate regional disparities in EBF practices and identify determinants of EBF among infants aged 0-1, 2-3, and 4-5 months in Tanzania. Methods: This cross-sectional study utilized data from the 2015/16 Tanzania Demographic and Health Survey. A total of 1,015 infants aged 0-5 months met the inclusion criteria, comprising 378 aged 0-1 month, 334 aged 2-3 months, and 303 aged 4-5 months. EBF practices were assessed using a 24-hour recall method. A generalized linear mixed model, with fixed covariates encompassing infant and maternal attributes and clusters for enumeration areas (EAs) and regions, was employed to estimate EBF proportions. Results: Regional disparities in EBF were evident among infants aged 0-1, 2-3, and 4-5 months, with a decline in EBF proportions as infant age increases. This pattern was observed nationwide. Regional and EA factors influenced the EBF practices at 0-1 and 2-3 months, accounting for 17-40% of the variability at the regional level and 40-63% at the EA level. Literacy level among mothers had a significant impact on EBF practices at 2-3 months (e.g., women who could read whole sentences; AOR = 3.2, 95% CI 1.1, 8.8). Conclusion: Regional disparities in EBF proportions exist in Tanzania, and further studies are needed to understand their underlying causes. Targeted interventions should prioritize regions with lower EBF proportions. This study highlights the clustering of EBF practices at 0-1 and 2-3 months on both regional and EA levels. Conducting studies in smaller geographical areas may enhance our understanding of the enablers and barriers to EBF and guide interventions to promote recommended EBF practices. (© 2023. The Author(s).)
- Published
- 2023
29. Determining classes of food items for health requirements and nutrition guidelines using Gaussian mixture models.
- Author
-
Balakrishna Y, Manda S, Mwambi H, and van Graan A
- Abstract
Introduction: The identification of classes of nutritionally similar food items is important for creating food exchange lists to meet health requirements and for informing nutrition guidelines and campaigns. Cluster analysis methods can assign food items into classes based on the similarity in their nutrient contents. Finite mixture models use probabilistic classification with the advantage of taking into account the uncertainty of class thresholds., Methods: This paper uses univariate Gaussian mixture models to determine the probabilistic classification of food items in the South African Food Composition Database (SAFCDB) based on nutrient content., Results: Classifying food items by animal protein, fatty acid, available carbohydrate, total fibre, sodium, iron, vitamin A, thiamin and riboflavin contents produced data-driven classes that had differing means and estimates of variability and could be clearly ranked on a low-to-high nutrient content scale. Classifying food items by their sodium content resulted in five classes with the class means ranging from 1.57 to 706.27 mg per 100 g. Four classes were identified based on available carbohydrate content, with the highest carbohydrate class having a mean content of 59.15 g per 100 g. Food items clustered into two classes when examining their fatty acid content. The high-iron class had a mean of 1.46 mg per 100 g and was one of three classes identified for iron. Classes containing nutrient-rich food items that exhibited extreme nutrient values were also identified for several vitamins and minerals., Discussion: The overlap between classes was evident and supports the use of probabilistic classification methods. Food items in each of the identified classes were comparable to allowed food lists developed for therapeutic diets.
This data-driven ranking of nutritionally similar classes could be considered for diet planning for medical conditions and individuals with dietary restrictions., Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2023 Balakrishna, Manda, Mwambi and van Graan.)
- Published
- 2023
30. Trends of Exclusive Breastfeeding Practices and Its Determinants in Tanzania from 1999 to 2016.
- Author
-
Jahanpour OF, Todd J, Mwambi H, Okango EL, and Mahande MJ
- Subjects
- Infant, Humans, Female, Tanzania, Surveys and Questionnaires, Social Class, Breast Feeding, Mothers
- Abstract
Introduction: The benefits of exclusive breastfeeding (EBF) are widely reported. However, it is crucial to examine potential disparities in EBF practices across different regions of a country. Our study uses Tanzania demographic and health survey data to report on the trends of EBF across regions from 1999 to 2016, the patterns of the practice based on geographical location and socioeconomic status, and explores its determinants across the years. Methods: Descriptive statistics were used to establish the trends of EBF by geographical location and wealth quintile. A generalized linear mixed model was developed to incorporate both infant and maternal attributes as fixed covariates while considering enumeration areas and regions as clusters. The fitted model facilitated the estimation of EBF proportions at a regional level and identified key determinants influencing EBF practices across the survey periods. Moreover, we designed breastfeeding maps, visually depicting the performance of different regions throughout the surveys. Results: Across the various survey rounds, a notable regional variation in EBF practices was observed, with coastal regions generally exhibiting lower adherence to the practice. There was a linear trend between EBF and geographical residence (p < 0.05) and socioeconomic standing (p < 0.05) across the survey periods. Rural-dwelling women and those from the least affluent backgrounds consistently showcased a higher proportion of EBF. The prevalence of EBF declined as infants aged (p < 0.001), a trend consistent across all survey waves. The associations between maternal attributes and EBF practices displayed temporal variations. Furthermore, a correlation between exclusive breastfeeding and attributes linked to both regional disparities and enumeration areas was observed. The intra-cluster correlation ranged from 18% to 41.5% at the regional level and from 40% to 58.5% at the enumeration area level.
Conclusions: While Tanzania's progress in EBF practices is laudable, regional disparities persist, demanding targeted interventions. Sustaining achievements while addressing wealth-based disparities and the decline in EBF with infant age is vital. The study highlights the need for broad national strategies and localized investigations to understand and enhance EBF practices across different regions and socioeconomic contexts.
- Published
- 2023
31. Multiple imputation using chained equations for missing data in survival models: applied to multidrug-resistant tuberculosis and HIV data.
- Author
-
Mbona SV, Ndlovu P, Mwambi H, and Ramroop S
- Abstract
Background: Missing data are a prevalent problem in almost all types of data analyses, such as survival data analysis., Objective: To evaluate the performance of multivariable imputation via chained equations in determining the factors that affect the survival of multidrug-resistant-tuberculosis (MDR-TB) and HIV-coinfected patients in KwaZulu-Natal., Materials and Methods: Secondary data from 1542 multidrug-resistant tuberculosis patients were used in this study. First, data from patients with some missing observations were deleted from the original data set to obtain the complete case (CC) data set. Second, missing observations in the original data set were imputed 15 times to obtain complete data sets using a multivariable imputation case (MIC). The Cox regression model was fitted to both the CC and MIC data, and the results were compared using the model goodness of fit criteria [likelihood ratio tests, Akaike information criterion (AIC), and Bayesian Information Criterion (BIC)]., Results: The Cox regression model fitted the MIC data set better (likelihood ratio test statistic =76.88 on 10 df with P<0.01, AIC =1040.90, and BIC =1099.65) than the CC data set (likelihood ratio test statistic =42.68 on 10 df with P<0.01, AIC =1186.05 and BIC =1228.47). Variables that were insignificant when the model was fitted to the CC data set became significant when the model was fitted to the MIC data set., Conclusion: Correcting missing data using multiple imputation techniques for the MDR-TB problem is recommended. This approach led to better estimates and more power in the model., Competing Interests: Conflict of interest: the authors declare no potential conflict of interest., (Copyright © 2023, the Author(s).)
- Published
- 2023
32. A multivariate joint model to adjust for random measurement error while handling skewness and correlation in dietary data in an epidemiologic study of mortality.
- Author
-
Agogo GO, Muchene L, Orindi B, Murphy TE, Mwambi H, and Allore HG
- Subjects
- Humans, Nutrition Surveys, Proportional Hazards Models, Epidemiologic Studies, Diet adverse effects, Eating
- Abstract
Purpose: A substantial proportion of global deaths is attributed to unhealthy diets, which can be assessed at baseline or longitudinally. We demonstrated how to simultaneously correct for random measurement error, correlations, and skewness in the estimation of associations between dietary intake and all-cause mortality., Methods: We applied a multivariate joint model (MJM) that simultaneously corrected for random measurement error, skewness, and correlation among longitudinally measured intake levels of cholesterol, total fat, dietary fiber, and energy with all-cause mortality using US National Health and Nutrition Examination Survey linked to the National Death Index mortality data. We compared MJM with the mean method that assessed intake levels as the mean of a person's intake., Results: The estimates from MJM were larger than those from the mean method. For instance, the logarithm of hazard ratio for dietary fiber intake increased by 14 times (from -0.04 to -0.60) with the MJM method. This translated into a relative hazard of death of 0.55 (95% credible interval: 0.45, 0.65) with the MJM and 0.96 (95% credible interval: 0.95, 0.97) with the mean method., Conclusions: MJM adjusts for random measurement error and flexibly addresses correlations and skewness among longitudinal measures of dietary intake when estimating their associations with death., Competing Interests: Declaration of Competing Interest All the authors have no conflict of interest to declare for this research., (Copyright © 2023 Elsevier Inc. All rights reserved.)
- Published
- 2023
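The relative hazards quoted in the abstract of record 32 follow from exponentiating the reported log hazard ratios. The short sketch below only reproduces that arithmetic from the two rounded values given in the abstract; the function and variable names are illustrative and do not come from the paper.

```python
import math

# Log hazard ratios for dietary fiber intake, as quoted (rounded) in the abstract.
log_hr_mean_method = -0.04
log_hr_mjm = -0.60


def hazard_ratio(log_hr: float) -> float:
    """Exponentiate a log hazard ratio to obtain the relative hazard."""
    return math.exp(log_hr)


hr_mean_method = hazard_ratio(log_hr_mean_method)
hr_mjm = hazard_ratio(log_hr_mjm)

print(f"mean method: {hr_mean_method:.2f}")  # prints "mean method: 0.96"
print(f"MJM:         {hr_mjm:.2f}")          # prints "MJM:         0.55"
```

Both results match the relative hazards of death stated in the abstract (0.96 and 0.55).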
33. Modeling of correlated cognitive function and functional disability outcomes with bounded and missing data in a longitudinal aging study.
- Author
-
Agogo GO, Mwambi H, Shi X, and Liu Z
- Subjects
- Humans, Female, Aged, Longitudinal Studies, Cognition
- Abstract
Longitudinal studies of correlated cognitive and disability outcomes among older adults are characterized by missing data due to death or loss to follow-up from deteriorating health conditions. The Mini-Mental State Examination (MMSE) score for assessing cognitive function ranges from a minimum of 0 (floor) to a maximum of 30 (ceiling). To study the risk factors of cognitive function and functional disability, we propose a shared parameter model to handle missingness, correlation between outcomes, and the floor and ceiling effects of the MMSE measurements. The shared random effects in the proposed model handle missingness (either missing at random or missing not at random) and correlation between these outcomes, while the Tobit distribution handles the floor and ceiling effects of the MMSE measurements. We used data from the Chinese Longitudinal Healthy Longevity Survey (CLHLS) and a simulation study. By ignoring the MMSE floor and ceiling effects in the analyses of the CLHLS, the association of systolic blood pressure with cognitive function was not significant and the association of age with cognitive function was lower by 16.6% (from -6.237 to -5.201). By ignoring the MMSE floor and ceiling effects in the simulation study, the relative bias in the estimated association of female gender with cognitive function was 43 times higher (from -0.01 to -0.44). The estimated associations obtained with data missing at random were smaller than those with data missing not at random, demonstrating how the missing data mechanism affects the analytic results. Our work underscores the importance of proper model specification in longitudinal analysis of correlated outcomes subject to missingness and bounded values., (© 2022. The Psychonomic Society, Inc.)
- Published
- 2022
34. Bayesian spatio-temporal modelling and mapping of malaria and anaemia among children between 0 and 59 months in Nigeria.
- Author
-
Ibeji JU, Mwambi H, and Iddrisu AK
- Subjects
- Child, Humans, Bayes Theorem, Spatio-Temporal Analysis, Models, Statistical, Nigeria, Risk Factors, Malaria, Anemia
- Abstract
Background/Materials and Methods: A vital aspect of disease management and policy making lies in the understanding of the universal distribution of diseases. Nevertheless, due to differences across host groups and space-time outbreak activities, data are subject to intricacies. Herein, Bayesian spatio-temporal models were proposed to model and map malaria and anaemia risk ratios in space and time, to ascertain risk factors related to these diseases, and to identify the most endemic states in Nigeria. Parameter estimation was performed using the R integrated nested Laplace approximation (INLA) package, and the Deviance Information Criterion was applied to select the best model., Results: For malaria, model 7 (interaction with one random time effect, random walk), which suggests that the previous trend of an event cannot account for its future trend, had the least deviance. For anaemia, model 6 (interaction with one random time effect, ar1), which assumes that a previous event can be used to predict a future event, gave the least deviance., Discussion: Models 7 and 6 were selected to model and map malaria and anaemia, respectively, in Nigeria, because these models can borrow strength from adjacent states, in such a way that neighbouring states have the same risk. Changes in risk and clustering, with a high record of these diseases among states in Nigeria, were observed. However, despite these changes, the total risk of malaria and anaemia for 2010 and 2015 was unaffected., Conclusion: Notwithstanding the methods applied, this study will be valuable to the advancement of a spatio-temporal approach for analyzing malaria and anaemia risk in Nigeria., (© 2022. The Author(s).)
- Published
- 2022
35. Role of clusters in exclusive breastfeeding practices in Tanzania: A secondary analysis study using demographic and health survey data (2015/2016).
- Author
-
Jahanpour OF, Okango EL, Todd J, Mwambi H, and Mahande MJ
- Abstract
Background: While the benefits of exclusive breastfeeding are widely acknowledged, it continues to be a rare practice. Determinants of exclusive breastfeeding in Tanzania have been studied; however, the existence and contribution of regional variability to the practice have not been explored., Methods: Tanzania demographic and health survey data for 2015/2016 were used. Information on infants aged up to 6 months was abstracted. Exclusive breastfeeding was defined using a recall of feeding practices in the past 24 h. Enumeration areas and regions were treated as random effects. Models without random effects were compared with those that incorporated random effects using the Akaike information criterion. The determinants of exclusive breastfeeding were estimated using the generalized linear mixed model with enumeration areas nested within the region., Results: The generalized linear mixed model with an enumeration area nested within a region performed better than other models. The intra-cluster variability at region and enumeration area levels was 3.7 and 24.5%, respectively. The odds of practicing exclusive breastfeeding were lower for older and male infants, for mothers younger than 18, among mothers residing in urban areas, among those who were employed by a family member or someone else, those not assisted by a nurse/midwife, and those who were not counseled on exclusive breastfeeding within 2 days post-delivery. There was no statistical evidence of an association between exclusive breastfeeding practices and the frequency of listening to the radio and watching television. When mapping the proportion of exclusive breastfeeding, a variability of the practice is seen across regions., Conclusion: There is room to improve the proportion of those who practice exclusive breastfeeding in Tanzania. Beyond individual and setting factors, this analysis shows that a quarter of the variability in exclusive breastfeeding practices is at the community level. 
Further studies may explore the causes of variability at the regional and enumeration area levels and how they operate. Interventions to protect, promote, and support exclusive breastfeeding in Tanzania may target the environment that shapes the attitude toward exclusive breastfeeding in smaller geographical areas., Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest., (Copyright © 2022 Jahanpour, Okango, Todd, Mwambi and Mahande.)
- Published
- 2022
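The abstract of record 35 reports the share of variability in exclusive breastfeeding at the region and enumeration-area levels. One common way to obtain such shares from a logistic mixed model is the latent-variable approach, which fixes the individual-level residual variance at π²/3. The sketch below assumes that approach; the abstract does not say which formula the authors used, and the variance components shown are hypothetical placeholders, not values from the paper.

```python
import math

# Residual variance of the standard logistic distribution (latent-variable scale).
LOGISTIC_RESIDUAL_VAR = math.pi ** 2 / 3


def variance_shares(region_var: float, ea_var: float) -> dict:
    """Split total latent variance into region, enumeration-area and residual shares."""
    total = region_var + ea_var + LOGISTIC_RESIDUAL_VAR
    return {
        "region": region_var / total,
        "enumeration_area": ea_var / total,
        "residual": LOGISTIC_RESIDUAL_VAR / total,
    }


# Hypothetical random-intercept variances, for illustration only (not from the paper).
shares = variance_shares(region_var=0.2, ea_var=1.0)
for level, share in shares.items():
    print(f"{level}: {share:.1%}")
```

With fitted variance components substituted for the placeholders, the region and enumeration-area shares play the role of the percentages quoted in the abstract, under the stated assumption about the formula.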
36. Spatial variation and risk factors of malaria and anaemia among children aged 0 to 59 months: a cross-sectional study of 2010 and 2015 datasets.
- Author
-
Ibeji JU, Mwambi H, and Iddrisu AK
- Subjects
- Bayes Theorem, Child, Cross-Sectional Studies, Humans, Nigeria epidemiology, Prevalence, Risk Factors, Anemia etiology, Malaria complications, Malaria epidemiology
- Abstract
Malaria and anaemia are common diseases that affect children, particularly in Africa. Studies on the risk associated with these diseases and their synergy are scanty. This work aims to study the spatial pattern of malaria and anaemia in Nigeria and adjust for their risk factors using separate models for malaria and anaemia. This study used Bayesian spatial models within the Integrated Nested Laplace Approach (INLA) to establish the relationship between malaria and anaemia. We also adjust for risk factors of malaria and anaemia and map the estimated relative risks of these diseases to identify regions with a relatively high risk of the diseases under consideration. We used data obtained from the Nigeria malaria indicator survey (NMIS) of 2010 and 2015. The spatial variability distribution of both diseases was investigated using the convolution model, Conditional Auto-Regressive (CAR) model, generalized linear mixed model (GLMM) and generalized linear model (GLM) for each year. The convolution and generalized linear mixed models (GLMM) showed the least Deviance Information Criteria (DIC) in 2010 for malaria and anaemia, respectively. The Conditional Auto-Regressive (CAR) and convolution models had the least DIC in 2015 for malaria and anaemia, respectively. This study revealed that children in rural areas had strong and significant odds of malaria and anaemia infection [2010; malaria: AOR = 1.348, 95% CI = (1.117, 1.627), anaemia: AOR = 1.455, 95% CI = (1.201, 1.7623). 2015; malaria: AOR = 1.889, 95% CI = (1.568, 2.277), anaemia: AOR = 1.440, 95% CI = (1.205, 1.719)]. Controlling the prevalence of malaria and anaemia in Nigeria requires the identification of a child's location and proper confrontation of some socio-economic factors which may lead to the reduction of childhood malaria and anaemia infection., (© 2022. The Author(s).)
- Published
- 2022
37. Statistical Methods for the Analysis of Food Composition Databases: A Review.
- Author
-
Balakrishna Y, Manda S, Mwambi H, and van Graan A
- Subjects
- Cluster Analysis, Databases, Factual, Food, Food Analysis, Nutrients, Nutrition Policy
- Abstract
Evidence-based knowledge of the relationship between foods and nutrients is needed to inform dietary-based guidelines and policy. Proper and tailored statistical methods to analyse food composition databases (FCDBs) could assist in this regard. This review aims to collate the existing literature that used any statistical method to analyse FCDBs, to identify key trends and research gaps. The search strategy yielded 4238 references from electronic databases of which 24 fulfilled our inclusion criteria. Information on the objectives, statistical methods, and results was extracted. Statistical methods were mostly applied to group similar food items (37.5%). Other aims and objectives included determining associations between the nutrient content and known food characteristics (25.0%), determining nutrient co-occurrence (20.8%), evaluating nutrient changes over time (16.7%), and addressing the accuracy and completeness of databases (16.7%). Standard statistical tests (33.3%) were the most utilised followed by clustering (29.1%), other methods (16.7%), regression methods (12.5%), and dimension reduction techniques (8.3%). Nutrient data has unique characteristics such as correlated components, natural groupings, and a compositional nature. Statistical methods used for analysis need to account for this data structure. Our summary of the literature provides a reference for researchers looking to expand into this area.
- Published
- 2022
38. Use of a deep learning and random forest approach to track changes in the predictive nature of socioeconomic drivers of under-5 mortality rates in sub-Saharan Africa.
- Author
-
Nasejje JB, Mbuvha R, and Mwambi H
- Subjects
- Child, Child Mortality, Cross-Sectional Studies, Ghana epidemiology, Humans, Socioeconomic Factors, Deep Learning
- Abstract
Objectives: We used machine learning algorithms to track how the ranks of importance and the survival outcome of four socioeconomic determinants (place of residence, mother's level of education, wealth index and sex of the child) of under-5 mortality rate (U5MR) in sub-Saharan Africa have evolved., Settings: This work consists of multiple cross-sectional studies. We analysed data from the Demographic Health Surveys (DHS) collected from four countries; Uganda, Zimbabwe, Chad and Ghana, each randomly selected from the four subregions of sub-Saharan Africa., Participants: Each country has multiple DHS datasets and a total of 11 datasets were selected for analysis. A total of n=85 688 children were drawn from the eleven datasets., Primary and Secondary Outcomes: The primary outcome variable is U5MR; the secondary outcomes were to obtain the ranks of importance of the four socioeconomic factors over time and to compare the two machine learning models, the random survival forest (RSF) and the deep survival neural network (DeepSurv) in predicting U5MR., Results: Mother's education level ranked first in five datasets. Wealth index ranked first in three, place of residence ranked first in two and sex of the child ranked last in most of the datasets. The four factors showed a favourable survival outcome over time, confirming that past interventions targeting these factors are yielding positive results. The DeepSurv model has a higher predictive performance with mean concordance indexes (between 67% and 80%), above 50% compared with the RSF model., Conclusions: The study reveals that children under the age of 5 in sub-Saharan Africa have favourable survival outcomes associated with the four socioeconomic factors over time. 
It also shows that deep survival neural network models are efficient in predicting U5MR and should, therefore, be used in the big data era to draft evidence-based policies to achieve the third sustainable development goal., Competing Interests: Competing interests: None declared., (© Author(s) (or their employer(s)) 2022. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.)
- Published
- 2022
39. Dictionary Based Global Twitter Sentiment Analysis of Coronavirus (COVID-19) Effects and Response.
- Author
-
Okango E and Mwambi H
- Abstract
In December 2019, a new pandemic called the coronavirus began ravaging the world. By May 2020, the pandemic had caused great loss of life and disrupted ways of life in more ways than one. The nature of the disease saw several strategies to curb its spread rolled out. These strategies included closing of businesses and borders, restriction of movement, working from home, and mask mandates, among others. With these measures and their effects, many individuals have taken to social media to express their frustrations, opinions and how the pandemic is affecting them. This study employs a dictionary-based method for sentiment polarization from tweets related to coronavirus posted on Twitter. We also examine the co-occurrence of words to gain insights into the aspects affecting the masses. The results showed that mental health issues and lack of supplies were some of the direct effects of the pandemic. It was also clear that the COVID-19 prevention guidelines were well understood by those who tweeted. The results from this study may help governments combat the consequences of COVID-19, such as mental health issues and lack of supplies (e.g. food), and also gauge the effectiveness or the reach of their guidelines., Competing Interests: Conflict of interest: The author declares that no conflict of interest exists., (© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021.)
- Published
- 2022
40. Predictors of colorectal cancer survival using cox regression and random survival forests models based on gene expression data.
- Author
-
Mohammed M, Mboya IB, Mwambi H, Elbashir MK, and Omolo B
- Subjects
- Aged, Female, Humans, Male, Middle Aged, Risk Factors, Survival Analysis, Biomarkers, Tumor genetics, Colorectal Neoplasms genetics, Colorectal Neoplasms mortality, Gene Expression Regulation, Neoplastic
- Abstract
Understanding and identifying the markers and clinical information that are associated with colorectal cancer (CRC) patient survival is needed for early detection and diagnosis. In this work, we aimed to build a simple model using Cox proportional hazards (PH) and random survival forest (RSF) and find a robust signature for predicting CRC overall survival. We used stepwise regression to develop a Cox PH model to analyse 54 common differentially expressed genes from three mutations. RSF was applied using log-rank and log-rank-score based on 5000 survival trees, and variable importance was obtained to find the genes that are most influential for CRC survival. We compared the predictive performance of the Cox PH model and RSF for early CRC detection and diagnosis. The results indicate that the SLC9A8, IER5, ARSJ, ANKRD27, and PIPOX genes were significantly associated with CRC overall survival. In addition, age, sex, and stage also affected CRC overall survival. The RSF model using log-rank performed better than the one using log-rank-score, while log-rank-score needed more trees to stabilize. Overall, the imputation of missing values enhanced the model's predictive performance. In addition, Cox PH predictive performance was better than RSF., Competing Interests: The authors have declared that no competing interests exist.
- Published
- 2021
41. Spatio-temporal modelling of tick life-stage count data with spatially varying coefficients.
- Author
-
Lephoto T, Mwambi H, Bodhlyera O, and Gaff H
- Subjects
- Animals, Humans, Models, Statistical, Spatio-Temporal Analysis, Ticks
- Abstract
There is a vast amount of geo-referenced data in many fields of study including ecological studies. Geo-referencing is usually by point referencing; that is, latitudes and longitudes or by areal referencing, which includes districts, counties, states, provinces and other administrative units. The availability of large geo-referenced datasets for modelling has necessitated the development and application of spatial statistical methods. However, spatial varying coefficients models exploring the abundance of tick counts remain limited. In this study we used data that was collected and prepared by researchers in the Department of Biological Sciences from the Old Dominion University, Virginia, USA. We modelled tick life-stage counts and abundance variability from 12 sampling locations, with 5 different habitats (numbered 1-5), three habitat types; namely: woods, edges and grass; collected monthly from May 2009 through December 2018. Spatio-temporal Poisson and spatio-temporal negative binomial (NB) count data models were fitted to the data and compared using the deviance information criteria (DIC). The NB model outperformed the Poisson models with all its DIC values being smaller than those of the Poisson model. Results showed that the covariates varied spatially across counties. There was a decreasing time (in years) effect over the study period. However, even though the time effect was decreasing over the study period, space-time interaction effects were seen to be increasing over time in York County.
- Published
- 2021
42. Identifying Nutrient Patterns in South African Foods to Support National Nutrition Guidelines and Policies.
- Author
-
Balakrishna Y, Manda S, Mwambi H, and van Graan A
- Subjects
- Databases, Factual, Diet, Healthy, Energy Intake, Food, Humans, Malnutrition, Micronutrients, Minerals, Nutritional Status, Obesity, Principal Component Analysis, South Africa, Vitamins, Diet, Feeding Behavior, Nutrients, Nutrition Policy, Nutritive Value
- Abstract
Food composition databases (FCDBs) provide the nutritional content of foods and are essential for developing nutrition guidance and effective intervention programs to improve nutrition of a population. In public and nutritional health research studies, FCDBs are used in the estimation of nutrient intake profiles at the population levels. However, such studies investigating nutrient co-occurrence and profile patterns within the African context are very rare. This study aimed to identify nutrient co-occurrence patterns within the South African FCDB (SAFCDB). A principal component analysis (PCA) was applied to 28 nutrients and 971 foods in the South African FCDB to determine compositionally similar food items. A second principal component analysis was applied to the food items for validation. Eight nutrient patterns (NPs) explaining 73.4% of the nutrient variation among foods were identified: (1) high magnesium and manganese; (2) high copper and vitamin B12; (3) high animal protein, niacin, and vitamin B6; (4) high fatty acids and vitamin E; (5) high calcium, phosphorus and sodium; (6) low moisture and high available carbohydrate; (7) high cholesterol and vitamin D; and (8) low zinc and high vitamin C. Similar food patterns (FPs) were identified from a PCA on food items, yielding subgroups such as dark-green leafy vegetables and orange-coloured fruit and vegetables. One food pattern was associated with high sodium levels and contained bread, processed meat and seafood, canned vegetables, and sauces. The data-driven nutrient and food patterns found in this study were consistent with and support the South African food-based dietary guidelines and the national salt regulations.
- Published
- 2021
43. A stacking ensemble deep learning approach to cancer type classification based on TCGA data.
- Author
-
Mohammed M, Mwambi H, Mboya IB, Elbashir MK, and Omolo B
- Subjects
- Humans, Female, Support Vector Machine, Algorithms, Computational Biology methods, Deep Learning, Neoplasms genetics, Neoplasms classification, Neural Networks, Computer
- Abstract
Cancer tumor classification based on morphological characteristics alone has been shown to have serious limitations. Breast, lung, colorectal, thyroid, and ovarian are the most commonly diagnosed cancers among women. Precise classification of cancers into their types is considered a vital problem for cancer diagnosis and therapy. In this paper, we proposed a stacking ensemble deep learning model based on a one-dimensional convolutional neural network (1D-CNN) to perform a multi-class classification on the five common cancers among women based on RNASeq data. The RNASeq gene expression data was downloaded from the Pan-Cancer Atlas using the GDCquery function of the TCGAbiolinks package in the R software. We used the least absolute shrinkage and selection operator (LASSO) as the feature selection method. We compared the results of the new proposed model, with and without LASSO, with the results of the single 1D-CNN and machine learning methods, which include support vector machines with radial basis function, linear, and polynomial kernels; artificial neural networks; k-nearest neighbors; and bagging trees. The results show that the proposed model with and without LASSO has a better performance compared to other classifiers. Also, the results show that the machine learning methods (SVM-R, SVM-L, SVM-P, ANN, KNN, and bagging trees) with under-sampling have better performance than with over-sampling techniques. This is supported by the statistical significance test of accuracy, where the p-values for differences between SVM-R and SVM-P, SVM-R and ANN, and SVM-R and KNN are found to be p = 0.003, p < 0.001, and p < 0.001, respectively. Also, SVM-L had a significant difference compared to ANN (p = 0.009). Moreover, SVM-P and ANN, and SVM-P and KNN, are found to be significantly different with p-values p < 0.001 and p < 0.001, respectively. In addition, ANN and bagging trees, and ANN and KNN, were found to be significantly different with p-values p < 0.001 and p = 0.004, respectively.
Thus, the proposed model can help in the early detection and diagnosis of cancer in women, and hence aid in designing early treatment strategies to improve survival., (© 2021. The Author(s).)
- Published
- 2021
- Full Text
- View/download PDF
44. Bayesian Spatial Modeling of Anemia among Children under 5 Years in Guinea.
- Author
-
Barry TS, Ngesa O, Onyango NO, and Mwambi H
- Subjects
- Adolescent, Africa, Bayes Theorem, Child, Child, Preschool, Female, Guinea epidemiology, Humans, Infant, Prevalence, Risk Factors, Anemia epidemiology
- Abstract
Anemia is a major public health problem in Africa, affecting an increasing number of children under five years. Guinea is one of the most affected countries. In 2018, the prevalence rate in Guinea was 75% for children under five years. This study sought to identify the factors associated with anemia and to map spatial variation of anemia across the eight (8) regions in Guinea for children under five years, which can provide guidance for control programs for the reduction of the disease. Data from the Guinea Multiple Indicator Cluster Survey (MICS5) 2016 was used for this study. A total of 2609 children under five years who had full covariate information were used in the analysis. Spatial binomial logistic regression methodology was undertaken via Bayesian estimation based on Markov chain Monte Carlo (MCMC) using WinBUGS software version 1.4. The findings in this study revealed that 77% of children under five years in Guinea had anemia, and the prevalences in the regions ranged from 70.32% (Conakry) to 83.60% (NZerekore) across the country. After adjusting for non-spatial and spatial random effects in the model, older children (48-59 months) (OR: 0.47, CI [0.29 0.70]) were less likely to be anemic compared to those who are younger (0-11 months). Children whose mothers had completed secondary school or above had a 33% reduced risk of anemia (OR: 0.67, CI [0.49 0.90]), and children from household heads from the Kissi ethnic group are less likely to have anemia than their counterparts whose leaders are from Soussou (OR: 0.48, CI [0.23 0.92]).
- Published
- 2021
- Full Text
- View/download PDF
45. Short-term real-time prediction of total number of reported COVID-19 cases and deaths in South Africa: a data driven approach.
- Author
-
Reddy T, Shkedy Z, Janse van Rensburg C, Mwambi H, Debba P, Zuma K, and Manda S
- Subjects
- COVID-19 mortality, Humans, Logistic Models, Models, Statistical, South Africa epidemiology, COVID-19 epidemiology, SARS-CoV-2
- Abstract
Background: The rising burden of the ongoing COVID-19 epidemic in South Africa has motivated the application of modeling strategies to predict the COVID-19 cases and deaths. Reliable and accurate short and long-term forecasts of COVID-19 cases and deaths, both at the national and provincial level, are a key aspect of the strategy to handle the COVID-19 epidemic in the country., Methods: In this paper we apply the previously validated approach of phenomenological models, fitting several non-linear growth curves (Richards, 3 and 4 parameter logistic, Weibull and Gompertz), to produce short term forecasts of COVID-19 cases and deaths at the national level as well as the provincial level. Using publicly available daily reported cumulative case and death data up until 22 June 2020, we report 5, 10, 15, 20, 25 and 30-day ahead forecasts of cumulative cases and deaths. All predictions are compared to the actual observed values in the forecasting period., Results: We observed that all models for cases provided accurate and similar short-term forecasts for a period of 5 days ahead at the national level, and that the three and four parameter logistic growth models provided more accurate forecasts than that obtained from the Richards model 10 days ahead. However, beyond 10 days all models underestimated the cumulative cases. Our forecasts across the models predict an additional 23,551-26,702 cases in 5 days and an additional 47,449-57,358 cases in 10 days. While the three parameter logistic growth model provided the most accurate forecasts of cumulative deaths within the 10 day period, the Gompertz model was able to better capture the changes in cumulative deaths beyond this period. 
Our forecasts across the models predict an additional 145-437 COVID-19 deaths in 5 days and an additional 243-947 deaths in 10 days., Conclusions: By comparing both the predictions of deaths and cases to the observed data in the forecasting period, we found that this modeling approach provides reliable and accurate forecasts for a maximum period of 10 days ahead.
- Published
- 2021
- Full Text
- View/download PDF
46. Developing excellence in biostatistics leadership, training and science in Africa: How the Sub-Saharan Africa Consortium for Advanced Biostatistics (SSACAB) training unites expertise to deliver excellence.
- Author
-
Chirwa TF, Matsena Zingoni Z, Munyewende P, Manda SO, Mwambi H, Kandala NB, Kinyanjui S, Young T, Musenge E, Simbeye J, Musonda P, Mahande MJ, Weke P, Onyango NO, Kazembe L, Tumwesigye NM, Zuma K, Yende-Zuma N, Omanyondo Ohambe MC, Kweku EN, Maposa I, Ayele B, Achia T, Machekano R, Thabane L, Levin J, Eijkemans MJC, Carpenter J, Chasela C, Klipstein-Grobusch K, and Todd J
- Abstract
The increase in health research in sub-Saharan Africa (SSA) has led to a high demand for biostatisticians to develop study designs, contribute and apply statistical methods in data analyses. Initiatives exist to address the dearth in statistical capacity and lack of local biostatisticians in SSA health projects. The Sub-Saharan Africa Consortium for Advanced Biostatistics (SSACAB) led by African institutions was initiated to improve biostatistical capacity according to the needs identified by African institutions, through collaborative masters and doctoral training in biostatistics. SSACAB has created a critical mass of biostatisticians and a network of institutions over the last five years and has strengthened biostatistics resources and capacity for health research studies in SSA. SSACAB comprises 11 universities and four research institutions which are supported by four European universities. In 2015, only four universities had established Masters programmes in biostatistics and SSACAB supported the remaining seven to develop Masters programmes. In 2019 the University of the Witwatersrand became the first African institution to gain Royal Statistical Society accreditation for a Biostatistics Masters programme. A total of 150 fellows have been awarded scholarships to date of which 123 are Masters fellowships (41 female) of whom 58 have already graduated. Graduates have been employed in African academic (19) and research (15) institutions and 10 have enrolled for PhD studies. A total of 27 (10 female) PhD fellowships have been awarded; 4 of them are due to graduate by 2020. To date, SSACAB Masters and PhD students have published 17 and 31 peer-reviewed articles, respectively. SSACAB has also facilitated well-attended conferences, face-to-face and online short courses. 
Pooling of limited biostatistics resources in SSA combined with co-funding from external partners has shown to be an effective strategy for the development and teaching of advanced biostatistics methods, supervision and mentoring of PhD candidates., Competing Interests: No competing interests were disclosed., (Copyright: © 2020 Chirwa TF et al.)
- Published
- 2020
- Full Text
- View/download PDF
47. Modelling HIV disease process and progression in seroconversion among South Africa women: using transition-specific parametric multi-state model.
- Author
-
Dessie ZG, Zewotir T, Mwambi H, and North D
- Subjects
- Adult, CD4 Lymphocyte Count, Disease Progression, Female, Humans, Longitudinal Studies, Probability, Prospective Studies, Sexual Partners, South Africa, Viral Load, HIV Infections immunology, Seroconversion
- Abstract
Background: HIV infected patients may experience many intermediate events including between-event transition throughout their follow up. Through modelling these transitions, we can gain a deeper understanding of HIV disease process and progression and of factors that influence the disease process and progression pathway. In this work, we present transition-specific parametric multi-state models to describe HIV disease process and progression., Methods: The data is from an ongoing prospective cohort study conducted amongst adult women who were HIV-infected in KwaZulu-Natal, South Africa. Participants were enrolled during the acute HIV infection phase and then followed up during chronic infection, up to ART initiation., Results: Transition specific distributions for multi-state models, including a variety of accelerated failure time (AFT) models and proportional hazards (PH) models, were presented and compared in this study. The analysis revealed that women enrolling with a CD4 count less than 350 cells/mm3 (severe and advanced disease stages) had a far lower chance of immune recovery, and a considerably higher chance of immune deterioration, compared to women enrolling with a CD4 count of 350 cells/mm3 or more (normal and mild disease stages). Our analyses also showed that older age, higher educational levels, higher scores for red blood cell counts, higher mononuclear scores, higher granulocytes scores, and higher physical health scores, all had a significant effect on a shortened time to immunological recovery, while women with many sex partners, higher viral load and larger family size had a significant effect on accelerating time to immune deterioration., Conclusion: Multi-state modelling of transition-specific distributions offers a flexible tool for the study of demographic and clinical characteristics' effects on the entire disease progression pathway. It is hoped that the article will help applied researchers to familiarize themselves with the models, including interpretation of results.
- Published
- 2020
- Full Text
- View/download PDF
48. Multilevel ordinal model for CD4 count trends in seroconversion among South Africa women.
- Author
-
Dessie ZG, Zewotir T, Mwambi H, and North D
- Subjects
- Adolescent, Adult, Age Factors, CD4 Lymphocyte Count trends, Female, Follow-Up Studies, Humans, Longitudinal Studies, Middle Aged, Prospective Studies, Sexual Partners, South Africa, Viral Load, Young Adult, HIV Infections immunology, Models, Statistical, Multilevel Analysis methods, Seroconversion
- Abstract
Background: Ordinal health longitudinal response variables have distributions that make them unsuitable for many popular statistical models that assume normality. We present a multilevel growth model that may be more suitable for medical ordinal longitudinal outcomes than are statistical models that assume normality and continuous measurements., Methods: The data is from an ongoing prospective cohort study conducted amongst adult women who are HIV-infected patients in KwaZulu-Natal, South Africa. Participants were enrolled during acute infection and followed through early infection, established infection, and afterward on cART. Generalized linear multilevel models were applied., Results: Multilevel ordinal non-proportional and proportional-odds growth models were presented and compared. We observed that the effects of covariates can't be assumed identical across the three cumulative logits. Our analyses also revealed that the rate of change of immune recovery of patients increased as the follow-up time increases. Patients with stable sexual partners, middle-aged, cART initiation, and higher educational levels were more likely to have better immunological stages with time. Similarly, patients having high electrolytes component scores, higher red blood cell indices scores, higher physical health scores, higher psychological well-being scores, a higher level of independence scores, and lower viral load were more likely to have better immunological stages through the follow-up time., Conclusion: It can be concluded that the multilevel non-proportional-odds method provides a flexible modeling alternative when the proportional-odds assumption of equal effects of the predictor variables at every stage of the response variable is violated. Having higher clinical parameter scores, higher QoL scores, higher educational levels, and stable sexual partners were found to be the significant factors for trends of CD4 count recovery.
- Published
- 2020
- Full Text
- View/download PDF
49. Modeling Viral Suppression, Viral Rebound and State-Specific Duration of HIV Patients with CD4 Count Adjustment: Parametric Multistate Frailty Model Approach.
- Author
-
Dessie ZG, Zewotir T, Mwambi H, and North D
- Abstract
Introduction: Combination antiretroviral therapy has become the standard care of human immunodeficiency virus (HIV)-infected patients and has further led to a dramatically decreased progression probability to acquired immune deficiency syndrome (AIDS) for patients under such a therapy. However, responses of the patients to this therapy have recorded heterogeneous complexity and high dynamism. In this paper, we simultaneously model long-term viral suppression, viral rebound, and state-specific duration of HIV-infected patients., Methods: Full-parametric and semi-parametric Markov multistate models were applied to assess the effects of covariates namely TB co-infection, educational status, marital status, age, quality of life (QoL) scores, white and red blood cell parameters, and liver enzyme abnormality on long-term viral suppression, viral rebound and state-specific duration for HIV-infected individuals before and after treatment. Furthermore, two models, one including and another excluding the effect of the frailty, were presented and compared in this study., Results: Results from the diagnostic plots, Akaike information criterion (AIC) and likelihood ratio test showed that the Weibull multistate frailty model fitted significantly better than the exponential and semi-parametric multistate models. Viral rebound was found to be significantly associated with many sex partners, higher eosinophils count, younger age, lower educational level, higher monocyte counts, having abnormal neutrophils count, and higher liver enzyme abnormality. Furthermore, viral suppression was also found to be significantly associated with higher QoL scores, and having a stable sex partner. 
The analysis result also showed that patients with a stable sex partner, higher educational levels, higher QoL scores, lower eosinophils count, lower monocyte counts, and higher RBC indices were more likely to spend more time in undetectable viral load state., Conclusions: To achieve and maintain the UNAIDS 90% suppression targets, additional interventions are required to optimize antiretroviral therapy outcomes, specifically targeting those with poor clinical characteristics, lower education, younger age, and those with many sex partners. From a methodological perspective, the parametric multistate approach with frailty is a flexible approach for modeling time-varying variables, allowing for dealing with heterogeneity between the sequence of transitions, as well as allowing for a reasonable degree of flexibility with a few additional parameters, which then aids in gaining a better insight into how factors change over time.
- Published
- 2020
- Full Text
- View/download PDF
50. Modelling immune deterioration, immune recovery and state-specific duration of HIV-infected women with viral load adjustment: using parametric multistate model.
- Author
-
Dessie ZG, Zewotir T, Mwambi H, and North D
- Subjects
- Adult, Anti-Retroviral Agents therapeutic use, Biomarkers blood, Disease Progression, Female, HIV Infections drug therapy, HIV Infections virology, Humans, Middle Aged, Prospective Studies, Quality of Life, South Africa, CD4 Lymphocyte Count statistics & numerical data, HIV Infections diagnosis, Markov Chains, Models, Statistical, Viral Load statistics & numerical data
- Abstract
Background: CD4 cell and viral load count are highly correlated surrogate markers of human immunodeficiency virus (HIV) disease progression. In modelling the progression of HIV, previous studies mostly dealt with either CD4 cell counts or viral load alone. In this work, both biomarkers are included in one model, in order to study possible factors that affect the intensities of immune deterioration, immune recovery and state-specific duration of HIV-infected women., Methods: The data is from an ongoing prospective cohort study conducted among antiretroviral treatment (ART) naïve HIV-infected women in the province of KwaZulu-Natal, South Africa. Participants were enrolled in the acute HIV infection phase, then followed-up during chronic infection up to ART initiation. Full-parametric and semi-parametric Markov models were applied. Furthermore, the effect of the inclusion and exclusion of viral load in the model was assessed., Results: Inclusion of a viral load component improves the efficiency of the model. The analysis results showed that patients who reported a stable sexual partner, having a higher educational level, higher physical health score and having a high mononuclear component score are more likely to spend more time in a good HIV state (particularly normal disease state). Patients with TB co-infection, with anemia, having a high liver abnormality score and patients who reported many sexual partners, had a significant increase in the intensities of immunological deterioration transitions. On the other hand, having high weight, higher education level, higher quality of life score, having high RBC parameters, high granulocyte component scores and high mononuclear component scores, significantly increased the intensities of immunological recovery transitions., Conclusion: Inclusion of both CD4 cell count based disease progression states and viral load, in the time-homogeneous Markov model, assisted in modeling the complete disease progression of HIV/AIDS. 
Higher quality of life (QoL) domain scores, good clinical characteristics, stable sexual partner and higher educational level were found to be predictive factors for transition and length of stay in sequential adversity of HIV/AIDS.
- Published
- 2020
- Full Text
- View/download PDF