Nicola J. Mulder, Ezekiel Adebiyi, Marion Adebiyi, Seun Adeyemi, Azza Ahmed, Rehab Ahmed, Bola Akanle, Mohamed Alibi, Don L. Armstrong, Shaun Aron, Efejiro Ashano, Shakuntala Baichoo, Alia Benkahla, David K. Brown, Emile R. Chimusa, Faisal M. Fadlelmola, Dare Falola, Segun Fatumo, Kais Ghedira, Amel Ghouila, Scott Hazelhurst, Itunuoluwa Isewon, Segun Jung, Samar Kamal Kassim, Jonathan K. Kayondo, Mamana Mbiyavanga, Ayton Meintjes, Somia Mohammed, Abayomi Mosaku, Ahmed Moussa, Mustafa Muhammd, Zahra Mungloo-Dilmohamud, Oyekanmi Nashiru, Trust Odia, Adaobi Okafor, Olaleye Oladipo, Victor Osamor, Jellili Oyelade, Khalid Sadki, Samson Pandam Salifu, Jumoke Soyemi, Sumir Panji, Fouzia Radouani, Oussama Souiai, Özlem Tastan Bishop, The H3ABioNet Consortium, as Members of the H3Africa Consortium, University of Cape Town, Department of Computer and Information Sciences, Covenant University, Covenant University Bioinformatics Research (CUBRe), University of Khartoum, Laboratoire de Bioinformatique, biomathématiques, biostatistiques (BIMS) (LR11IPT09), Institut Pasteur de Tunis, Réseau International des Instituts Pasteur (RIIP)-Réseau International des Instituts Pasteur (RIIP)-Université de Tunis El Manar (UTM), University of Illinois at Urbana-Champaign [Urbana], University of Illinois System, University of the Witwatersrand [Johannesburg] (WITS), Federal Ministry of Science and Technology [Abuja] (FMST), University of Mauritius, Rhodes University, Grahamstown, Institute of Infectious Diseases and Molecular Medicine (IDM), Future University of Sudan, Laboratoire de Transmission, Contrôle et Immunobiologie des Infections - Laboratory of Transmission, Control and Immunobiology of Infection (LR11IPT02), Réseau International des Instituts Pasteur (RIIP)-Réseau International des Instituts Pasteur (RIIP), Computation Institute [Chicago], University of Chicago, Université Ain Shams, Uganda Virus Research Institute (UVRI), Laboratoire des Technologies de l'Information et de la Communication [Tanger] (Labtic), Ecole Nationale des Sciences Appliquées [Tanger] (ENSAT), Landmark University [Omu-Aran], Université Mohammed V, Kwame Nkrumah University of Science and Technology [GHANA] (KNUST), École polytechnique fédérale d'Ilaro, Institut Pasteur du Maroc, Réseau International des Instituts Pasteur (RIIP)
H3ABioNet is supported by the National Institutes of Health Common Fund (grant number U41HG006941).
Background: Although pockets of bioinformatics excellence have developed in Africa, large-scale genomic data analysis has generally been limited by the availability of expertise and infrastructure. H3ABioNet, a pan-African bioinformatics network, was established to build capacity specifically to enable H3Africa (Human Heredity and Health in Africa) researchers to analyze their data in Africa. Since the inception of the H3Africa initiative, H3ABioNet’s role has evolved in response to changing needs from the consortium and the African bioinformatics community.
Objectives: H3ABioNet set out to develop core bioinformatics infrastructure and capacity for genomics research in various aspects of data collection, transfer, storage, and analysis.
Methods and Results: Various resources have been developed to address the genomic data management and analysis needs of H3Africa researchers and other scientific communities on the continent. NetMap was developed and used to build an accurate picture of network performance within Africa and between Africa and the rest of the world, and Globus Online has been rolled out to facilitate data transfer. A participant recruitment database was developed to monitor participant enrollment, and data are being harmonized through the use of ontologies and controlled vocabularies. The standardized metadata will be integrated to provide a search facility for H3Africa data and biospecimens. Because H3Africa projects are generating large-scale genomic data, facilities for analysis and interpretation are critical. H3ABioNet is implementing several data analysis platforms that provide a large range of bioinformatics tools or workflows, such as Galaxy, the Job Management System, and eBiokits. A set of reproducible, portable, and cloud-scalable pipelines to support the multiple H3Africa data types is also being developed and dockerized to enable execution on multiple computing infrastructures. In addition, new tools have been developed for analysis of the uniquely divergent African data and for downstream interpretation of prioritized variants. To provide support for these and other bioinformatics queries, an online bioinformatics helpdesk backed by broad consortium expertise has been established. Further support is provided by means of various modes of bioinformatics training.
Conclusions: Over the past 4 years, the development of infrastructure support and human capacity through H3ABioNet has contributed significantly to the establishment of African scientific networks, data analysis facilities, and training programs. Here, we describe the infrastructure and how it has affected genomics and bioinformatics research in Africa.
Highlights
H3ABioNet is building capacity to enable analysis of genomic data in Africa.
Infrastructure has been built for clinical and genomic data storage, management, and analysis.
New algorithms and pipelines for African genomic data analysis have been developed.
Data are being harmonized using ontologies to enable easy search and retrieval.
Genomics training is implemented using various online and face-to-face approaches.