Identification of haploinsufficient genes from epigenomic data using deep forest.

Authors :
Yang, Yuning
Li, Shaochuan
Wang, Yunhe
Ma, Zhiqiang
Wong, Ka-Chun
Li, Xiangtao
Source :
Briefings in Bioinformatics. Sep 2021, Vol. 22, Issue 5, p1-11. 11p.
Publication Year :
2021

Abstract

Haploinsufficiency, wherein a single allele is not enough to maintain normal functions, can lead to many diseases, including cancers and neurodevelopmental disorders. Recently, computational methods for identifying haploinsufficiency have been developed. However, most of these methods suffer from study bias, experimental noise and instability, resulting in unsatisfactory identification of haploinsufficient genes. To address these challenges, we propose a deep forest model, called HaForest, to identify haploinsufficient genes. Multiscale scanning is proposed to extract local contextual representations from input features under Linear Discriminant Analysis. A cascade forest structure is then applied to obtain the concatenated features directly by integrating decision-tree-based forests. Meanwhile, to exploit the complex dependency structure among haploinsufficient genes, the LightGBM library is embedded into HaForest to reveal the highly expressive features. To validate the effectiveness of our method, we compared it with several computational methods and four deep learning algorithms on five epigenomic data sets. The results reveal that HaForest achieves superior performance over the other algorithms, demonstrating its unique and complementary performance in identifying haploinsufficient genes. The standalone tool is available at https://github.com/yangyn533/HaForest. [ABSTRACT FROM AUTHOR]
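
The two generic deep forest building blocks named in the abstract (window scanning over the input features, and a cascade level that concatenates forest outputs with the features) can be sketched roughly as follows. This is an illustrative sketch only, not the HaForest implementation: the function names `sliding_windows` and `cascade_level`, the window width, the forest sizes and the synthetic data are all assumptions, scikit-learn forests stand in for the LightGBM component, and the Linear Discriminant Analysis step is omitted because the abstract does not say how it is applied.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_predict


def sliding_windows(X, width):
    """Scanning step (hypothetical helper): cut each feature vector
    into overlapping windows of `width` consecutive features."""
    n_features = X.shape[1]
    return np.stack(
        [X[:, i:i + width] for i in range(n_features - width + 1)], axis=1
    )


def cascade_level(X, y, seed=0):
    """One cascade level (hypothetical helper): append out-of-fold class
    probabilities from two forests to the original features."""
    forests = [
        RandomForestClassifier(n_estimators=50, random_state=seed),
        ExtraTreesClassifier(n_estimators=50, random_state=seed),
    ]
    # Out-of-fold predictions keep each sample's own label from leaking
    # into the features passed on to the next level.
    probas = [
        cross_val_predict(f, X, y, cv=3, method="predict_proba")
        for f in forests
    ]
    return np.hstack([X] + probas)


# Synthetic stand-in data; real inputs would be per-gene epigenomic features.
X, y = make_classification(n_samples=200, n_features=12, random_state=0)
windows = sliding_windows(X, width=4)  # one row of windows per sample
augmented = cascade_level(X, y)        # original features + forest outputs
```

In a full cascade, `augmented` would be fed to a further level, and levels would be added until validation performance stops improving; a gradient-boosting learner could be placed in the `forests` list alongside the two forests shown here.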

Details

Language :
English
ISSN :
1467-5463
Volume :
22
Issue :
5
Database :
Academic Search Index
Journal :
Briefings in Bioinformatics
Publication Type :
Academic Journal
Accession number :
152975130
Full Text :
https://doi.org/10.1093/bib/bbaa393