A supervised learning framework for chromatin loop detection in genome-wide contact maps
- Source :
- Nature Communications, Vol 11, Iss 1, Pp 1-12 (2020), Nature Communications
- Publication Year :
- 2020
- Publisher :
- Nature Publishing Group, 2020.
Abstract
- Accurately predicting chromatin loops from genome-wide interaction matrices such as Hi-C data is critical to deepening our understanding of proper gene regulation. Current approaches are mainly focused on searching for statistically enriched dots on a genome-wide map. However, given the availability of orthogonal data types such as ChIA-PET, HiChIP, Capture Hi-C, and high-throughput imaging, a supervised learning approach could facilitate the discovery of a comprehensive set of chromatin interactions. Here, we present Peakachu, a Random Forest classification framework that predicts chromatin loops from genome-wide contact maps. We compare Peakachu with current enrichment-based approaches, and find that Peakachu identifies a unique set of short-range interactions. We show that our models perform well in different platforms, across different sequencing depths, and across different species. We apply this framework to predict chromatin loops in 56 Hi-C datasets, and release the results at the 3D Genome Browser.
- Predicting chromatin loops from genome-wide interaction matrices such as Hi-C data provides insight into gene regulation events. Here, the authors present Peakachu, a Random Forest classification framework that predicts chromatin loops from genome-wide contact maps, and apply it to systematically predict chromatin loops in 56 Hi-C datasets, with results available at the 3D Genome Browser.
- Subjects :
- 0301 basic medicine
Sprite (computer graphics)
Computer science
Science
General Physics and Astronomy
Computational biology
Genome browser
Genome informatics
computer.software_genre
Chromatin structure
Data type
Genome
Article
General Biochemistry, Genetics and Molecular Biology
Gene regulatory networks
Set (abstract data type)
chemistry.chemical_compound
03 medical and health sciences
0302 clinical medicine
Species Specificity
Databases, Genetic
Humans
lcsh:Science
030304 developmental biology
Regulation of gene expression
0303 health sciences
Multidisciplinary
Supervised learning
Sequence Analysis, DNA
General Chemistry
Chromatin
Random forest
030104 developmental biology
chemistry
Organ Specificity
Chromatin Loop
lcsh:Q
Supervised Machine Learning
Data mining
K562 Cells
computer
DNA
030217 neurology & neurosurgery
Details
- Language :
- English
- ISSN :
- 20411723
- Volume :
- 11
- Issue :
- 1
- Database :
- OpenAIRE
- Journal :
- Nature Communications
- Accession number :
- edsair.doi.dedup.....5b8a14dc4070f8a0e5f954b4c0a809e7