Identification of Enriched Regions in ChIP-Seq Data via a Linear-Time Multi-Level Thresholding Algorithm
- Source :
- IEEE/ACM Transactions on Computational Biology and Bioinformatics. 19:2842-2850
- Publication Year :
- 2022
- Publisher :
- Institute of Electrical and Electronics Engineers (IEEE), 2022.
Abstract
- Chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has emerged as a superior alternative to microarray technology, as it provides higher resolution, less noise, greater coverage, and a wider dynamic range. While ChIP-Seq enables probing of DNA-protein interactions over the entire genome, it requires sophisticated tools to recognize hidden patterns and extract meaningful data. Over the years, various attempts have produced several algorithms that use different heuristics to accurately determine individual peaks corresponding to unique DNA-protein interactions. However, finding all the significant peaks with high accuracy in a reasonable time is still a challenge. In this work, we propose a multi-level thresholding algorithm, which we call LinMLTBS, to identify the enriched regions in ChIP-Seq data. Although various suboptimal heuristics have been proposed for multi-level thresholding, we emphasize the use of an algorithm capable of obtaining an optimal solution while maintaining linear-time complexity. Testing several algorithms on various ENCODE project datasets shows that our approach attains higher accuracy than previously proposed peak finders while retaining a reasonable processing speed.
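- Illustrative sketch (not part of the published record): the abstract contrasts suboptimal heuristics with an optimal multi-level thresholding. The Python below is a minimal sketch of one way to obtain an optimal thresholding of a histogram, assuming the criterion is the total within-class weighted squared deviation. It is a plain quadratic-time dynamic program, not the linear-time LinMLTBS algorithm, and the function name `optimal_thresholds` and its inputs (`hist`, a list of bin counts; `k`, the number of classes) are hypothetical.

```python
from itertools import accumulate


def optimal_thresholds(hist, k):
    """Split bins 0..n-1 into k contiguous, non-empty classes so that the
    total within-class weighted squared deviation is minimal.

    Returns the k-1 cut indices; each cut is the first bin of a new class.
    Assumption: this criterion is chosen for illustration only.
    """
    n = len(hist)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(hist)")

    # Prefix sums of count, count*value and count*value^2 (value = bin index).
    p0 = [0] + list(accumulate(hist))
    p1 = [0] + list(accumulate(h * i for i, h in enumerate(hist)))
    p2 = [0] + list(accumulate(h * i * i for i, h in enumerate(hist)))

    def cost(a, b):
        """Weighted squared deviation of bins a..b-1 around their mean."""
        w = p0[b] - p0[a]
        if w == 0:
            return 0.0
        s = p1[b] - p1[a]
        return (p2[b] - p2[a]) - s * s / w

    inf = float("inf")
    # best[c][b]: minimal cost of covering bins 0..b-1 with c classes.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for c in range(1, k + 1):
        for b in range(c, n + 1):
            for a in range(c - 1, b):
                v = best[c - 1][a] + cost(a, b)
                if v < best[c][b]:
                    best[c][b] = v
                    back[c][b] = a

    # Walk the back-pointers from the last class to the first.
    cuts = []
    b = n
    for c in range(k, 1, -1):
        b = back[c][b]
        cuts.append(b)
    return cuts[::-1]
```

For example, `optimal_thresholds([9, 8, 1, 1, 8, 9], 2)` returns `[3]`, placing the single cut between the two symmetric modes. The triple loop costs O(k n^2) time, which is why a linear-time optimal method of the kind the abstract describes matters for genome-scale data.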
- Subjects :
- Chromatin Immunoprecipitation
Binding Sites
Dynamic range
Computer science
business.industry
Applied Mathematics
0206 medical engineering
Pattern recognition
DNA
Sequence Analysis, DNA
02 engineering and technology
ENCODE
Chip
Thresholding
Identification (information)
Genetics
Chromatin Immunoprecipitation Sequencing
Artificial intelligence
Noise (video)
business
Heuristics
Time complexity
Algorithms
020602 bioinformatics
Biotechnology
Details
- ISSN :
- 2374-0043 and 1545-5963
- Volume :
- 19
- Database :
- OpenAIRE
- Journal :
- IEEE/ACM Transactions on Computational Biology and Bioinformatics
- Accession number :
- edsair.doi.dedup.....038c0ea99fb7035da47fad0d2539a8f2