BLSSpeller: exhaustive comparative discovery of conserved cis-regulatory elements
- Source :
- Bioinformatics, BIOINFORMATICS
- Publication Year :
- 2015
- Publisher :
- Oxford University Press (OUP), 2015.
Abstract
- Motivation: The accurate discovery and annotation of regulatory elements remains a challenging problem. The growing number of sequenced genomes creates new opportunities for comparative approaches to motif discovery. Putative binding sites are then considered to be functional if they are conserved in orthologous promoter sequences of multiple related species. Existing methods for comparative motif discovery usually rely on pregenerated multiple sequence alignments, which are difficult to obtain for more diverged species such as plants. As a consequence, misaligned regulatory elements often remain undetected.
- Results: We present a novel algorithm that supports both alignment-free and alignment-based motif discovery in the promoter sequences of related species. Putative motifs are exhaustively enumerated as words over the IUPAC alphabet and screened for conservation using the branch length score. Additionally, a confidence score is established in a genome-wide fashion. In order to take advantage of a cloud computing infrastructure, the MapReduce programming model is adopted. The method is applied to four monocotyledon plant species and it is shown that high-scoring motifs are significantly enriched for open chromatin regions in Oryza sativa and for transcription factor binding sites inferred through protein-binding microarrays in O. sativa and Zea mays. Furthermore, the method is shown to recover experimentally profiled ga2ox1-like KN1 binding sites in Z. mays.
- Availability and implementation: BLSSpeller was written in Java. Source code and manual are available at http://bioinformatics.intec.ugent.be/blsspeller
- Contact: Klaas.Vandepoele@psb.vib-ugent.be or jan.fostier@intec.ugent.be
- Supplementary information: Supplementary data are available at Bioinformatics online.
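The abstract's core idea, matching a degenerate IUPAC word against orthologous promoters and scoring its conservation with a branch length score, can be illustrated with a small sketch. This is not BLSSpeller's Java implementation. The function names, the toy promoters, the toy tree and its branch lengths are all hypothetical, and the score is assumed here to be the total branch length of the minimal subtree connecting the species that contain the word, divided by the total tree length. Only the forward strand is scanned, and neither the exhaustive enumeration nor the MapReduce distribution is shown.

```python
import re

# IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Hypothetical orthologous promoters, one per species (toy data).
PROMOTERS = {
    "A": "TTCACGTGAA",
    "B": "GGCATATGCC",
    "C": "AAAAAAAAAA",
    "D": "CCCAGCTGTT",
}

# Hypothetical tree ((A:1,B:1)X:2,(C:3,D:3)Y:2), stored as
# node -> (parent, branch length). The root has no entry.
TREE = {
    "A": ("X", 1), "B": ("X", 1), "X": ("root", 2),
    "C": ("Y", 3), "D": ("Y", 3), "Y": ("root", 2),
}


def motif_to_regex(motif):
    """Translate an IUPAC word into a regular expression."""
    return re.compile("".join("[" + IUPAC[c] + "]" for c in motif))


def species_with_motif(motif, promoters):
    """Return the species whose promoter contains the motif (forward strand only)."""
    pattern = motif_to_regex(motif)
    return {sp for sp, seq in promoters.items() if pattern.search(seq)}


def branch_length_score(species, tree):
    """Relative length of the minimal subtree connecting the given leaves."""
    species = set(species)
    if len(species) < 2:
        return 0.0
    # Count, for every branch, how many selected leaves lie below it.
    below = {}
    for leaf in species:
        node = leaf
        while node in tree:
            below[node] = below.get(node, 0) + 1
            node = tree[node][0]
    # A branch is part of the connecting subtree if it separates the selected
    # leaves, i.e. some but not all of them lie below it.
    connecting = sum(tree[n][1] for n, count in below.items() if count < len(species))
    total = sum(length for _, length in tree.values())
    return connecting / total


conserved_in = species_with_motif("CANNTG", PROMOTERS)
score = branch_length_score(conserved_in, TREE)
print(sorted(conserved_in), score)
```

In this toy setting the word CANNTG occurs in species A, B and D, so the connecting subtree omits only the branch leading to C. A word found in a single species scores zero under this definition.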
- Subjects :
- COMPARATIVE GENOMICS
Statistics and Probability
GENES
DNA, Plant
FACTOR-BINDING SITES
Sequence alignment
MOTIF DISCOVERY
Computational biology
Biology
Biochemistry
Genome
Conserved sequence
Annotation
Nucleotide Motifs
Promoter Regions, Genetic
Molecular Biology
Gene
Conserved Sequence
TOOLS
Comparative genomics
Genetics
Binding Sites
IDENTIFICATION
Base Sequence
Biology and Life Sciences
DNA
Sequence Analysis, DNA
Original Papers
Computer Science Applications
DNA binding site
ALIGNMENT
Computational Mathematics
SYSTEMATIC DISCOVERY
Computational Theory and Mathematics
DNA microarray
PLANT GENOMES
Sequence Analysis
Sequence Alignment
Algorithms
Genome, Plant
Software
Transcription Factors
Details
- ISSN :
- 1367-4811, 1367-4803, and 1460-2059
- Volume :
- 31
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....d1b5591d4100ce942779167410d80bc8