Factors That Influence the Choice of Markov Model Order in Discriminating DNA Sequences from Different Sources

Authors :
Ravi S. Pandey
Rajeev K. Azad
Source :
OMICS: A Journal of Integrative Biology. 26:348-355
Publication Year :
2022
Publisher :
Mary Ann Liebert Inc, 2022.

Abstract

Markov models have frequently been used in genetic sequence analysis. The number of parameters of a Markov model increases exponentially with model order, so it is often recommended that the order be chosen based on the size of the data being modeled: lower orders for small datasets and higher orders for large ones. Approaches based on model selection criteria have also been proposed. An important problem in microbiology and evolutionary biology is to decipher chimeric genomes of microbes, in particular to identify segments of distinct ancestries in genomes and to reconstruct the plausible evolutionary scenarios that might have shaped chimeric genomes in the microbial world. In this study, we assessed a Markov model-based segmentation method for its ability to detect compositionally disparate segments in chimeric sequence constructs as a function of model order, sequence length, and phylogenetic divergence. Our results show that the choice of Markov model order depends on both sequence size and composition. Higher order Markov models were found to be more effective in delineating sequence segments arising from closely related organisms in longer constructs; on the other hand, lower order Markov models were found to be more appropriate in delineating sequence segments arising from distantly related organisms in shorter constructs. These findings are important and timely, with broad implications in fields such as epidemiology, which has to deal with the emergence of novel pathogenic chimeras that arise by foreign DNA acquisition, and ecology, where chimeric structures may arise in various ecosystems, necessitating more robust approaches for their deconstruction and interpretation.
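
To make the abstract's point about exponential parameter growth concrete, the following is a minimal Python sketch and not the authors' segmentation method. It assumes a four-letter DNA alphabet, so an order-k model has 4^k contexts with 3 free transition probabilities each. The function names `free_parameters` and `fit_markov`, and the additive pseudocount used for smoothing, are illustrative assumptions that do not come from the paper.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"  # assumed DNA alphabet; ambiguous bases are ignored


def free_parameters(order: int, alphabet_size: int = 4) -> int:
    """Free transition probabilities in a Markov model of the given order."""
    # alphabet_size**order contexts, each with alphabet_size - 1 free values
    return (alphabet_size ** order) * (alphabet_size - 1)


def fit_markov(seq: str, order: int, pseudocount: float = 1.0) -> dict:
    """Estimate P(next base | preceding `order` bases) with additive smoothing.

    Hypothetical helper for illustration only.
    """
    # Count every overlapping word of length order + 1 in the sequence.
    counts = Counter(seq[i:i + order + 1] for i in range(len(seq) - order))
    model = {}
    for ctx in map("".join, product(ALPHABET, repeat=order)):
        total = sum(counts[ctx + b] for b in ALPHABET) + pseudocount * len(ALPHABET)
        model[ctx] = {b: (counts[ctx + b] + pseudocount) / total for b in ALPHABET}
    return model


if __name__ == "__main__":
    # Parameter count grows by a factor of 4 with each increase in order.
    for k in range(6):
        print(k, free_parameters(k))
```

A short sequence supplies few observations per context once the order is high, so most of the estimated probabilities are then dominated by the pseudocount. This is one way to see why lower orders are usually recommended for small datasets.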

Details

ISSN :
15578100
Volume :
26
Database :
OpenAIRE
Journal :
OMICS: A Journal of Integrative Biology
Accession number :
edsair.doi.dedup.....a0e3e03ecdcbc6b77a48ce38930761f1
Full Text :
https://doi.org/10.1089/omi.2022.0043