1. Improved Methods for Template-Matching in Electron-Density Maps Using Spherical Harmonics
- Author
-
Frank DiMaio, Ameet Soni, George N. Phillips, and Jude W. Shavlik
- Subjects
Discrete mathematics ,Combinatorics ,Sequence ,symbols.namesake ,Markov chain ,Variable-order Markov model ,String (computer science) ,symbols ,Markov process ,Order (ring theory) ,Additive Markov chain ,Omega ,Mathematics - Abstract
Genomic sequences display characteristic features at various scales ranging from oligonucleotide frequencies to large organizational units such as genes. The generation of such a sequence, defined as a string over the alphabet SigmaDNA={A C, T, G}, can be approximated by a formal machine, a Markov chain having strings as states, whose parameters lend unique characteristics to the sequence. We present a formal mathematical framework that analyzes this approximation in terms of transition probabilities of words at various scales. Within this framework, we present an algorithm that estimates the order of the Markov chain of order omega hypothesized to have generated a sequence S at hand. Consider the probability of transition from string alpha to string gamma both of length omega, computed using both order w and order omega-1 Markov chains. The expected difference of the two probabilities thus obtained, is zero, and demonstrates a sharp positive transition for values of omega between omega and omega+1. Both mathematical and experimental results are obtained that explain the general behavior of the algorithm.
- Published
- 2007