Author: "Sakhanenko, Nikita A." - Searchworks@Jio Institute Digital Library Search Results

1. ApoE Modifier Alleles for Alzheimer's Disease Discovered by Information Theory Dependency Measures: MIST Software Package.

Author: Banman, Andrew, Sakhanenko, Nikita A., Kunert-Graf, James, and Galas, David J.
Subjects: *ALZHEIMER'S disease, *INTEGRATED software, *INFORMATION theory, *APOLIPOPROTEIN E, *ALLELES
Abstract: Information theory-based measures of variable dependency (previously published) have been implemented into a software package, MIST. The design of the software and its potential uses are described, and a demonstration is presented in the discovery of modifier alleles of the ApoE gene in affecting Alzheimer's disease (AD) by analyzing the UK Biobank dataset. The modifier genes uncovered overlap strongly with genes found to be associated with AD. Others include many known to influence AD. We discuss a range of uses of the dependency calculations using MIST that can uncover additional genetic effects in similar complex datasets, like higher degrees of interaction and phenotypic pleiotropy. [ABSTRACT FROM AUTHOR]
Published: 2023
Full Text: View/download PDF

2. Optimized permutation testing for information theoretic measures of multi-gene interactions.

Author: Kunert-Graf, James M., Sakhanenko, Nikita A., and Galas, David J.
Subjects: *PERMUTATIONS, *INFORMATION measurement, *PHENOTYPES, *MULTIVARIABLE testing, *INFORMATION theory
Abstract: Background: Permutation testing is often considered the "gold standard" for multi-test significance analysis, as it is an exact test requiring few assumptions about the distribution being computed. However, it can be computationally very expensive, particularly in its naive form in which the full analysis pipeline is re-run after permuting the phenotype labels. This can become intractable in multi-locus genome-wide association studies (GWAS), in which the number of potential interactions to be tested is combinatorially large. Results: In this paper, we develop an approach for permutation testing in multi-locus GWAS, specifically focusing on SNP–SNP-phenotype interactions using multivariable measures that can be computed from frequency count tables, such as those based in Information Theory. We find that the computational bottleneck in this process is the construction of the count tables themselves, and that this step can be eliminated at each iteration of the permutation testing by transforming the count tables directly. This leads to a speed-up by a factor of over 10^3 for a typical permutation test compared to the naive approach. Additionally, this approach is insensitive to the number of samples, making it suitable for datasets with a large number of samples. Conclusions: The proliferation of large-scale datasets with genotype data for hundreds of thousands of individuals enables new and more powerful approaches for the detection of multi-locus genotype-phenotype interactions. Our approach significantly improves the computational tractability of permutation testing for these studies. Moreover, our approach is insensitive to the large number of samples in these modern datasets. The code for performing these computations and replicating the figures in this paper is freely available at https://github.com/kunert/permute-counts. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

3. Computational Inference Software for Tetrad Assembly from Randomly Arrayed Yeast Colonies.

Author: Sakhanenko, Nikita A., Cromie, Gareth A., Dudley, Aimée M., and Galas, David J.
Subjects: *NUCLEOTIDE sequence, *CENTROMERE, *COMPUTER software, *COLONIES, *YEAST
Abstract: We describe an information-theory-based method and associated software for computationally identifying sister spores derived from the same meiotic tetrad. The method exploits specific DNA sequence features of tetrads that result from meiotic centromere and allele segregation patterns. Because the method uses only the genomic sequence, it alleviates the need for tetrad-specific barcodes or other genetic modifications to the strains. Using this method, strains derived from randomly arrayed spores can be efficiently grouped back into tetrads. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

4. The Information Content of Discrete Functions and Their Application in Genetic Data Analysis.

Author: Sakhanenko, Nikita A., Kunert-Graf, James, and Galas, David J.
Subjects: *DEPENDENT variables, *DATA analysis, *MATHEMATICAL variables, *DISCRETE geometry, *INFORMATION theory
Abstract: The complex of central problems in data analysis consists of three components: (1) detecting the dependence of variables using quantitative measures, (2) defining the significance of these dependence measures, and (3) inferring the functional relationships among dependent variables. We have argued previously that an information theory approach allows separation of the detection problem from the inference of functional form problem. We approach here the third component of inferring functional forms based on information encoded in the functions. We present here a direct method for classifying the functional forms of discrete functions of three variables represented in data sets. Discrete variables are frequently encountered in data analysis, both as the result of inherently categorical variables and from the binning of continuous numerical variables into discrete alphabets of values. The fundamental question of how much information is contained in a given function is answered for these discrete functions, and their surprisingly complex relationships are illustrated. The all-important effect of noise on the inference of function classes is found to be highly heterogeneous and reveals some unexpected patterns. We apply this classification approach to an important area of biological data analysis: that of inference of genetic interactions. Genetic analysis provides a rich source of real and complex biological data analysis problems, and our general methods provide an analytical basis and tools for characterizing genetic problems and for analyzing genetic data. We illustrate the functional description and the classes of a number of common genetic interaction modes and also show how different modes vary widely in their sensitivity to noise. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

5. Complexity and Vulnerability Analysis of the C. Elegans Gap Junction Connectome.

Author: Kunert-Graf, James M., Sakhanenko, Nikita A., and Galas, David J.
Subjects: *COMPUTATIONAL complexity, *GAP junctions (Cell biology), *CAENORHABDITIS elegans, *SOMATIC cells, *INTERNEURONS
Abstract: We apply a network complexity measure to the gap junction network of the somatic nervous system of C. elegans and find that it possesses a much higher complexity than we might expect from its degree distribution alone. This "excess" complexity is seen to be caused by a relatively small set of connections involving command interneurons. We describe a method which progressively deletes these "complexity-causing" connections, and find that when these are eliminated, the network becomes significantly less complex than a random network. Furthermore, this result implicates the previously-identified set of neurons from the synaptic network's "rich club" as the structural components encoding the network's excess complexity. This study and our method thus support a view of the gap junction Connectome as consisting of a rather low-complexity network component whose symmetry is broken by the unique connectivities of singularly important rich club neurons, sharply increasing the complexity of the network. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

6. Biological Data Analysis as an Information Theory Problem: Multivariable Dependence Measures and the Shadows Algorithm.

Author: Sakhanenko, Nikita A. and Galas, David J.
Subjects: *COMPUTATIONAL biology, *SYSTEMS biology, *BIOINFORMATICS, *COMPUTER simulation of biological systems, *APPLIED mathematics
Abstract: Information theory is valuable in multiple-variable analysis for being model-free and nonparametric, and for the modest sensitivity to undersampling. We previously introduced a general approach to finding multiple dependencies that provides accurate measures of levels of dependency for subsets of variables in a data set, which is significantly nonzero only if the subset of variables is collectively dependent. This is useful, however, only if we can avoid a combinatorial explosion of calculations for increasing numbers of variables. The proposed dependence measure for a subset of variables τ, the differential interaction information Δ(τ), has the property that for subsets of τ some of the factors of Δ(τ) are significantly nonzero, when the full dependence includes more variables. We use this property to suppress the combinatorial explosion by following the 'shadows' of multivariable dependency on smaller subsets. Rather than calculating the marginal entropies of all subsets at each degree level, we need to consider only calculations for subsets of variables with appropriate 'shadows.' The number of calculations for n variables at a degree level of d therefore grows at a much smaller rate than the binomial coefficient (n, d), but depends on the parameters of the 'shadows' calculation. This approach, avoiding a combinatorial explosion, enables the use of our multivariable measures on very large data sets. We demonstrate this method on simulated data sets, and characterize the effects of noise and sample numbers. In addition, we analyze a data set of a few thousand mutant yeast strains interacting with a few thousand chemical compounds. [ABSTRACT FROM AUTHOR]
Published: 2015
Full Text: View/download PDF

7. Describing the Complexity of Systems: Multivariable 'Set Complexity' and the Information Basis of Systems Biology.

Author: Galas, David J., Sakhanenko, Nikita A., Skupin, Alexander, and Ignac, Tomasz
Subjects: *SYSTEMS biology, *GENE regulatory networks, *SET theory, *ENTROPY (Information theory), *BIOLOGICAL systems, *HYPERGRAPHS
Abstract: Context dependence is central to the description of complexity. Keying on the pairwise definition of 'set complexity,' we use an information theory approach to formulate general measures of systems complexity. We examine the properties of multivariable dependency starting with the concept of interaction information. We then present a new measure for unbiased detection of multivariable dependency, 'differential interaction information.' This quantity for two variables reduces to the pairwise 'set complexity' previously proposed as a context-dependent measure of information in biological systems. We generalize it here to an arbitrary number of variables. Critical limiting properties of the 'differential interaction information' are key to the generalization. This measure extends previous ideas about biological information and provides a more sophisticated basis for the study of complexity. The properties of 'differential interaction information' also suggest new approaches to data analysis. Given a data set of system measurements, differential interaction information can provide a measure of collective dependence, which can be represented in hypergraphs describing complex system interaction patterns. We investigate this kind of analysis using simulated data sets. The conjoining of a generalized set complexity measure, multivariable dependency analysis, and hypergraphs is our central result. While our focus is on complex biological systems, our results are applicable to any complex system. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

8. Probabilistic Logic Methods and Some Applications to Biology and Medicine.

Author: Sakhanenko, Nikita A. and Galas, David J.
Subjects: *COMPUTATIONAL biology, *HIDDEN Markov models, *BIOINFORMATICS, *PROBABILISTIC inference, *MACHINE learning, *PROBABILITY theory
Abstract: For the computational analysis of biological problems (analyzing data, inferring networks and complex models, and estimating model parameters), it is common to use a range of methods based on probabilistic logic constructions, sometimes collectively called machine learning methods. Probabilistic modeling methods such as Bayesian Networks (BN) fall into this class, as do Hierarchical Bayesian Networks (HBN), Probabilistic Boolean Networks (PBN), Hidden Markov Models (HMM), and Markov Logic Networks (MLN). In this review, we describe the most general of these (MLN), and show how the above-mentioned methods are related to MLN and one another by the imposition of constraints and restrictions. This approach allows us to illustrate a broad landscape of constructions and methods, and describe some of the attendant strengths, weaknesses, and constraints of many of these methods. We then provide some examples of their applications to problems in biology and medicine, with an emphasis on genetics. The key concepts needed to picture this landscape of methods are the ideas of probabilistic graphical models, the structures of the graphs, and the scope of the logical language repertoire used (from First-Order Logic [FOL] to Boolean logic). These concepts are interlinked and together define the nature of each of the probabilistic logic methods. Finally, we discuss the initial applications of MLN to genetics, show the relationship to less general methods like BN, and then mention several examples where such methods could be effective in new applications to specific biological and medical problems. [ABSTRACT FROM AUTHOR]
Published: 2012
Full Text: View/download PDF

9. Predictions and Diagnostics in Experimental Data Using Support Vector Regression.

Author: Sakhanenko, Nikita A., Luger, George F., Makaruk, Hanna E., and Holtkamp, David B.
Subjects: *LOGIC machines, *MACHINE theory, *ARTIFICIAL intelligence, *ELECTRONIC data processing, *PHYSICAL sciences, *PHYSICS
Abstract: In this paper we present a novel support vector machine (SVM) based framework for prognosis and diagnosis. We apply the framework to sparse physics data sets, although the method can easily be extended to other domains. Experiments in applied fields, such as experimental physics, are often complicated and expensive. As a result, experimentalists are unable to conduct as many experiments as they would like, leading to very unbalanced data sets that can be dense in one dimension and very sparse in others. Our method predicts the data values along the sparse dimension providing more information to researchers. Often experiments deviate from expectations due to small misalignments in initial parameters. Our method detects these outlier experiments. [ABSTRACT FROM AUTHOR]
Published: 2009
Full Text: View/download PDF

10. Shock Physics Data Reconstruction Using Support Vector Regression.

Author: Sakhanenko, Nikita A., Luger, George F., Makaruk, Hanna E., Aubrey, Joysree B., and Holtkamp, David B.
Subjects: *SHOCK waves, *REGRESSION analysis, *EXTRAPOLATION, *BLAST effect, *PHYSICS experiments
Abstract: This paper considers a set of shock physics experiments that investigate how materials respond to the extremes of deformation, pressure, and temperature when exposed to shock waves. Due to the complexity and the cost of these tests, the available experimental data set is often very sparse. A support vector machine (SVM) technique for regression is used for data estimation of velocity measurements from the underlying experiments. Because of good generalization performance, the SVM method successfully interpolates the experimental data. The analysis of the resulting velocity surface provides more information on the physical phenomena of the experiment. Additionally, the estimated data can be used to identify outlier data sets, as well as to increase the understanding of the other data from the experiment. [ABSTRACT FROM AUTHOR]
Published: 2006
Full Text: View/download PDF

11. Partial Information Decomposition and the Information Delta: A Geometric Unification Disentangling Non-Pairwise Information.

Author: Kunert-Graf, James, Sakhanenko, Nikita, and Galas, David
Subjects: *INFORMATION theory, *CONSTRAINED optimization, *LINKAGE disequilibrium, *OPEN-ended questions, *GENERALIZATION, *NEUROSCIENCES
Abstract: Information theory provides robust measures of multivariable interdependence, but classically does little to characterize the multivariable relationships it detects. The Partial Information Decomposition (PID) characterizes the mutual information between variables by decomposing it into unique, redundant, and synergistic components. This has been usefully applied, particularly in neuroscience, but there is currently no generally accepted method for its computation. Independently, the Information Delta framework characterizes non-pairwise dependencies in genetic datasets. This framework has developed an intuitive geometric interpretation for how discrete functions encode information, but lacks some important generalizations. This paper shows that the PID and Delta frameworks are largely equivalent. We equate their key expressions, allowing for results in one framework to apply towards open questions in the other. For example, we find that the approach of Bertschinger et al. is useful for the open Information Delta question of how to deal with linkage disequilibrium. We also show how PID solutions can be mapped onto the space of delta measures. Using Bertschinger et al. as an example solution, we identify a specific plane in delta-space on which this approach's optimization is constrained, and compute it for all possible three-variable discrete functions of a three-letter alphabet. This yields a clear geometric picture of how a given solution decomposes information. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

12. Symmetries among Multivariate Information Measures Explored Using Möbius Operators.

Author: Galas, David J. and Sakhanenko, Nikita A.
Subjects: *LATTICE theory, *OPERATOR algebras, *MATHEMATICAL functions, *FINITE element method, *MATHEMATICAL variables
Abstract: Relations between common information measures include the duality relations based on Möbius inversion on lattices, which are the direct consequence of the symmetries of the lattices of the sets of variables (subsets ordered by inclusion). In this paper we use the lattice and functional symmetries to provide a unifying formalism that reveals some new relations and systematizes the symmetries of the information functions. To our knowledge, this is the first systematic examination of the full range of relationships of this class of functions. We define operators on functions on these lattices based on the Möbius inversions that map functions into one another, which we call Möbius operators, and show that they form a simple group isomorphic to the symmetric group S3. Relations among the set of functions on the lattice are transparently expressed in terms of the operator algebra, and, when applied to the information measures, can be used to derive a wide range of relationships among diverse information measures. The Möbius operator algebra is then naturally generalized which yields an even wider range of new relationships. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

13. Multivariate Analysis of Data Sets with Missing Values: An Information Theory-Based Reliability Function.

Author: Uechi, Lisa, Galas, David J., and Sakhanenko, Nikita A.
Subjects: *BIOLOGICAL systems, *INFORMATION theory, *MISSING data (Statistics), *NUMERICAL analysis, *METAPHOR
Abstract: Missing values in complex biological data sets have significant impacts on our ability to correctly detect and quantify interactions in biological systems and to infer relationships accurately. In this article, we propose a useful metaphor to show that information theory measures, such as mutual information and interaction information, can be employed directly for evaluating multivariable dependencies even if data contain some missing values. The metaphor is that of thinking of variable dependencies as information channels between and among variables. In this view, missing data can be thought of as noise that reduces the channel capacity in predictable ways. We extract the available information in the data even if there are missing values and use the notion of channel capacity to assess the reliability of the result. This avoids the common practice—in the absence of prior knowledge of random imputation—of eliminating samples entirely, thus losing the information they can provide. We show how this reliability function can be implemented for pairs of variables, and generalize it for an arbitrary number of variables. Illustrations of the reliability functions for several cases are provided using simulated data. [ABSTRACT FROM AUTHOR]
Published: 2019
Full Text: View/download PDF

14. Discovering Pair-Wise Genetic Interactions: An Information Theory-Based Approach.

Author: Ignac, Tomasz M., Skupin, Alexander, Sakhanenko, Nikita A., and Galas, David J.
Subjects: *PHENOTYPES, *INFORMATION theory, *ENVIRONMENTAL health, *HUMAN genetic variation, *LIPIDS, *LABORATORY mice
Abstract: Phenotypic variation, including that which underlies health and disease in humans, results in part from multiple interactions among both genetic variation and environmental factors. While diseases or phenotypes caused by single gene variants can be identified by established association methods and family-based approaches, complex phenotypic traits resulting from multi-gene interactions remain very difficult to characterize. Here we describe a new method based on information theory, and demonstrate how it improves on previous approaches to identifying genetic interactions, including both synthetic and modifier kinds of interactions. We apply our measure, called interaction distance, to previously analyzed data sets of yeast sporulation efficiency, lipid related mouse data and several human disease models to characterize the method. We show how the interaction distance can reveal novel gene interaction candidates in experimental and simulated data sets, and outperforms other measures in several circumstances. The method also allows us to optimize case/control sample composition for clinical studies. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

15. Toward an Information Theory of Quantitative Genetics.

Author: Galas, David J., Kunert-Graf, James, Uechi, Lisa, and Sakhanenko, Nikita A.
Subjects: *QUANTITATIVE genetics, *BASE pairs, *PHENOTYPES, *INFORMATION measurement, *INFORMATION theory, *HERITABILITY, *EUGENICS
Abstract: Quantitative genetics has evolved dramatically in the past century, and the proliferation of genetic data, in quantity as well as type, enables the characterization of complex interactions and mechanisms beyond the scope of its theoretical foundations. In this article, we argue that revisiting the framework for analysis is important and we begin to lay the foundations of an alternative formulation of quantitative genetics based on information theory. Information theory can provide sensitive and unbiased measures of statistical dependencies among variables, and it provides a natural mathematical language for an alternative view of quantitative genetics. In the previous work, we examined the information content of discrete functions and applied this approach and methods to the analysis of genetic data. In this article, we present a framework built around a set of relationships that both unifies the information measures for the discrete functions and uses them to express key quantitative genetic relationships. Information theory measures of variable interdependency are used to identify significant interactions, and a general approach is described for inferring functional relationships in genotype and phenotype data. We present information-based measures of the genetic quantities: penetrance, heritability, and degrees of statistical epistasis. Our scope here includes the consideration of both two- and three-variable dependencies and independently segregating variants, which captures additive effects, genetic interactions, and two-phenotype pleiotropy. This formalism and the theoretical approach naturally apply to higher multivariable interactions and complex dependencies, and can be adapted to account for population structure, linkage, and nonrandomly segregating markers. This article thus focuses on presenting the initial groundwork for a full formulation of quantitative genetics based on information theory. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

16. Children's erythrocyte fatty acids are associated with the risk of islet autoimmunity.

Author: Niinistö, Sari, Erlund, Iris, Lee, Hye-Seung, Uusitalo, Ulla, Salminen, Irma, Aronsson, Carin Andrén, Parikh, Hemang M., Liu, Xiang, Hummel, Sandra, Toppari, Jorma, She, Jin-Xiong, Lernmark, Åke, Ziegler, Annette G., Rewers, Marian, Akolkar, Beena, Krischer, Jeffrey P., Galas, David, Das, Siba, Sakhanenko, Nikita, and Rich, Stephen S.
Subjects: *FATTY acids, *AUTOIMMUNITY, *CHILDREN'S health, *EICOSAPENTAENOIC acid, *LINOLEIC acid
Abstract: Our aim was to investigate the associations between erythrocyte fatty acids and the risk of islet autoimmunity in children. The Environmental Determinants of Diabetes in the Young Study (TEDDY) is a longitudinal cohort study of children at high genetic risk for type 1 diabetes (n = 8676) born between 2004 and 2010 in the U.S., Finland, Sweden, and Germany. A nested case–control design comprised 398 cases with islet autoimmunity and 1178 sero-negative controls matched for clinical site, family history, and gender. Fatty acid composition was measured in erythrocytes collected at the age of 3, 6, and 12 months and then annually up to 6 years of age. Conditional logistic regression models were adjusted for HLA risk genotype, ancestry, and weight z-score. Higher eicosapentaenoic and docosapentaenoic acid (n − 3 polyunsaturated fatty acids) levels during infancy and conjugated linoleic acid after infancy were associated with a lower risk of islet autoimmunity. Furthermore, higher levels of some even-chain saturated (SFA) and monounsaturated fatty acids (MUFA) were associated with increased risk. Fatty acid status in early life may signal the risk for islet autoimmunity: n − 3 fatty acids in particular may be protective, while increased levels of some SFAs and MUFAs may precede islet autoimmunity. [ABSTRACT FROM AUTHOR]
Published: 2021
Full Text: View/download PDF

17. Complex genetic dependencies among growth and neurological phenotypes in healthy children: Towards deciphering developmental mechanisms.

Author: Uechi, Lisa, Jalali, Mahjoubeh, Wilbur, Jayson D., French, Jonathan L., Jumbe, N. L., Meaney, Michael J., Gluckman, Peter D., Karnani, Neerja, Sakhanenko, Nikita A., and Galas, David J.
Subjects: *PHENOTYPES, *BIOLOGICAL networks, *DNA, *PLACENTA, *INFANTS
Abstract: The genetic mechanisms of childhood development in its many facets remain largely undeciphered. In the population of healthy infants studied in the Growing Up in Singapore Towards Healthy Outcomes (GUSTO) program, we have identified a range of dependencies among the observed phenotypes of fetal and early childhood growth, neurological development, and a number of genetic variants. We have quantified these dependencies using our information theory-based methods. The genetic variants show dependencies with single phenotypes as well as pleiotropic effects on more than one phenotype and thereby point to a large number of brain-specific and brain-expressed gene candidates. These dependencies provide a basis for connecting a range of variants with a spectrum of phenotypes (pleiotropy) as well as with each other. A broad survey of known regulatory expression characteristics, and other function-related information from the literature for these sets of candidate genes allowed us to assemble an integrated body of evidence, including a partial regulatory network, that points towards the biological basis of these general dependencies. Notable among the implicated loci are RAB11FIP4 (next to NF1), MTMR7 and PLD5, all highly expressed in the brain; DNMT1 (DNA methyl transferase), highly expressed in the placenta; and PPP1R12B and DMD (dystrophin), known to be important growth and development genes. While we cannot specify and decipher the mechanisms responsible for the phenotypes in this study, a number of connections for further investigation of fetal and early childhood growth and neurological development are indicated. These results and this approach open the door to new explorations of early human development. [ABSTRACT FROM AUTHOR]
Published: 2020
Full Text: View/download PDF

18. Expansion of the Kullback-Leibler Divergence, and a New Class of Information Metrics.

Author: Galas, David J., Dewey, Gregory, Kunert-Graf, James, and Sakhanenko, Nikita A.
Subjects: *PROBABILITY density function, *SYSTEM analysis, *INFORMATION processing, *INFORMATION theory, *DATA analysis
Abstract: Inferring and comparing complex, multivariable probability density functions is fundamental to problems in several fields, including probabilistic learning, network theory, and data analysis. Classification and prediction are the two faces of this class of problem. This study takes an approach that simplifies many aspects of these problems by presenting a structured, series expansion of the Kullback-Leibler divergence, a function central to information theory, and devising a distance metric based on this divergence. Using the Möbius inversion duality between multivariable entropies and multivariable interaction information, we express the divergence as an additive series in the number of interacting variables, which provides a restricted and simplified set of distributions to use as approximation and with which to model data. Truncations of this series yield approximations based on the number of interacting variables. The first few terms of the expansion-truncation are illustrated and shown to lead naturally to familiar approximations, including the well-known Kirkwood superposition approximation. Truncation can also induce a simple relation between the multi-information and the interaction information. A measure of distance between distributions, based on Kullback-Leibler divergence, is then described and shown to be a true metric if properly restricted. The expansion is shown to generate a hierarchy of metrics and connects this work to information geometry formalisms. An example of the application of these metrics to a graph comparison problem is given that shows that the formalism can be applied to a wide range of network problems and provides a general approach for systematic approximations in numbers of interactions or connections, as well as a related quantitative metric. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF

19. An Evaluation of High-Throughput Approaches to QTL Mapping in Saccharomyces cerevisiae.

Author: Wilkening, Stefan, Lin, Gen, Fritsch, Emilie S., Tekkedil, Manu M., Anders, Simon, Kuehn, Raquel, Nguyen, Michelle, Aiyar, Raeka S., Proctor, Michael, Sakhanenko, Nikita A., Galas, David J., Gagneur, Julien, Deutschbauer, Adam, and Steinmetz, Lars M.
Subjects: *SACCHAROMYCES cerevisiae, *MEIOSIS, *PHENOTYPES, *GENETIC polymorphism research, *GENETIC mutation
Abstract: Dissecting the molecular basis of quantitative traits is a significant challenge and is essential for understanding complex diseases. Even in model organisms, precisely determining causative genes and their interactions has remained elusive, due in part to difficulty in narrowing intervals to single genes and in detecting epistasis or linked quantitative trait loci. These difficulties are exacerbated by limitations in experimental design, such as low numbers of analyzed individuals or of polymorphisms between parental genomes. We address these challenges by applying three independent high-throughput approaches for QTL mapping to map the genetic variants underlying 11 phenotypes in two genetically distant Saccharomyces cerevisiae strains, namely (1) individual analysis of >700 meiotic segregants, (2) bulk segregant analysis, and (3) reciprocal hemizygosity scanning, a new genome-wide method that we developed. We reveal differences in the performance of each approach and, by combining them, identify eight polymorphic genes that affect eight different phenotypes: colony shape, flocculation, growth on two nonfermentable carbon sources, and resistance to two drugs, salt, and high temperature. Our results demonstrate the power of individual segregant analysis to dissect QTL and address the underestimated contribution of interactions between variants. We also reveal confounding factors like mutations and aneuploidy in pooled approaches, providing valuable lessons for future designs of complex trait mapping studies. [ABSTRACT FROM AUTHOR]
Published: 2014
Full Text: View/download PDF

20. A systems-biology approach to modular genetic complexity.

Author: Carter, Gregory W., Rush, Cynthia G., Uygun, Filiz, Sakhanenko, Nikita A., Galas, David J., and Galitski, Timothy
Subjects: *GENOTYPE-environment interaction, *GENETIC research, *HUMAN biology, *BIOLOGICAL variation, *BIOLOGICAL evolution
Abstract: Multiple high-throughput genetic interaction studies have provided substantial evidence of modularity in genetic interaction networks. However, the correspondence between these network modules and specific pathways of information flow is often ambiguous. Genetic interaction and molecular interaction analyses have not generated large-scale maps comprising multiple clearly delineated linear pathways. We seek to clarify the situation by discerning the difference between genetic modules and classical pathways. We review a method to optimize the discovery of biologically meaningful genetic modules based on a previously described context-dependent information measure to obtain maximally informative networks. We compare the results of this method with the established measures of network clustering and find that it balances global and local clustering information in networks. We further discuss the consequences for genetic interaction networks and propose a framework for the analysis of genetic modularity. [ABSTRACT FROM AUTHOR]
Published: 2010
Full Text: View/download PDF

20 results on '"Sakhanenko, Nikita A."'
