25 results on '"Genetics--Data processing"'
Search Results
2. Evolution's Clinical Guidebook : Translating Ancient Genes Into Precision Medicine
- Author
-
Jules J. Berman
- Subjects
- Precision medicine, Evolution (Biology), Genetics--Data processing, Bioinformatics
- Abstract
Evolution's Clinical Guidebook: Translating Ancient Genes into Precision Medicine demonstrates, through well-documented examples, how an understanding of the phylogenetic ancestry of humans allows us to make sense of the flood of genetic data streaming from modern laboratories and how it can lead us to new ways to prevent, diagnose and treat diseases. Topics include evolution and the human genome, meiosis and other recombinant events, embryology, speciation, phylogeny, rare and common diseases, and the evolution of aging. This book is a valuable source for bioinformaticians and those in the biomedical field who need knowledge, down to the gene level, to fully comprehend currently available data. The book offers an innovative approach, focusing on how disease-associated pathways evolved; explains how the fields of phylogeny and embryology have become closely tied to the fields of genetics and bioinformatics; and demonstrates how students and biomedical professionals can apply the knowledge obtained in this book to the theory and practice of precision medicine.
- Published
- 2019
3. Bioinformatics and Functional Genomics
- Author
-
Jonathan Pevsner
- Subjects
- Genetics--Data processing, Genetics--Technique, Genomics, Bioinformatics, Proteomics
- Abstract
The bestselling introduction to bioinformatics and genomics, now in its third edition. Well received in its previous editions, Bioinformatics and Functional Genomics offers the most broad-based introduction to this explosive new discipline. Now in a thoroughly updated and expanded third edition, it continues to be the go-to source for students and professionals involved in biomedical research. This book provides up-to-the-minute coverage of the fields of bioinformatics and genomics. Features new to this edition include: extensive revisions and a slight reorder of chapters for a more effective organization; a brand new chapter on next-generation sequencing; an expanded companion website, updated as and when new information becomes available; and greater emphasis on a computational approach, with clear guidance on how software tools work and introductions to the use of command-line tools such as software for next-generation sequence analysis, the R programming language, and NCBI search utilities. The book is complemented by lavish illustrations and more than 500 figures and tables, many newly created for the third edition to enhance clarity and understanding. Each chapter includes learning objectives, a problem set, a pitfalls section, boxes explaining key techniques and mathematics/statistics principles, a summary, recommended reading, and a list of freely available software. Readers may visit a related Web page for supplemental information such as PowerPoints and audiovisual files of lectures, and videocasts of how to perform many basic operations: www.wiley.com/go/pevsnerbioinformatics. Bioinformatics and Functional Genomics, Third Edition serves as an excellent single-source textbook for advanced undergraduate and beginning graduate-level courses in the biological sciences and computer sciences.
It is also an indispensable resource for biologists in a broad variety of disciplines who use the tools of bioinformatics and genomics to study particular research problems; bioinformaticists and computer scientists who develop computer algorithms and databases; and medical researchers and clinicians who want to understand the genomic basis of viral, bacterial, parasitic, or other diseases.
- Published
- 2015
4. Gene Network Inference : Verification of Methods for Systems Genetics Data
- Author
-
Alberto Fuente
- Subjects
- Computational biology, Algorithms, Bioinformatics, Genetics--Data processing, Systems biology, Computer algorithms
- Abstract
This book presents recent methods for Systems Genetics (SG) data analysis, applying them to a suite of simulated SG benchmark datasets. Each of the chapter authors received the same datasets to evaluate the performance of their method to better understand which algorithms are most useful for obtaining reliable models from SG datasets. The knowledge gained from this benchmarking study will ultimately allow these algorithms to be used with confidence for SG studies e.g. of complex human diseases or food crop improvement. The book is primarily intended for researchers with a background in the life sciences, not for computer scientists or statisticians.
- Published
- 2013
5. The longitudinal Israeli study of twins (LIST) - an integrative view of social development
- Author
-
Avinun, Reut and Knafo, Ariel
- Published
- 2013
6. Bioinformatics for Geneticists : A Bioinformatics Primer for the Analysis of Genetic Data
- Author
-
Michael R. Barnes
- Subjects
- Computational biology, Computer software, Genetics--Data processing, Bioinformatics, Computer programs
- Abstract
Praise from the reviews: "This book may really help to get geneticists and bioinformaticians on 'speaking-terms' ... contains some essential reading for almost any person working in the field of molecular genetics." EUROPEAN JOURNAL OF HUMAN GENETICS. "... an excellent resource ... this book should ensure that any researcher's skill base is maintained." GENETICAL RESEARCH. "... one of the best available and most accessible texts on bioinformatics and genetics in the postgenome age ... The writing is clear, with succinct subsections within each chapter ... Without reservation, I endorse this text as the best resource I've encountered that neatly introduces and summarizes many points I've learned through years of experience. The gems of truth found in this book will serve well those who wish to apply bioinformatics in their daily work, as well as help them advise others in this capacity." CIRCULATION: CARDIOVASCULAR GENETICS. A fully revised version of the successful First Edition, this one-stop reference book enables all geneticists to improve the efficiency of their research. The study of human genetics is moving into a challenging new era. New technologies and data resources such as the HapMap are enabling genome-wide studies, which could potentially identify most common genetic determinants of human health, disease and drug response. With these tremendous new data resources at hand, care is required in their use more than ever. Faced with the sheer volume of genetics and genomic data, bioinformatics is essential to avoid drowning true signal in noise.
Considering these challenges, Bioinformatics for Geneticists, Second Edition works at multiple levels: firstly, for the occasional user who simply wants to extract or analyse specific data; secondly, at the level of the advanced user, providing explanations of how and why a tool works and how it can be used to greatest effect. Finally, experts from fields allied to genetics give insight into the best genomics tools and data to enhance a genetic experiment. Hallmark features of the Second Edition: illustrates the value of bioinformatics as a constantly evolving avenue into novel approaches to study genetics; the only book specifically addressing the bioinformatics needs of geneticists; more than 50% of chapters are completely new contributions; dramatically revised content in core areas of gene and genomic characterisation, pathway analysis, SNP functional analysis and statistical genetics; focused on freely available tools and web-based approaches to bioinformatics analysis, suitable for novices and experienced researchers alike. Bioinformatics for Geneticists, Second Edition describes the key bioinformatics and genetic analysis processes that are needed to identify human genetic determinants. The book is based upon the combined practical experience of domain experts from academic and industrial research environments and is of interest to a broad audience, including students, researchers and clinicians working in the human genetics domain.
- Published
- 2007
7. False Disease Region Identification from Identity-by-descent Haplotype Sharing in the Presence of Phenocopies
- Author
-
Macgregor, Stuart, Knott, Sara A, and Visscher, Peter M
- Published
- 2006
8. A Dynamic Query System for Supporting Phenotype Mining in Genetic Studies
- Author
-
Nuzzo, Angelo, Segagni, Daniele, Milani, Giuseppe, Rognoni, Carla, and Bellazzi, Riccardo (in Medinfo 2007: Proceedings of the 12th World Congress on Health (Medical) Informatics; Building Sustainable Health Systems)
- Published
- 2007
9. Genetic Databases : Socio-Ethical Issues in the Collection and Use of DNA
- Author
-
Oonagh Corrigan and Richard Tutton
- Subjects
- Genetics--Data processing, Databases--Citizen participation, Amino acid sequence--Databases, DNA, Nucleotide sequence--Databases, Computational biology, Bioinformatics, Medical ethics
- Abstract
Genetic Databases offers a timely analysis of the underlying tensions, contradictions and limitations of the current regulatory frameworks for, and policy debates about, genetic databases. Drawing on original empirical research and theoretical debates in the fields of sociology, anthropology and legal studies, the contributors to this book challenge the prevailing orthodoxy of informed consent and explore the relationship between personal privacy and the public good. They also consider the multiple meanings attached to human tissue and the role of public consultations and commercial involvement in the creation and use of genetic databases. The authors argue that policy and regulatory frameworks produce a representation of participation that is often at odds with the experiences and understandings of those taking part. The findings present a serious challenge for public policy to provide mechanisms to safeguard the welfare of individuals participating in genetic databases.
- Published
- 2004
10. Bioinformatics and Functional Genomics
- Author
-
Jonathan Pevsner
- Subjects
- Human genome, Genetics--Technique, Bioinformatics, Genomics, Genetics--Data processing, Proteomics
- Abstract
Wiley is proud to announce the publication of the first ever broad-based textbook introduction to Bioinformatics and Functional Genomics by a trained biologist, experienced researcher, and award-winning instructor. In this new text, author Jonathan Pevsner, winner of the 2001 Johns Hopkins University 'Teacher of the Year' award, explains problem-solving with bioinformatic approaches through real examples such as breast cancer, HIV-1, and retinal-binding protein. His book includes 375 figures and over 170 tables. Each chapter includes: Problems, discussion of Pitfalls, Boxes explaining key techniques and math/stats principles, Summary, Recommended Reading list, and URLs for freely available software. The text is suitable for professionals and students at every level, including those with little to no background in computer science.
- Published
- 2003
11. Bioinformatics for Geneticists
- Author
-
Barnes, Michael R. and Gray, Ian C.
- Subjects
- Computer software, Computer programs, Computational biology, Bioinformatics, Genetics--Data processing
- Abstract
Bioinformatics for Geneticists describes a step by step approach to key bioinformatics and genetic analysis procedures, based upon practical experience gained after many years of direct bioinformatics support for laboratory geneticists. It features detailed case studies of problems and analytical approaches that are specific to the needs of the genetics researcher.
- Published
- 2003
12. Bioinformatics: Searching for Stars
- Author
-
O'Neill, Graeme
- Published
- 2011
13. Bioinformatics : Sequence and genome analysis
- Author
-
Mount, David W.
- Subjects
- Computational biology, Amino acid sequence, Genetics--Data processing, Bioinformatics, Nucleotide sequence
- Abstract
The application of computational methods to DNA and protein science is a new and exciting development in biology. Bioinformatics: Sequence and Genome Analysis is a comprehensive introduction to this emerging field of study. The book has many unique and valuable features: It is written for any biologist who wants to understand methods of sequence and structure analysis and how the necessary computer programs work; Sequence alignment, structure prediction, phylogenetic and gene prediction, database searching, and genome analysis are clearly explained and amply illustrated; Underlying algorithms and assumptions are clearly explained for the non-specialist; Examples are presented in simple numerical terms rather than complex formulas and notation; Theoretical underpinnings are linked to biological problems and their solutions; Extensive tables provide descriptions and Web sources for a broad range of publicly available software; An associated Website (www.BioinformaticsOnline.org), accessible free of charge by book purchasers, provides links to Internet sources referred to in the text, as well as problem sets for classroom use, and other useful material not included in the text. Based on the author's extensive experience as a molecular geneticist and bioinformaticist at the University of Arizona, this is a uniquely educational book, ideal as a laboratory reference for investigators and also as teaching reference for graduate and undergraduate students studying this fast-changing discipline.
- Published
- 2001
14. Linkage Analysis Genome Scans with Tens of Thousands of SNPS Gives Systematically Higher Power than Traditional Microsatellite-based Approaches under the Null Hypothesis
- Author
-
Hiekkalinna, T, Perola, M, and Terwilliger, JD
- Published
- 2004
15. Neural Networks and Genome Informatics
- Author
-
C.H. Wu and J.W. McLarty
- Subjects
- Bioinformatics, Genomes, Genetics--Data processing, Neural networks (Computer science), Computational biology
- Abstract
This book is a comprehensive reference in the field of neural networks and genome informatics. The tutorial of neural network foundations introduces basic neural network technology and terminology. This is followed by an in-depth discussion of special system designs for building neural networks for genome informatics, and broad reviews and evaluations of current state-of-the-art methods in the field. This book concludes with a description of open research problems and future research directions.
- Published
- 2000
16. Genetic Databases
- Author
-
Martin J. Bishop
- Subjects
- Gene expression, Genetics--Computer network resources, Amino acid sequence, Nucleotide sequence--Databases, Genetics--Data processing, Amino acid sequence--Databases, Computer networks, Genetic code
- Abstract
Computer access is the only way to retrieve up-to-date sequences, and this book shows researchers puzzled by the maze of URLs, sites, and searches how to use internet technology to find and analyze genetic data. The book describes the different types of databases, how to use a specific database to find a sequence that you need, and how to analyze the data to compare it with your own work. The content also covers sequence phenotype, mutation, and genetic linkage databases; simple repetitive DNA sequences; gene feature identification; and prediction of structure and function of proteins from sequence information. This book will be invaluable to those starting a career in life sciences research as well as to established researchers wishing to make full use of available resources. It describes a wide range of databases (DNA, RNA, protein, pathways, and gene expression), enables readers to access the information they need from databases on the web, and includes a directory of URLs for easy reference.
- Published
- 1999
17. Mapping QTLs for HDL-C, LDL-C and Associated Proteins and Identification of Underlying Genetic Variation: A Meta-analysis of Four Genome Scans
- Author
-
Heijmans, BT, Putter, H, Beekman, M, Lakenberg, N, van der Wijk, HJ, Whitfield, JB, Frants, RR, DeFaire, U, O'Connor, DT, Pedersen, NL, Martin, NG, Boomsma, DI, and Slagboom, PE
- Published
- 2004
18. Topics in Signal Processing: applications in genomics and genetics
- Author
-
Elmas, Abdulkadir
- Subjects
- Signal processing, FOS: Computer and information sciences, ComputingMethodologies_PATTERNRECOGNITION, Genetics--Data processing, Bioinformatics, Electrical engineering, Transcription factors, Genomics--Data processing, Signal processing--Statistical methods
- Abstract
The information in genomic or genetic data is influenced by various complex processes and appropriate mathematical modeling is required for studying the underlying processes and the data. This dissertation focuses on the formulation of mathematical models for certain problems in genomics and genetics studies and the development of algorithms for proposing efficient solutions. A Bayesian approach for the transcription factor (TF) motif discovery is examined and the extensions are proposed to deal with many interdependent parameters of the TF-DNA binding. The problem is described by statistical terms and a sequential Monte Carlo sampling method is employed for the estimation of unknown parameters. In particular, a class-based resampling approach is applied for the accurate estimation of a set of intrinsic properties of the DNA binding sites. Through statistical analysis of the gene expressions, a motif-based computational approach is developed for the inference of novel regulatory networks in a given bacterial genome. To deal with high false-discovery rates in the genome-wide TF binding predictions, the discriminative learning approaches are examined in the context of sequence classification, and a novel mathematical model is introduced to the family of kernel-based Support Vector Machines classifiers. Furthermore, the problem of haplotype phasing is examined based on the genetic data obtained from cost-effective genotyping technologies. Based on the identification and augmentation of a small and relatively more informative genotype set, a sparse dictionary selection algorithm is developed to infer the haplotype pairs for the sampled population. In a relevant context, to detect redundant information in the single nucleotide polymorphism (SNP) sites, the problem of representative (tag) SNP selection is introduced. 
An information theoretic heuristic is designed for the accurate selection of tag SNPs that capture the genetic diversity in a large sample set from multiple populations. The method is based on a multi-locus mutual information measure, reflecting a biological principle in the population genetics that is linkage disequilibrium.
- Published
- 2016
- Full Text
- View/download PDF
19. A method to identify differential expression profiles of time-course gene data with Fourier transformation
- Author
-
R.T. Ogden, Haseong Kim, and Jaehee Kim
- Subjects
- FOS: Computer and information sciences, False discovery rate, Time Factors, Bioinformatics, Gene expression--Data processing, Saccharomyces cerevisiae, Computational biology, Biology, Sensitivity and Specificity, Biochemistry, Structural Biology, Genetics, Cluster Analysis, Molecular Biology, Fourier series, Oligonucleotide Array Sequence Analysis, Fourier Analysis, Models, Genetic, Gene Expression Profiling, Applied Mathematics, Cell Cycle, Autocorrelation, Computer Science Applications, Genetics--Data processing, Fourier transform, FOS: Biological sciences, Gene expression, DNA microarray, Functional genomics, Research Article
- Abstract
Background: Time course gene expression experiments are an increasingly popular method for exploring biological processes. Temporal gene expression profiles provide an important characterization of gene function, as biological systems are both developmental and dynamic. With such data it is possible to study gene expression changes over time and thereby to detect differential genes. Much of the early work on analyzing time series expression data relied on methods developed originally for static data and thus there is a need for improved methodology. Since time series expression is a temporal process, its unique features such as autocorrelation between successive points should be incorporated into the analysis. Results: This work aims to identify genes that show different gene expression profiles across time. We propose a statistical procedure to discover gene groups with similar profiles using a nonparametric representation that accounts for the autocorrelation in the data. In particular, we first represent each profile in terms of a Fourier basis, and then we screen out genes that are not differentially expressed based on the Fourier coefficients. Finally, we cluster the remaining gene profiles using a model-based approach in the Fourier domain. We evaluate the screening results in terms of sensitivity, specificity, FDR and FNR, compare with the Gaussian process regression screening in a simulation study and illustrate the results by application to yeast cell-cycle microarray expression data with alpha-factor synchronization. The key elements of the proposed methodology: (i) representation of gene profiles in the Fourier domain; (ii) automatic screening of genes based on the Fourier coefficients and taking into account autocorrelation in the data, while controlling the false discovery rate (FDR); (iii) model-based clustering of the remaining gene profiles. Conclusions: Using this method, we identified a set of cell-cycle-regulated time-course yeast genes. 
The proposed method is general and can potentially be used to identify genes that share the same patterns or biological processes, and to help face the present and forthcoming challenges of data analysis in functional genomics.
- Published
- 2013
20. Quantifying recent variation and relatedness in human populations
- Author
-
Gusev, Alexander
- Subjects
- Genetics--Data processing, Human genetics--Data processing, Population genetics, FOS: Biological sciences, Population, Genetics, Human population genetics, Computer science
- Abstract
Advances in the genetic analysis of humans have revealed a surprising abundance of local relatedness between purportedly unrelated individuals. Where common mutations classically inform us of ancient relationships, such segments of pairwise identical by descent (IBD) sharing from a common ancestor are the observable traces of recent inter-mating. Combining these two distinct sources of information can help disentangle the complex genetic structure and flux in human populations. When considered together with a heritable trait, the segments can also be used to interrogate unascertained rare variation and help in locating trait-affecting loci. This work presents methods for comprehensive analysis of population-wide IBD and explores applications to disease and the understanding of recent genetic variation. We propose several strategies for efficient detection of IBD segments in population genotype data. Our novel seed-based algorithm, GERMLINE, can reduce the computational burden of finding pairwise segments from quadratic to nearly linear time in a general population. We demonstrate that this approach is several orders of magnitude faster than the available all-pairs methods while maintaining higher accuracy. Next, we extend the GERMLINE technique to process cohorts of unlimited size by adaptively adjusting the search mechanism to meet resource restrictions. We confirm its effectiveness with an analysis of 50,000 individuals where contemporary methods can only process a few thousand. One drawback of these two algorithms is the dependence on phased haplotype data as input, a constraint that becomes harder to satisfy with large populations. We propose a solution to this problem with an algorithm that analyzes genotype data directly by exploring all potential haplotypes and scoring each putative segment based on linkage disequilibrium.
This solution significantly outperforms available methods when applied to full sequence data and is computationally efficient enough to analyze thousands of sequenced genomes where current methods can only determine haplotypes for several hundred. Secondly, we outline two algorithms for analyzing available IBD segments to increase our understanding of rare variation and complex disease. Motivated by whole-genome sequencing, we present the INFOSTIP algorithm, which uses IBD segments to optimize the selection of individuals for complete population ascertainment. In simulations, we show that INFOSTIP selection can significantly increase variant inference accuracy over random sampling and posit inference of 60% of an isolated population from 1% optimally selected individuals. Seeking to move beyond pairwise IBD segment analysis, we describe the DASH algorithm, which groups shared segments into IBD "clusters" that are likely to be commonly co-inherited and uses them as proxies for un-typed variation. In simulated disease studies, we show this reference-free approach to be much more powerful for detecting rare causal variants than either traditional single-marker analysis or imputation from a general reference panel. Applying the DASH algorithm to disease traits from different populations, we identify multiple novel loci of association. Together, these novel techniques integrate the power of population and disease genetics.
- Published
- 2012
- Full Text
- View/download PDF
21. Bayesian approach for two model-selection-related bioinformatics problems.
- Abstract
The Bayesian approach is a powerful framework for inferring the parameters and structures of complicated probabilistic models from data. It is widely applied in many areas and is also well suited to bioinformatics problems because of their usually high complexity. In this thesis, new Bayesian models and computing methods are introduced to solve two bioinformatics problems, both of which are related to model selection.
The first problem concerns repeat pattern recognition. Tandem repeats occur frequently in DNA sequences. They are important for studying genome evolution and human disease. This thesis focuses on the case in which an unknown number of tandem repeat segments of the same pattern are dispersed throughout a sequence. A probabilistic generative model is introduced for the tandem repeats. Markov chain Monte Carlo algorithms are used to explore the posterior distribution in order to infer both the specific pattern of the tandem repeats and the location of the repeat segments. Furthermore, reversible jump Markov chain Monte Carlo (RJMCMC) algorithms are used to address the transdimensional model selection problem raised by the variable number of repeat segments.
The second part of this thesis deals with the conformational transitions of biomolecules. Because the function of a biomolecule is inherently related to its variable conformations, which can be grouped into a set of metastable or long-lived states, conformational transitions are important in biological processes. The 3D structure changes are generally obtained from molecular dynamics computer simulation. Based on the conformational transitions at the microstate level from molecular dynamics simulation, a Bayesian approach is developed to cluster the microstates into an uncertain number of metastable states, which gives rise to the model selection problem.
With these two problems, this thesis shows that the Bayesian approach to bioinformatics problems has advantages in terms of taking account of the inherent uncertainty in biological data, handling noisy or missing data, and dealing with the model selection problem.
Detailed summary in vernacular field only. Liang, Tong. Thesis (Ph.D.)--Chinese University of Hong Kong, 2013. Includes bibliographical references (leaves 120-130). Electronic reproduction. Hong Kong: Chinese University of Hong Kong, [2012]. System requirements: Adobe Acrobat Reader. Available via World Wide Web. Abstracts also in Chinese.
Contents:
Acknowledgement
Chapter 1. Introduction: 1.1 Motivation; 1.2 Statistical Background; 1.3 Tandem Repeats; 1.4 Conformational Space; 1.5 Outlines
Chapter 2. Preliminaries: 2.1 Bayesian Inference; 2.2 Markov chain Monte Carlo (2.2.1 Gibbs sampling; 2.2.2 Metropolis-Hastings algorithm; 2.2.3 Reversible Jump MCMC)
Chapter 3. Detection of Dispersed Short Tandem Repeats Using Reversible Jump MCMC: 3.1 Background; 3.2 Generative Model; 3.3 Statistical inference (3.3.1 Likelihood; 3.3.2 Prior Distributions; 3.3.3 Sampling from Posterior Distribution via RJMCMC; 3.3.4 Extra MCMC moves for better mixing; 3.3.5 The complete algorithm); 3.4 Experiments (3.4.1 Evaluation and comparison of the two RJMCMC versions using synthetic data; 3.4.2 Comparison with existing methods using synthetic data; 3.4.3 Sensitivity to Priors; 3.4.4 Real data experiment); 3.5 Discussion
Chapter 4. A Probabilistic Clustering Algorithm for Conformational Changes of Biomolecules: 4.1 Introduction (4.1.1 Molecular dynamic simulation; 4.1.2 Hierarchical Conformational Space; 4.1.3 Clustering Algorithms); 4.2 Generative Model (4.2.1 Model 1: Vanilla Model; 4.2.2 Model 2: Zero-Inflated Model; 4.2.3 Model 3: Constrained Model; 4.2.4 Model 4: Constrained and Zero-Inflated Model); 4.3 Statistical Inference for Vanilla Model (4.3.1 Priors; 4.3.2 Posterior distribution; 4.3.3 Collapsed Gibbs for Vanilla Model with a Fixed Number of Clusters; 4.3.4 Inference on the Number of Clusters; 4.3.5 Synthetic Data Study); 4.4 Statistical Inference for Zero-Inflated Model (4.4.1 Method 1; 4.4.2 Method 2; 4.4.3 Synthetic Data Study); 4.5 Statistical Inference for Constrained Model (4.5.1 Priors; 4.5.2 Posterior Distribution; 4.5.3 Collapsed Posterior Distribution; 4.5.4 Updating for Cluster Labels K; 4.5.5 Updating for Constrained Λ from Truncated Distribution; 4.5.6 Updating the Number of Clusters; 4.5.7 Uniform Background Parameters on Λ); 4.6 Real Data Experiments; 4.7 Discussion
Chapter 5. Conclusion and Future Work
Appendix A: A.1 Post-processing for indel treatment; A.2 Consistency Score; A.3 A Proof for Collapsed Posterior distribution in Constrained Model in Chapter 4; A.4 Estimated Transition Matrices for Alanine Dipeptide by Chodera et al. (2006)
Bibliography
http://library.cuhk.edu.hk/record=b5549716
Use of this resource is governed by the terms and conditions of the Creative Commons "Attribution-NonCommercial-NoDerivatives 4.0 International" License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
- Published
- 2013
22. PhenoGO: an integrated resource for the multiscale mining of clinical and biological data
- Author
-
Sam, Lee, Mendonça, Eneida, Li, Jianrong, Blake, Judith, Friedman, Carol, and Lussier, Yves
- Subjects
FOS: Computer and information sciences, Genetics--Data processing, Bioinformatics, Genetics--Research
- Abstract
The growing complexity of genome-scale experiments has made a highly computable, accurate, and comprehensive resource spanning multiple biological scales and viewpoints increasingly central. To meet this need, we have significantly extended the PhenoGO database with gene-disease specific annotations and included an additional ten species. This computationally derived resource is primarily intended to provide phenotypic context (cell type, tissue, organ, and disease) for mining existing associations between gene products and GO terms specified in the Gene Ontology databases. Automated natural language processing (BioMedLEE) and computational ontology (PhenOS) methods were used to derive these relationships from the literature, expanding the database with information from ten additional species to include over 600,000 phenotypic contexts spanning eleven species from five GO annotation databases. A comprehensive evaluation of the mappings (n = 300) found precision (positive predictive value) of 85% and recall (sensitivity) of 76%. Phenotypes are encoded in general-purpose ontologies such as the Cell Ontology and the Unified Medical Language System, and in specialized ontologies such as the Mouse Anatomy and the Mammalian Phenotype Ontology. A web portal has also been developed, allowing advanced filtering and querying of the database as well as download of the entire dataset: http://www.phenogo.org.
- Published
- 2009
23. Full Bayesian boolean network inference based on Markov chain Monte Carlo algorithms.
- Abstract
In bioinformatics, gene regulatory network inference is receiving intensive attention. Various network models have been used to describe gene regulatory relationships, including deterministic Boolean networks, probabilistic Boolean networks, Bayesian networks, etc. This dissertation focuses on data-based Boolean network reconstruction. Many methods have been proposed to infer this discrete network structure. For example, the REVEAL algorithm and the Best-Fit Extension method are popular and perform well for networks with a limited total number of nodes. However, existing methods do not fully account for the ubiquitous noise across the network or for structural uncertainty, which makes these algorithms unsatisfactory in real applications. In this dissertation, we use a full Bayesian approach to explore the space of probabilistic Boolean networks. To compare the relative fitness of networks to the input data, we design novel Markov chain Monte Carlo algorithms that jump among constrained networks according to the joint posterior probability. To facilitate the transdimensional moves, high proposal probabilities are assigned to more likely subnetwork models, as judged by chi-square tests in the preprocessing step.
Although it faces the same difficulty as other methods of searching a huge structure space, our algorithm is expected to reconstruct the Boolean network more accurately and comprehensively at an acceptable computational cost., Detailed summary in vernacular field only., Han, Shengtong., Thesis (Ph.D.)--Chinese University of Hong Kong, 2012., Includes bibliographical references (leaves 94-105)., Abstract also in Chinese., Chapter 1 --- Introduction --- p.1, Chapter 2 --- Technical Background --- p.5, Chapter 2.1 --- Classical Boolean Network --- p.5, Chapter 2.1.1 --- Definition --- p.5, Chapter 2.1.2 --- Dynamic Properties --- p.8, Chapter 2.2 --- Probabilistic Boolean Network --- p.9, Chapter 2.2.1 --- Definition --- p.9, Chapter 2.2.2 --- Dynamic Properties --- p.11, Chapter 3 --- Bayesian Framework for Boolean Network Modeling --- p.12, Chapter 3.1 --- Introduction --- p.12, Chapter 3.2 --- Network Modeling --- p.15, Chapter 3.2.1 --- Subnetwork Modeling --- p.15, Chapter 3.2.2 --- Full Network Modeling --- p.21, Chapter 3.2.3 --- Prior & Posterior Distributions --- p.23, Chapter 4 --- Network Inference-MCMC --- p.29, Chapter 4.1 --- Introduction --- p.29, Chapter 4.2 --- Proposal Subnetwork Construction --- p.30, Chapter 4.3 --- Network Structure Updating --- p.33, Chapter 4.3.1 --- Individual Network Updating Moves --- p.33, Chapter 4.3.2 --- Overall Network Updating Procedure --- p.37, Chapter 4.3.3 --- The Core Metropolis-Hastings Algorithm --- p.37, Chapter 4.4 --- Convergence Diagnostic --- p.40, Chapter 4.5 --- Model Selection --- p.41, Chapter 4.5.1 --- AIC, BIC --- p.42, Chapter 4.5.2 --- Bayes Factor --- p.42, Chapter 4.5.3 --- Reversible Jump MCMC --- p.43, Chapter 4.5.4 --- Bayesian Model Averaging --- p.45, Chapter 4.6 --- Computational Consideration --- p.46, Chapter 5 --- Numerical Studies --- p.49, Chapter 5.1 --- Simulation Studies --- p.49, Chapter 5.1.1 --- Simulation for Synthetic Network Models with Small Number of Nodes --- p.50, Chapter 5.1.2
--- Simulation for Synthetic Network Models with Large Number of Nodes --- p.64, Chapter 5.2 --- Comparison with Other Methods --- p.68, Chapter 5.2.1 --- Comparison Results --- p.71, Chapter 5.2.2 --- Discussion --- p.72, Chapter 6 --- Real Data Analysis --- p.74, Chapter 6.1 --- A Real Cell Cycle Network --- p.74, Chapter 6.2 --- Inference Result --- p.76, Chapter 6.3 --- Discussion --- p.79, Chapter 7 --- Summary and Discussion --- p.80, Bibliography --- p.83, Chapter A --- Data Pre-processing --- p.83, Chapter A.1 --- Data Discretization --- p.83, Chapter B --- Truth Tables for Commonly Used Basic Logic Functions --- p.85, Chapter C --- All Distribution Tables for Gene Pairs and Gene Triplets --- p.86, Chapter C.1 --- Distribution Assumptions for Input Gene Pairs --- p.86, Chapter C.2 --- Distribution Assumptions for Gene Triplets --- p.87, Chapter D --- Pseudo Code of the Algorithm --- p.91, Chapter D.1 --- Case 1: In-degree=1 --- p.91, Chapter D.2 --- Case 2: In-degree=2 --- p.93, Chapter D.3 --- Case 3: In-degree=0 --- p.93, http://library.cuhk.edu.hk/record=b5549487, Use of this resource is governed by the terms and conditions of the Creative Commons “Attribution-NonCommercial-NoDerivatives 4.0 International” License (http://creativecommons.org/licenses/by-nc-nd/4.0/)
- Published
- 2012
24. Construction of the model for the Genetic Analysis Workshop 14 simulated data
- Author
-
Greenberg, David, Zhang, Junying, Shmulewitz, Dvora, Strug, Lisa J., Zimmerman, Regina, Singh, Veena, and Marathe, Sudhir
- Subjects
Genetics--Data processing, Biometry, FOS: Biological sciences, Linkage (Genetics), Genetics, Gene mapping
- Abstract
The Genetic Analysis Workshop 14 simulated dataset was designed 1) To test the ability to find genes related to a complex disease (such as alcoholism). Such a disease may be given a variety of definitions by different investigators, have associated endophenotypes that are common in the general population, and is likely to be not one disease but a heterogeneous collection of clinically similar, but genetically distinct, entities. 2) To observe the effect on genetic analysis and gene discovery of a complex set of gene × gene interactions. 3) To allow comparison of microsatellite vs. large-scale single-nucleotide polymorphism (SNP) data. 4) To allow testing of association to identify the disease gene and the effect of moderate marker × marker linkage disequilibrium. 5) To observe the effect of different ascertainment/disease definition schemes on the analysis. Data was distributed in two forms. Data distributed to participants contained about 1,000 SNPs and 400 microsatellite markers. Internet-obtainable data consisted of a finer 10,000 SNP map, which also contained data on controls. While disease characteristics and parameters were constant, four "studies" used varying ascertainment schemes based on differing beliefs about disease characteristics. One of the studies contained multiplex two- and three-generation pedigrees with at least four affected members. The simulated disease was a psychiatric condition with many associated behaviors (endophenotypes), almost all of which were genetic in origin. The underlying disease model contained four major genes and two modifier genes. The four major genes interacted with each other to produce three different phenotypes, which were themselves heterogeneous. The population parameters were calibrated so that the major genes could be discovered by linkage analysis in most datasets. The association evidence was more difficult to calibrate but was designed to find statistically significant association in 50% of datasets. 
We also simulated some marker × marker linkage disequilibrium around some of the genes and also in areas without disease genes. We tried two different methods to simulate the linkage disequilibrium.
- Published
- 2005
25. OGA: an ontological tool of human phenotypes with genetic associations
- Author
-
Vishwesh P. Mokashi, David L. Hirschberg, Jesus Enrique Herrera-Galeano, and Jeffrey L. Solka
- Subjects
FOS: Computer and information sciences ,Genotype ,Bioinformatics ,Genome-wide association study ,Computational biology ,Biology ,Ontology--Methodology ,Genome ,Gene ,General Biochemistry, Genetics and Molecular Biology ,Association ,Gene mapping ,Genetic ,Genetics ,Humans ,Genetic Predisposition to Disease ,Genomes ,Genetic association ,Medicine(all) ,Genomes--Data processing ,Biochemistry, Genetics and Molecular Biology(all) ,business.industry ,Ontology ,Genome, Human ,Molecular Sequence Annotation ,General Medicine ,Phenotype ,Structured ,Genetics--Data processing ,Knowledge ,FOS: Biological sciences ,Human genome ,Personalized medicine ,business ,Software ,Research Article ,Genome-Wide Association Study
- Abstract
Background: The availability of genetic data has increased dramatically in recent years. The greatest value of these data is their potential for personalized medicine. Many new associations are reported every day from Genome Wide Association Studies (GWAS). However, robust, reproducible associations are elusive for some complex diseases. Ontologies present a potential way to distinguish between spurious associations and those with a potential influence on the phenotype. Such an approach would be based on finding associations of the same genetic variant with closely related, but distinct, phenotypes. This approach can be accomplished with a phenotype ontology that also holds genetic association data. Results: Here, we report a structured knowledge application to navigate and to facilitate the discovery of relationships between different phenotypes and their genetic associations. Conclusions: OGA allows users to (1) find the intersecting set of genes for phenotypes of interest and (2) find empirical p values for such observations. In addition, (3) OGA outperforms similar applications in the total number of concepts and genes mapped.
- Published
- 2013