1. Evaluation of cross-platform and interlaboratory concordance via consensus modelling of genomic measurements
- Author
Susan J. Clark, Stephen T. Bradford, Clare Stirzaker, Helen Speirs, Shalima S. Nair, Ruth Pidsley, Hugh J. French, Tim J. Peters, Wenjia Qu, Terence P. Speed, Katherine A. Giles, Jenny Z. Song, Aaron L. Statham, and Hilal Varinli
- Subjects
Statistics and Probability; Microarray; Computer science; Gene Expression Array; Locus (genetics); Genomics; Computational biology; Biochemistry; Genome; 03 medical and health sciences; Gene expression; Cross-platform; Humans; Molecular Biology; 030304 developmental biology; Oligonucleotide Array Sequence Analysis; Protocol (science); 0303 health sciences; Genome, Human; 030302 biochemistry & molecular biology; Computational Biology; Gold standard (test); Methylation; DNA Methylation; Genome Analysis; Original Papers; Computer Science Applications; Computational Mathematics; Identification (information); Computational Theory and Mathematics; DNA methylation; Nucleic acid sequencing; Human genome; Whole genome bisulfite sequencing; Software
- Abstract
Motivation: A synoptic view of the human genome benefits chiefly from the application of nucleic acid sequencing and microarray technologies. These platforms allow interrogation of patterns such as gene expression and DNA methylation at the vast majority of canonical loci, allowing granular insights and opportunities for validation of original findings. However, problems arise when validating against a “gold standard” measurement, since this immediately biases all subsequent measurements towards that particular technology or protocol. Since all genomic measurements are estimates, in the absence of a “gold standard” we instead empirically assess the measurement precision and sensitivity of a large suite of genomic technologies via a consensus modelling method called the row-linear model. This method is an application of the American Society for Testing and Materials Standard E691 for assessing interlaboratory precision and sources of variability across multiple testing sites. Both cross-platform and cross-locus comparisons can be made across all common loci, allowing identification of technology- and locus-specific tendencies.
Results: We assess technologies including the Infinium MethylationEPIC BeadChip, whole genome bisulfite sequencing (WGBS), two different RNA-Seq protocols (PolyA+ and Ribo-Zero) and five different gene expression array platforms. Each technology is thus characterised relative to the consensus. We showcase a number of applications of the row-linear model, including correlation with known interfering traits. We demonstrate a clear effect of cross-hybridisation on the sensitivity of Infinium methylation arrays. Additionally, we perform a true interlaboratory test on a set of samples interrogated on the same platform across twenty-one separate testing laboratories.
Availability and implementation: A full implementation of the row-linear model, plus extra functions for visualisation, is available in the R package consensus at https://github.com/timpeters82/consensus.
Supplementary information: Supplementary data are available at Bioinformatics online.
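Illustrative sketch (not part of the published abstract): the idea of characterising each technology relative to a consensus can be shown with a generic row-linear fit. The Python/NumPy code below is not the API of the consensus R package and is not taken from the paper. It assumes a matrix for a single locus whose rows are technologies (or laboratories) and whose columns are samples, takes the column means as the consensus, and fits one straight line per row against it. The function name `row_linear_fit` and the toy data are hypothetical. Reading the slope as a sensitivity-like quantity and the residual scatter as a precision-like quantity is likewise an assumption made for illustration.

```python
import numpy as np


def row_linear_fit(z):
    """Fit one straight line per row of z against the column-mean consensus.

    z: 2-D array, rows = technologies/laboratories (assumed layout),
       columns = samples. Needs at least 3 columns and a consensus
       that is not constant across columns.
    Returns (a, b, s): per-row average, slope, and residual standard deviation.
    """
    z = np.asarray(z, dtype=float)
    consensus = z.mean(axis=0)            # consensus value for each sample
    x = consensus - consensus.mean()      # centred consensus (sums to zero)
    sxx = np.sum(x ** 2)

    a = z.mean(axis=1)                    # row averages (height of each line)
    b = z @ x / sxx                       # least-squares slope of each row
    fitted = a[:, None] + np.outer(b, x)  # fitted line for every row
    resid = z - fitted
    dof = z.shape[1] - 2                  # two parameters estimated per row
    s = np.sqrt(np.sum(resid ** 2, axis=1) / dof)
    return a, b, s


# Hypothetical toy data: three "technologies" measuring five samples.
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
z = np.vstack([base, 2.0 * base, np.full(5, 3.0)])
a, b, s = row_linear_fit(z)
```

Because the consensus is the column mean of the same matrix, the fitted slopes always average to one, so a slope above one marks a row that responds more strongly than the consensus and a slope below one marks a row that responds less strongly.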
- Published
- 2018