33 results on '"Andrian Yang"'
Search Results
2. StarmapVis: An interactive and narrative visualisation tool for single-cell and spatial data
- Author
-
Shichao Ma, Xiunan Fang, Yu Yao, Jianfu Li, Daniel C. Morgan, Yongyan Xia, Crystal S.M. Kwok, Michelle C.K. Lo, Dickson M.D. Siu, Kevin K. Tsia, Andrian Yang, and Joshua W.K. Ho
- Subjects
Web application, Single-cell data visualisation, Spatial-single cell integration, Narrative visualisation, Biotechnology, TP248.13-248.65
- Abstract
Current single-cell visualisation techniques project high-dimensional data into ‘map’ views to identify high-level structures such as cell clusters and trajectories. New tools are needed to allow traversal through the high dimensionality of single-cell data and exploration of the single-cell local neighbourhood. StarmapVis is a web application that displays an interactive downstream analysis of single-cell expression or spatial transcriptomic data. The concise user interface runs in modern web browsers and lets users explore a variety of viewing angles unavailable in 2D media. Interactive scatter plots display clustering information, while trajectories and cross-comparisons among different coordinates are displayed as connectivity networks. Automated animation of the camera view is a unique feature of our tool. StarmapVis also offers an animated transition between two-dimensional spatial omic data and three-dimensional single-cell coordinates. The usability of StarmapVis is demonstrated on four data sets. StarmapVis is available at: https://holab-hku.github.io/starmapVis.
- Published
- 2023
3. Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads [version 2; peer review: 2 approved]
- Author
-
Andrian Yang, Michael Troup, Joshua Y. S. Tang, and Joshua W. K. Ho
- Subjects
RNA-seq, Read alignment, Unaligned read, Read recovery, Medicine, Science
- Abstract
Read alignment is an important step in RNA-seq analysis as the result of alignment forms the basis for downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align all the reads which should have been aligned, a problem we term the false-negative non-alignment problem. Here we present Scavenger, a Python-based bioinformatics pipeline for recovering unaligned reads using a novel mechanism in which a putative alignment location is discovered based on sequence similarity between aligned and unaligned reads. We showed that Scavenger could recover unaligned reads in a range of simulated and real RNA-seq datasets, including single-cell RNA-seq data. We found that recovered reads tend to contain more genetic variants with respect to the reference genome compared to previously aligned reads, indicating that divergence between personal and reference genomes plays a role in the false-negative non-alignment problem. Even when the number of recovered reads is relatively small compared to the total number of reads, the addition of these recovered reads can impact downstream analyses, especially in terms of estimating the expression and differential expression of lowly expressed genes, such as pseudogenes.
- Published
- 2022
4. Cloud accelerated alignment and assembly of full-length single-cell RNA-seq data using Falco
- Author
-
Andrian Yang, Abhinav Kishore, Benjamin Phipps, and Joshua W. K. Ho
- Subjects
Single-cell RNA-seq, Cloud computing, Falco, Alignment, Transcript assembly, Biotechnology, TP248.13-248.65, Genetics, QH426-470
- Abstract
Background: Read alignment and transcript assembly are the core of RNA-seq analysis for transcript isoform discovery. Nonetheless, current tools are not designed to be scalable for analysis of full-length bulk or single cell RNA-seq (scRNA-seq) data. The previous version of our cloud-based tool Falco focuses only on RNA-seq read counting and does not allow for more flexible steps such as alignment and read assembly. Results: The Falco framework can harness the parallel and distributed computing environment in modern cloud platforms to accelerate read alignment and transcript assembly of full-length bulk RNA-seq and scRNA-seq data. There are two new modes in Falco: alignment-only and transcript assembly. In the alignment-only mode, Falco can speed up the alignment process by 2.5–16.4x based on two public scRNA-seq datasets when compared to alignment on a highly optimised standalone computer. Furthermore, it also provides a 10x average speed-up compared to alignment using Rail-RNA, a published cloud-enabled tool for read alignment. In the transcript assembly mode, Falco can speed up the transcript assembly process by 1.7–16.5x compared to performing transcript assembly on a highly optimised computer. Conclusion: Falco is a significantly updated open source big data processing framework that enables scalable and accelerated alignment and assembly of full-length scRNA-seq data on the cloud. The source code can be found at https://github.com/VCCRI/Falco.
- Published
- 2019
5. Genetic screening reveals phospholipid metabolism as a key regulator of the biosynthesis of the redox-active lipid coenzyme Q
- Author
-
Anita Ayer, Daniel J. Fazakerley, Cacang Suarna, Ghassan J. Maghzal, Diba Sheipouri, Kevin J. Lee, Michelle C. Bradley, Lucía Fernández-del-Rio, Sergey Tumanov, Stephanie MY. Kong, Jelske N. van der Veen, Andrian Yang, Joshua W.K. Ho, Steven G. Clarke, David E. James, Ian W. Dawes, Dennis E. Vance, Catherine F. Clarke, René L. Jacobs, and Roland Stocker
- Subjects
Coenzyme Q, Mitochondria, PEMT, Insulin resistance, S-adenosylmethionine, S-adenosylhomocysteine, Medicine (General), R5-920, Biology (General), QH301-705.5
- Abstract
Mitochondrial energy production and function rely on optimal concentrations of the essential redox-active lipid, coenzyme Q (CoQ). CoQ deficiency results in mitochondrial dysfunction associated with increased mitochondrial oxidative stress and a range of pathologies. What drives CoQ deficiency in many of these pathologies is unknown, just as there currently is no effective therapeutic strategy to overcome CoQ deficiency in humans. To date, large-scale studies aimed at systematically interrogating endogenous systems that control CoQ biosynthesis and their potential utility to treat disease have not been carried out. Therefore, we developed a quantitative high-throughput method to determine CoQ concentrations in yeast cells. Applying this method to the Yeast Deletion Collection as a genome-wide screen, 30 genes not known previously to regulate cellular concentrations of CoQ were discovered. In combination with untargeted lipidomics and metabolomics, phosphatidylethanolamine N-methyltransferase (PEMT) deficiency was confirmed as a positive regulator of CoQ synthesis, the first identified to date. Mechanistically, PEMT deficiency alters mitochondrial concentrations of one-carbon metabolites, characterized by an increase in the S-adenosylmethionine to S-adenosylhomocysteine (SAM-to-SAH) ratio that reflects mitochondrial methylation capacity, drives CoQ synthesis, and is associated with a decrease in mitochondrial oxidative stress. The newly described regulatory pathway appears evolutionarily conserved, as ablation of PEMT using antisense oligonucleotides increases mitochondrial CoQ in mouse-derived adipocytes, which translates to improved glucose utilization by these cells and protection of mice from high-fat diet-induced insulin resistance. Our studies reveal a previously unrecognized relationship between two spatially distinct lipid pathways with potential implications for the treatment of CoQ deficiencies, mitochondrial oxidative stress/dysfunction, and associated diseases.
- Published
- 2021
6. Scalability and Validation of Big Data Bioinformatics Software
- Author
-
Andrian Yang, Michael Troup, and Joshua W.K. Ho
- Subjects
Biotechnology, TP248.13-248.65
- Abstract
This review examines two important aspects that are central to modern big data bioinformatics analysis – software scalability and validity. We argue that not only are the issues of scalability and validation common to all big data bioinformatics analyses, they can be tackled by conceptually related methodological approaches, namely divide-and-conquer (scalability) and multiple executions (validation). Scalability is defined as the ability for a program to scale based on workload. It has always been an important consideration when developing bioinformatics algorithms and programs. Nonetheless the surge of volume and variety of biological and biomedical data has posed new challenges. We discuss how modern cloud computing and big data programming frameworks such as MapReduce and Spark are being used to effectively implement divide-and-conquer in a distributed computing environment. Validation of software is another important issue in big data bioinformatics that is often ignored. Software validation is the process of determining whether the program under test fulfils the task for which it was designed. Determining the correctness of the computational output of big data bioinformatics software is especially difficult due to the large input space and complex algorithms involved. We discuss how state-of-the-art software testing techniques that are based on the idea of multiple executions, such as metamorphic testing, can be used to implement an effective bioinformatics quality assurance strategy. We hope this review will raise awareness of these critical issues in bioinformatics.
- Published
- 2017
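The abstract of entry 6 argues that metamorphic testing, a technique based on multiple executions, can check bioinformatics software whose correct output is hard to know in advance. A minimal sketch of that idea is given here in Python. The k-mer counter standing in for the program under test, and both metamorphic relations, are hypothetical illustrations chosen for this listing and are not taken from the review.

```python
import random
from collections import Counter


def count_kmers(reads, k=3):
    """Toy program under test (hypothetical): count every k-mer in a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def check_permutation_relation(program, reads, seed=0):
    """Relation 1: reordering the input reads must not change the output."""
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)
    return program(reads) == program(shuffled)


def check_duplication_relation(program, reads):
    """Relation 2: feeding every read twice must exactly double every count."""
    source = program(reads)               # source test case
    follow_up = program(reads + reads)    # follow-up test case
    same_keys = set(source) == set(follow_up)
    return same_keys and all(follow_up[kmer] == 2 * n for kmer, n in source.items())


if __name__ == "__main__":
    reads = ["ACGTAC", "GTACGT", "TTACGA"]
    print("permutation relation holds:", check_permutation_relation(count_kmers, reads))
    print("duplication relation holds:", check_duplication_relation(count_kmers, reads))
```

Neither check needs a known-correct answer: each compares two runs of the same program on related inputs, so a violated relation reveals a fault without an oracle.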
7. Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads [version 1; peer review: 2 approved]
- Author
-
Andrian Yang, Joshua Y. S. Tang, Michael Troup, and Joshua W. K. Ho
- Subjects
Medicine, Science
- Abstract
Read alignment is an important step in RNA-seq analysis as the result of alignment forms the basis for downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align all the reads which should have been aligned, a problem we term the false-negative non-alignment problem. Here we present Scavenger, a Python-based bioinformatics pipeline for recovering unaligned reads using a novel mechanism in which a putative alignment location is discovered based on sequence similarity between aligned and unaligned reads. We showed that Scavenger could recover unaligned reads in a range of simulated and real RNA-seq datasets, including single-cell RNA-seq data. We found that recovered reads tend to contain more genetic variants with respect to the reference genome compared to previously aligned reads, indicating that divergence between personal and reference genomes plays a role in the false-negative non-alignment problem. Even when the number of recovered reads is relatively small compared to the total number of reads, the addition of these recovered reads can impact downstream analyses, especially in terms of estimating the expression and differential expression of lowly expressed genes, such as pseudogenes.
- Published
- 2019
8. How difficult is inference of mammalian causal gene regulatory networks?
- Author
-
Djordje Djordjevic, Andrian Yang, Armella Zadoorian, Kevin Rungrugeecharoen, and Joshua W K Ho
- Subjects
Medicine, Science
- Abstract
Gene regulatory networks (GRNs) play a central role in systems biology, especially in the study of mammalian organ development. One key question remains largely unanswered: Is it possible to infer mammalian causal GRNs using observable gene co-expression patterns alone? We assembled two mouse GRN datasets (embryonic tooth and heart) and matching microarray gene expression profiles to systematically investigate the difficulties of mammalian causal GRN inference. The GRNs were assembled based on > 2,000 pieces of experimental genetic perturbation evidence from manually reading > 150 primary research articles. Each piece of perturbation evidence records the qualitative change of the expression of one gene following knock-down or over-expression of another gene. Our data have thorough annotation of tissue types and embryonic stages, as well as the type of regulation (activation, inhibition and no effect), which uniquely allows us to estimate both sensitivity and specificity of the inference of tissue specific causal GRN edges. Using these unprecedented datasets, we found that gene co-expression does not reliably distinguish true positive from false positive interactions, making inference of GRNs in mammalian development very difficult. Nonetheless, if we have expression profiling data from genetic or molecular perturbation experiments, such as gene knock-out or signalling stimulation, it is possible to use the set of differentially expressed genes to recover causal regulatory relationships with good sensitivity and specificity. Our results support the importance of using perturbation experimental data in causal network reconstruction. Furthermore, we showed that causal gene regulatory relationships can be highly cell type or developmental stage specific, suggesting the importance of employing expression profiles from homogeneous cell populations. This study provides essential datasets and empirical evidence to guide the development of new GRN inference methods for mammalian organ development.
- Published
- 2014
9. Harnessing Multiple Source Test Cases in Metamorphic Testing: A Case Study in Bioinformatics.
- Author
-
Joshua Y. S. Tang, Andrian Yang, Tsong Yueh Chen, and Joshua Wing Kei Ho
- Published
- 2017
10. A cloud-based framework for applying metamorphic testing to a bioinformatics pipeline.
- Author
-
Michael Troup, Andrian Yang, Amir Hossein Kamali, Eleni Giannoulatou, Tsong Yueh Chen, and Joshua Wing Kei Ho
- Published
- 2016
11. Locus-specific expression of transposable elements in single cells with CELLO-seq
- Author
-
Neil Brockdorff, Aaron T. L. Lun, Joseph S Bowness, John C. Marioni, Daniel J. Gaffney, Rebecca V Berrens, Guocheng Lan, Florian Bieberich, Andrian Yang, Maria Imaz, Cheuk-Ting Law, and Christopher E. Laumer
- Subjects
Transposable element, 0303 health sciences, Cell, Biomedical Engineering, food and beverages, Bioengineering, Locus (genetics), Computational biology, Biology, Applied Microbiology and Biotechnology, 03 medical and health sciences, 0302 clinical medicine, medicine.anatomical_structure, Expression (architecture), Complementary DNA, medicine, Molecular Medicine, Human Induced Pluripotent Stem Cells, 030217 neurology & neurosurgery, 030304 developmental biology, Biotechnology
- Abstract
Transposable elements (TEs) regulate diverse biological processes, from early development to cancer. Expression of young TEs is difficult to measure with next-generation, single-cell sequencing technologies because their highly repetitive nature means that short complementary DNA reads cannot be unambiguously mapped to a specific locus. Single CELl LOng-read RNA-sequencing (CELLO-seq) combines long-read single cell RNA-sequencing with computational analyses to measure TE expression at unique loci. We used CELLO-seq to assess the widespread expression of TEs in two-cell mouse blastomeres as well as in human induced pluripotent stem cells. Across both species, old and young TEs showed evidence of locus-specific expression with simulations demonstrating that only a small number of very young elements in the mouse could not be mapped back to the reference with high confidence. Exploring the relationship between the expression of individual elements and putative regulators revealed large heterogeneity, with TEs within a class showing different patterns of correlation and suggesting distinct regulatory mechanisms.
- Published
- 2021
12. Inter-gastruloid heterogeneity revealed by single cell transcriptomics time course: implications for organoid based perturbation studies
- Author
-
Leah U. Rosen, L. Carine Stapel, Ricard Argelaguet, Charlie George Barker, Andrian Yang, Wolf Reik, and John C. Marioni
- Abstract
Recent advances in organoid and genome editing technologies are allowing for perturbation experiments at an unprecedented scale. However, before doing such experiments it is important to understand the gene expression profile in each of the organoid’s cells, as well as how much heterogeneity there is between individual organoids. Here we characterise an organoid model of mouse gastrulation called gastruloids using single cell RNA-sequencing of individual organoids at half-day intervals between day 3 and day 5 of differentiation (roughly corresponding to E6.5-E8.75 in vivo). Our study reveals multiple differentiation trajectories that have hitherto not been characterised in gastruloids. Intriguingly, we observe that individual gastruloids displayed a strong bias towards producing either mesodermal (largely somitic) or ectodermal (specifically neural) cell types. This bifurcation is already seen at the earliest sampled time point, and is characterised by increased activity of WNT-associated pathways in mesodermally-biased gastruloids as compared to neurally-biased gastruloids. Notably, at day 5, mesodermal gastruloids show an increase in the proportion of neural cells, while neural gastruloids do not produce notably more mesodermal cells. This is in line with previous studies on how the balance between these cell types is regulated. We demonstrate using in silico simulations that without proper understanding of the inter-organoid heterogeneity, perturbation experiments have either very high false positive or negative rates, depending on the statistical model used. 
Thus in future studies, modelling of inter-organoid heterogeneity will be crucial when designing organoid-based perturbation studies.
Highlights
• A single cell RNA-sequencing time course of day 3 to day 5 mouse gastruloids reveals multiple mesodermal and neural differentiation trajectories hitherto uncharacterised in gastruloids
• Single gastruloid, single cell RNA-sequencing of mouse gastruloids reveals that gastruloids are either mesodermally- or neurally-biased
• The two classes of gastruloid arise from differences in response strength to the WNT-agonist chiron
• At day 5, mesodermal gastruloids start making more neural cells, while neural gastruloids do not make more mesodermal cells, aligning with previously studied in vivo feedback loops
• We show using simulations that understanding inter-organoid heterogeneity is a crucial consideration in the design and analysis of well-powered organoid-based perturbation studies
- Published
- 2022
13. CHDgene: A Curated Database for Congenital Heart Disease Genes
- Author
-
Andrian Yang, Dimuthu Alankarage, Hartmut Cuny, Eddie K.K. Ip, Moran Almog, Jessica Lu, Debjani Das, Annabelle Enriquez, Justin O. Szot, David T. Humphreys, Gillian M. Blue, Joshua W.K. Ho, David S. Winlaw, Sally L. Dunwoodie, and Eleni Giannoulatou
- Subjects
Heart Defects, Congenital ,Humans ,Exome ,General Medicine
- Published
- 2022
14. A reference human induced pluripotent stem cell line for large-scale collaborative studies
- Author
-
Caroline B. Pantazis, Andrian Yang, Erika Lara, Justin A. McDonough, Cornelis Blauwendraat, Lirong Peng, Hideyuki Oguro, Jitendra Kanaujiya, Jizhong Zou, David Sebesta, Gretchen Pratt, Erin Cross, Jeffrey Blockwick, Philip Buxton, Lauren Kinner-Bibeau, Constance Medura, Christopher Tompkins, Stephen Hughes, Marianita Santiana, Faraz Faghri, Mike A. Nalls, Daniel Vitale, Shannon Ballard, Yue A. Qi, Daniel M. Ramos, Kailyn M. Anderson, Julia Stadler, Priyanka Narayan, Jason Papademetriou, Luke Reilly, Matthew P. Nelson, Sanya Aggarwal, Leah U. Rosen, Peter Kirwan, Venkat Pisupati, Steven L. Coon, Sonja W. Scholz, Theresa Priebe, Miriam Öttl, Jian Dong, Marieke Meijer, Lara J.M. Janssen, Vanessa S. Lourenco, Rik van der Kant, Dennis Crusius, Dominik Paquet, Ana-Caroline Raulin, Guojun Bu, Aaron Held, Brian J. Wainger, Rebecca M.C. Gabriele, Jackie M. Casey, Selina Wray, Dad Abu-Bonsrah, Clare L. Parish, Melinda S. Beccari, Don W. Cleveland, Emmy Li, Indigo V.L. Rose, Martin Kampmann, Carles Calatayud Aristoy, Patrik Verstreken, Laurin Heinrich, Max Y. Chen, Birgitt Schüle, Dan Dou, Erika L.F. Holzbaur, Maria Clara Zanellati, Richa Basundra, Mohanish Deshmukh, Sarah Cohen, Richa Khanna, Malavika Raman, Zachary S. Nevin, Madeline Matia, Jonas Van Lent, Vincent Timmerman, Bruce R. Conklin, Katherine Johnson Chase, Ke Zhang, Salome Funes, Daryl A. Bosco, Lena Erlebach, Marc Welzer, Deborah Kronenberg-Versteeg, Guochang Lyu, Ernest Arenas, Elena Coccia, Lily Sarrafha, Tim Ahfeldt, John C. Marioni, William C. Skarnes, Mark R. Cookson, Michael E. Ward, and Florian T. Merkle (ORCID 0000-0002-8513-2998)
- Subjects
Gene Editing, p53, iPSC, Induced Pluripotent Stem Cells, Cell Differentiation, Cell Biology, differentiation, single-cell, reference, whole-genome, karyotype, stem cell, pluripotent, ddc:570, CRISPR, Genetics, Molecular Medicine, Humans, Biological Assay, Human medicine, Biology
- Abstract
Human induced pluripotent stem cell (iPSC) lines are a powerful tool for studying development and disease, but the considerable phenotypic variation between lines makes it challenging to replicate key findings and integrate data across research groups. To address this issue, we sub-cloned candidate human iPSC lines and deeply characterized their genetic properties using whole genome sequencing, their genomic stability upon CRISPR-Cas9-based gene editing, and their phenotypic properties including differentiation to commonly used cell types. These studies identified KOLF2.1J as an all-around well-performing iPSC line. We then shared KOLF2.1J with groups around the world who tested its performance in head-to-head comparisons with their own preferred iPSC lines across a diverse range of differentiation protocols and functional assays. On the strength of these findings, we have made KOLF2.1J and its gene-edited derivative clones readily accessible to promote the standardization required for large-scale collaborative science in the stem cell field.
- Published
- 2022
15. Genetic screening reveals phospholipid metabolism as a key regulator of the biosynthesis of the redox-active lipid coenzyme Q
- Author
-
Roland Stocker, Lucía Fernández-del-Río, Andrian Yang, René L. Jacobs, Diba Sheipouri, Ghassan J. Maghzal, Jelske N. van der Veen, Cacang Suarna, Sergey Tumanov (ORCID 0000-0002-0557-3153), Kevin J. Lee, Steven Clarke, Ian W. Dawes, Daniel J. Fazakerley (ORCID 0000-0001-8241-2903), Dennis E. Vance, Anita Ayer, Joshua W. K. Ho, Catherine F. Clarke, Michelle C. Bradley, David E. James, and Stephanie M Y Kong
- Subjects
Medicine (General), S-Adenosylmethionine, Mitochondrial Diseases, Ubiquinone, Phosphatidylethanolamine N-Methyltransferase, Clinical Biochemistry, Mitochondrion, Medical Biochemistry and Metabolomics, medicine.disease_cause, Biochemistry, chemistry.chemical_compound, Mice, Biology (General), Phospholipids, chemistry.chemical_classification, 0303 health sciences, Chemistry, 030302 biochemistry & molecular biology, food and beverages, Pharmacology and Pharmaceutical Sciences, 3. Good health, Cell biology, Mitochondria, 5.1 Pharmaceuticals, Development of treatments and therapeutic interventions, Oxidation-Reduction, Research Paper, QH301-705.5, S-adenosylhomocysteine, 03 medical and health sciences, R5-920, Metabolomics, PEMT, Biosynthesis, Lipidomics, medicine, Genetics, Animals, Genetic Testing, Metabolic and endocrine, 030304 developmental biology, Nutrition, Phosphatidylethanolamine, Reactive oxygen species, S-adenosylmethionine, Organic Chemistry, Coenzyme Q, Insulin resistance, Pemt, Coenzyme Q – cytochrome c reductase, Biochemistry and Cell Biology, Oxidative stress
- Abstract
Mitochondrial energy production and function rely on optimal concentrations of the essential redox-active lipid, coenzyme Q (CoQ). CoQ deficiency results in mitochondrial dysfunction associated with increased mitochondrial oxidative stress and a range of pathologies. What drives CoQ deficiency in many of these pathologies is unknown, just as there currently is no effective therapeutic strategy to overcome CoQ deficiency in humans. To date, large-scale studies aimed at systematically interrogating endogenous systems that control CoQ biosynthesis and their potential utility to treat disease have not been carried out. Therefore, we developed a quantitative high-throughput method to determine CoQ concentrations in yeast cells. Applying this method to the Yeast Deletion Collection as a genome-wide screen, 30 genes not known previously to regulate cellular concentrations of CoQ were discovered. In combination with untargeted lipidomics and metabolomics, phosphatidylethanolamine N-methyltransferase (PEMT) deficiency was confirmed as a positive regulator of CoQ synthesis, the first identified to date. Mechanistically, PEMT deficiency alters mitochondrial concentrations of one-carbon metabolites, characterized by an increase in the S-adenosylmethionine to S-adenosylhomocysteine (SAM-to-SAH) ratio that reflects mitochondrial methylation capacity, drives CoQ synthesis, and is associated with a decrease in mitochondrial oxidative stress. The newly described regulatory pathway appears evolutionarily conserved, as ablation of PEMT using antisense oligonucleotides increases mitochondrial CoQ in mouse-derived adipocytes, which translates to improved glucose utilization by these cells and protection of mice from high-fat diet-induced insulin resistance. Our studies reveal a previously unrecognized relationship between two spatially distinct lipid pathways with potential implications for the treatment of CoQ deficiencies, mitochondrial oxidative stress/dysfunction, and associated diseases.
Highlights
• Mitochondrial CoQ deficiency results in oxidative stress and a range of pathologies
• The drivers of mitochondrial CoQ deficiency remain largely unknown
• PEMT deficiency is the first identified positive regulator of mitochondrial CoQ
• PEMT deficiency increases CoQ by increasing the mitochondrial SAM-to-SAH ratio
• PEMT deficiency prevents insulin resistance by increasing mitochondrial CoQ
- Published
- 2021
16. A reference induced pluripotent stem cell line for large-scale collaborative studies
- Author
-
Caroline B. Pantazis, Andrian Yang, Erika Lara, Justin A. McDonough, Cornelis Blauwendraat, Lirong Peng, Hideyuki Oguro, Jitendra Kanaujiya, Jizhong Zou, David Sebesta, Gretchen Pratt, Erin Cross, Jeffrey Blockwick, Philip Buxton, Lauren Kinner-Bibeau, Constance Medura, Christopher Tompkins, Stephen Hughes, Marianita Santiana, Faraz Faghri, Mike A. Nalls, Daniel Vitale, Shannon Ballard, Yue A. Qi, Daniel M. Ramos, Kailyn M. Anderson, Julia Stadler, Priyanka Narayan, Jason Papademetriou, Luke Reilly, Matthew P. Nelson, Sanya Aggarwal, Leah U. Rosen, Peter Kirwan, Venkat Pisupati, Steven L. Coon, Sonja W. Scholz, Theresa Priebe, Miriam Öttl, Jian Dong, Marieke Meijer, Lara J.M. Janssen, Vanessa S. Lourenco, Rik van der Kant, Dennis Crusius, Dominik Paquet, Ana-Caroline Raulin, Guojun Bu, Aaron Held, Brian J. Wainger, Rebecca M.C. Gabriele, Jackie M Casey, Selina Wray, Dad Abu-Bonsrah, Clare L. Parish, Melinda S. Beccari, Don W. Cleveland, Emmy Li, Indigo V.L. Rose, Martin Kampmann, Carles Calatayud Aristoy, Patrik Verstreken, Laurin Heinrich, Max Y. Chen, Birgitt Schüle, Dan Dou, Erika L.F. Holzbaur, Maria Clara Zanellati, Richa Basundra, Mohanish Deshmukh, Sarah Cohen, Richa Khanna, Malavika Raman, Zachary S. Nevin, Madeline Matia, Jonas Van Lent, Vincent Timmerman, Bruce R. Conklin, Katherine Johnson Chase, Ke Zhang, Salome Funes, Daryl A. Bosco, Lena Erlebach, Marc Welzer, Deborah Kronenberg-Versteeg, Guochang Lyu, Ernest Arenas, Elena Coccia, Lily Sarrafha, Tim Ahfeldt, John C. Marioni, William C. Skarnes, Mark R. Cookson, Michael E. Ward, and Florian T. Merkle
- Abstract
Human induced pluripotent stem cell (iPSC) lines are a powerful tool for studying development and disease, but the considerable phenotypic variation between lines makes it challenging to replicate key findings and integrate data across research groups. To address this issue, we sub-cloned candidate iPSC lines and deeply characterised their genetic properties using whole genome sequencing, their genomic stability upon CRISPR/Cas9-based gene editing, and their phenotypic properties including differentiation to commonly-used cell types. These studies identified KOLF2.1J as an all-around well-performing iPSC line. We then shared KOLF2.1J with groups around the world who tested its performance in head-to-head comparisons with their own preferred iPSC lines across a diverse range of differentiation protocols and functional assays. On the strength of these findings, we have made KOLF2.1J and hundreds of its gene-edited derivative clones readily accessible to promote the standardization required for large-scale collaborative science in the stem cell field.
Summary: The authors of this collaborative study deeply characterized human induced pluripotent stem cell (iPSC) lines to rationally select a clonally-derived cell line that performs well across multiple modalities. KOLF2.1J was identified as a candidate reference cell line based on single-cell analysis of its gene expression in the pluripotent state, whole genome sequencing, genomic stability after highly efficient CRISPR-mediated gene editing, integrity of the p53 pathway, and the efficiency with which it differentiated into multiple target cell populations. Since it is deeply characterized and can be readily acquired, KOLF2.1J is an attractive reference cell line for groups working with iPSCs.
- Published
- 2021
17. Transposable element expression at unique loci in single cells with CELLO-seq
- Author
-
Daniel J. Gaffney, Andrian Yang, John C. Marioni, Christopher E. Laumer, G. Lan, M. Imaz, F. Bieberich, Aaron T. L. Lun, Rebecca V. Berrens, and C.-T. Law
- Subjects
Gene isoform, Transposable element, medicine.anatomical_structure, Complementary DNA, Cell, medicine, RNA, Locus (genetics), Computational biology, Allele, Biology, Gene
- Abstract
The role of Transposable Elements (TEs) in regulating diverse biological processes, from early development to cancer, is becoming increasingly appreciated. However, unlike for other biological processes, next generation single-cell sequencing technologies are ill-suited for assaying TE expression: in particular, the highly repetitive nature of TEs means that short cDNA reads cannot be unambiguously mapped to a specific locus. Consequently, it is extremely challenging to understand the mechanisms by which TE expression is regulated and how TEs might themselves regulate other protein coding genes. To resolve this, we introduce CELLO-seq, a novel method and computational framework for performing long-read RNA sequencing at single cell resolution. CELLO-seq allows for full-length RNA sequencing and enables measurement of allelic, isoform and TE expression at unique loci. We use CELLO-seq to assess the widespread expression of TEs in 2-cell mouse blastomeres as well as human induced pluripotent stem cells (hiPSCs). Across both species, old and young TEs showed evidence of locus-specific expression, with simulations demonstrating that only a small number of very young elements in the mouse could not be mapped back to the reference with high confidence. Exploring the relationship between the expression of individual elements and putative regulators revealed surprising heterogeneity, with TEs within a class showing different patterns of correlation, suggesting distinct regulatory mechanisms.
- Published
- 2020
18. Locus-specific expression of transposable elements in single cells with CELLO-seq
- Author
-
Rebecca V. Berrens, Andrian Yang, Christopher E. Laumer, Aaron T. L. Lun, Florian Bieberich, Cheuk-Ting Law, Guocheng Lan, Maria Imaz, Joseph S. Bowness, Neil Brockdorff, Daniel J. Gaffney, and John C. Marioni
- Subjects
Mice ,Induced Pluripotent Stem Cells ,DNA Transposable Elements ,Animals ,Humans ,RNA - Abstract
Transposable elements (TEs) regulate diverse biological processes, from early development to cancer. Expression of young TEs is difficult to measure with next-generation, single-cell sequencing technologies because their highly repetitive nature means that short complementary DNA reads cannot be unambiguously mapped to a specific locus. Single CELl LOng-read RNA-sequencing (CELLO-seq) combines long-read single cell RNA-sequencing with computational analyses to measure TE expression at unique loci. We used CELLO-seq to assess the widespread expression of TEs in two-cell mouse blastomeres as well as in human induced pluripotent stem cells. Across both species, old and young TEs showed evidence of locus-specific expression with simulations demonstrating that only a small number of very young elements in the mouse could not be mapped back to the reference with high confidence. Exploring the relationship between the expression of individual elements and putative regulators revealed large heterogeneity, with TEs within a class showing different patterns of correlation and suggesting distinct regulatory mechanisms.
- Published
- 2020
19. starmapVR: immersive visualisation of single cell spatial omic data
- Author
-
Joshua W. K. Ho, Dickson M. D. Siu, Yu Yao, Crystal S. M. Kwok, Xiunan Fang, Andrian Yang, Kevin K. Tsia, Yongyan Xia, Jianfu Li, and Michelle C. K. Lo
- Subjects
Spatial contextual awareness ,Human–computer interaction ,Computer science ,Scalability ,Context (language use) ,Google Cardboard ,Virtual reality ,Visualization - Abstract
Motivation: Advances in high throughput single-cell and spatial omic technologies have enabled the profiling of molecular expression and phenotypic properties of hundreds of thousands of individual cells in the context of their two dimensional (2D) or three dimensional (3D) spatial endogenous arrangement. However, current visualisation techniques do not allow for effective display and exploration of the single cell data in their spatial context. With the widespread availability of low-cost virtual reality (VR) gadgets, such as Google Cardboard, we propose that an immersive visualisation strategy is useful. Results: We present starmapVR, a light-weight, cross-platform, web-based tool for visualising single-cell and spatial omic data. starmapVR supports a number of interaction methods, such as keyboard, mouse, wireless controller and voice control. The tool visualises single cells in a 3D space and each cell can be represented by a star plot (for molecular expression, phenotypic properties) or image (for single cell imaging). For spatial transcriptomic data, the 2D single cell expression data can be visualised alongside the histological image in a 2.5D format. The application of starmapVR is demonstrated through a series of case studies. Its scalability has been carefully evaluated across different platforms. Availability and implementation: starmapVR is freely accessible at https://holab-hku.github.io/starmapVR, with the corresponding source code available at https://github.com/holab-hku/starmapVR under the open source MIT license. Supplementary Information: Supplementary data are available at Bioinformatics online.
- Published
- 2020
20. iSyTE 2.0: a database for expression-based gene discovery in the eye
- Author
-
Atul Kakrana, S. Deepthi Ramachandruni, Deepti Anand, Djordje Djordjevic, Joshua W. K. Ho, Andrian Yang, Abhyudai Singh, Hongzhan Huang, and Salil A. Lachke
- Subjects
0301 basic medicine ,Candidate gene ,Gene regulatory network ,Datasets as Topic ,Gene Expression ,Genome-wide association study ,Biology ,computer.software_genre ,Cataract ,03 medical and health sciences ,Mice ,User-Computer Interface ,Databases, Genetic ,Lens, Crystalline ,Genetics ,medicine ,Database Issue ,Animals ,Humans ,Gene Regulatory Networks ,Eye Proteins ,Genetic Association Studies ,Oligonucleotide Array Sequence Analysis ,Database ,Gene Expression Profiling ,Mice, Mutant Strains ,Visualization ,Gene expression profiling ,Disease Models, Animal ,030104 developmental biology ,medicine.anatomical_structure ,Lens (anatomy) ,Eye development ,computer ,Forecasting ,Genome-Wide Association Study - Abstract
Although successful in identifying new cataract-linked genes, the previous version of the database iSyTE (integrated Systems Tool for Eye gene discovery) was based on expression information on just three mouse lens stages and was functionally limited to visualization by only UCSC-Genome Browser tracks. To increase its efficacy, here we provide an enhanced iSyTE version 2.0 (URL: http://research.bioinformatics.udel.edu/iSyTE) based on well-curated, comprehensive genome-level lens expression data as a one-stop portal for the effective visualization and analysis of candidate genes in lens development and disease. iSyTE 2.0 includes all publicly available lens Affymetrix and Illumina microarray datasets representing a broad range of embryonic and postnatal stages from wild-type and specific gene-perturbation mouse mutants with eye defects. Further, we developed a new user-friendly web interface for direct access and cogent visualization of the curated expression data, which supports convenient searches and a range of downstream analyses. The utility of these new iSyTE 2.0 features is illustrated through examples of established genes associated with lens development and pathobiology, which serve as tutorials for its application by the end-user. iSyTE 2.0 will facilitate the prioritization of eye development and disease-linked candidate genes in studies involving transcriptomics or next-generation sequencing data, linkage analysis and GWAS approaches.
- Published
- 2017
21. Integrative analysis identifies co-dependent gene expression regulation of BRG1 and CHD7 at distal regulatory sites in embryonic stem cells
- Author
-
Andrew J. Oldfield, Taiyun Kim, Pengyi Yang, Jean Yee Hwa Yang, Andrian Yang, and Joshua W. K. Ho
- Subjects
0301 basic medicine ,Statistics and Probability ,Chromatin Immunoprecipitation ,Computational biology ,Biochemistry ,Cell Line ,Mice ,03 medical and health sciences ,Animals ,Binding site ,Discovery Notes ,Molecular Biology ,Gene ,Transcription factor ,Embryonic Stem Cells ,Regulation of gene expression ,Genetics ,biology ,DNA Helicases ,Gene Expression Regulation, Developmental ,Nuclear Proteins ,Promoter ,Sequence Analysis, DNA ,Computer Science Applications ,Chromatin ,DNA-Binding Proteins ,Computational Mathematics ,030104 developmental biology ,Histone ,Computational Theory and Mathematics ,biology.protein ,Chromatin immunoprecipitation ,Software ,Transcription Factors - Abstract
Motivation: DNA binding proteins such as chromatin remodellers, transcription factors (TFs), histone modifiers and co-factors often bind cooperatively to activate or repress their target genes in a cell type-specific manner. Nonetheless, the precise role of cooperative binding in defining cell-type identity is still largely uncharacterized. Results: Here, we collected and analyzed 214 public datasets representing chromatin immunoprecipitation followed by sequencing (ChIP-Seq) of 104 DNA binding proteins in embryonic stem cell (ESC) lines. We classified their binding sites into those proximal to gene promoters and those in distal regions, and developed a web resource called Proximal And Distal (PAD) clustering to identify their co-localization at these respective regions. Using this extensive dataset, we discovered an extensive co-localization of BRG1 and CHD7 at distal but not proximal regions. The comparison of co-localization sites to those bound by either BRG1 or CHD7 alone showed an enrichment of ESC master TFs binding and active chromatin architecture at co-localization sites. Most notably, our analysis reveals the co-dependency of BRG1 and CHD7 at distal regions on regulating expression of their common target genes in ESC. This work sheds light on cooperative binding of TF binding proteins in regulating gene expression in ESC, and demonstrates the utility of integrative analysis of a manually curated compendium of genome-wide protein binding profiles in our online resource PAD. Availability and Implementation: PAD is freely available at http://pad.victorchang.edu.au/ and its source code is available via an open source GPL 3.0 license at https://github.com/VCCRI/PAD/. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2017
22. Scalability and Validation of Big Data Bioinformatics Software
- Author
-
Michael Troup, Joshua W. K. Ho, and Andrian Yang
- Subjects
0301 basic medicine ,Distributed Computing Environment ,Computer science ,business.industry ,lcsh:Biotechnology ,Big data ,Biophysics ,Scalability testing ,Cloud computing ,Biochemistry ,Data science ,Computer Science Applications ,03 medical and health sciences ,030104 developmental biology ,Software ,Structural Biology ,lcsh:TP248.13-248.65 ,Scalability ,Genetics ,Software verification and validation ,Metamorphic testing ,Short Survey ,business ,Biotechnology - Abstract
This review examines two important aspects that are central to modern big data bioinformatics analysis - software scalability and validity. We argue that not only are the issues of scalability and validation common to all big data bioinformatics analyses, they can be tackled by conceptually related methodological approaches, namely divide-and-conquer (scalability) and multiple executions (validation). Scalability is defined as the ability for a program to scale based on workload. It has always been an important consideration when developing bioinformatics algorithms and programs. Nonetheless the surge of volume and variety of biological and biomedical data has posed new challenges. We discuss how modern cloud computing and big data programming frameworks such as MapReduce and Spark are being used to effectively implement divide-and-conquer in a distributed computing environment. Validation of software is another important issue in big data bioinformatics that is often ignored. Software validation is the process of determining whether the program under test fulfils the task for which it was designed. Determining the correctness of the computational output of big data bioinformatics software is especially difficult due to the large input space and complex algorithms involved. We discuss how state-of-the-art software testing techniques that are based on the idea of multiple executions, such as metamorphic testing, can be used to implement an effective bioinformatics quality assurance strategy. We hope this review will raise awareness of these critical issues in bioinformatics.
- Published
- 2017
23. Scavenger: A pipeline for recovery of unaligned reads utilising similarity with aligned reads
- Author
-
Joshua W. K. Ho, Andrian Yang, Joshua Y. S. Tang, and Michael Troup
- Subjects
Read alignment ,Computer science ,Pseudogene ,Pipeline (computing) ,0206 medical engineering ,Read recovery ,02 engineering and technology ,Computational biology ,Genome ,General Biochemistry, Genetics and Molecular Biology ,03 medical and health sciences ,0302 clinical medicine ,Similarity (network science) ,Differential expression ,General Pharmacology, Toxicology and Pharmaceutics ,030304 developmental biology ,0303 health sciences ,General Immunology and Microbiology ,Software Tool Article ,Genetic variants ,General Medicine ,Articles ,030220 oncology & carcinogenesis ,RNA-seq ,Unaligned read ,020602 bioinformatics ,Reference genome - Abstract
Motivation: Read alignment is an important step in RNA-seq analysis as the result of alignment forms the basis for further downstream analyses. However, recent studies have shown that published alignment tools have variable mapping sensitivity and do not necessarily align reads which should have been aligned, a problem we termed as the false-negative non-alignment problem. Results: We have developed Scavenger, a pipeline for recovering unaligned reads using a novel mechanism which utilises information from aligned reads. Scavenger performs recovery of unaligned reads by re-aligning unaligned reads against a putative location derived from aligned reads with sequence similarity against unaligned reads. We show that Scavenger can successfully recover unaligned reads in both simulated and real RNA-seq datasets, including single-cell RNA-seq data. The reads recovered contain more genetic variants compared to previously aligned reads, indicating that divergence between personal and reference genomes plays a role in the false-negative non-alignment problem. We also explored the impact of read recovery on downstream analyses, in particular gene expression analysis, and showed that Scavenger is able to both recover genes which were previously non-expressed and also increase gene expression, with lowly expressed genes having the most impact from the addition of recovered reads. We also found that the majority of genes with >1 fold change in expression after recovery are categorised as pseudogenes, indicating that pseudogene expression can be affected by the false-negative non-alignment problem. Scavenger helps to solve the false-negative non-alignment problem through recovery of unaligned reads using information from previously aligned reads. Availability: Scavenger is available via an open source license at https://github.com/VCCRI/Scavenger/ Contact: j.ho@victorchang.edu.au
- Published
- 2019
24. Harnessing Multiple Source Test Cases in Metamorphic Testing: A Case Study in Bioinformatics
- Author
-
Joshua Y.S. Tang, Andrian Yang, Tsong Yueh Chen, and Joshua W.K. Ho
- Published
- 2017
25. Falco: A quick and flexible single-cell RNA-seq processing framework on the cloud
- Author
-
Joshua W. K. Ho, Andrian Yang, Peijie Lin, and Michael Troup
- Subjects
0301 basic medicine ,Statistics and Probability ,Computer science ,Distributed computing ,Big data ,Gene Expression ,Cloud computing ,02 engineering and technology ,computer.software_genre ,Biochemistry ,Reduction (complexity) ,Mice ,03 medical and health sciences ,Software ,020204 information systems ,Spark (mathematics) ,0202 electrical engineering, electronic engineering, information engineering ,Animals ,Humans ,Gene Regulatory Networks ,Molecular Biology ,Massively parallel ,030304 developmental biology ,0303 health sciences ,Database ,Sequence Analysis, RNA ,business.industry ,Gene Expression Profiling ,Process (computing) ,Computational Biology ,Dendritic Cells ,Pipeline (software) ,Computer Science Applications ,Computational Mathematics ,TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES ,030104 developmental biology ,Computational Theory and Mathematics ,Scalability ,RNA ,Data mining ,Single-Cell Analysis ,business ,computer ,Algorithms - Abstract
Summary: Single-cell RNA-seq (scRNA-seq) is increasingly used in a range of biomedical studies. Nonetheless, current RNA-seq analysis tools are not specifically designed to efficiently process scRNA-seq data due to their limited scalability. Here we introduce Falco, a cloud-based framework to enable parallelisation of existing RNA-seq processing pipelines using big data technologies of Apache Hadoop and Apache Spark for performing massively parallel analysis of large scale transcriptomic data. Using two public scRNA-seq data sets and two popular RNA-seq alignment/feature quantification pipelines, we show that the same processing pipeline runs 2.6–145.4 times faster using Falco than running on a highly optimised single node analysis. Falco also allows users to utilise low-cost spot instances of Amazon Web Services (AWS), providing a 65% reduction in cost of analysis. Availability: Falco is available via a GNU General Public License at https://github.com/VCCRI/Falco/ Contact: j.ho@victorchang.edu.au Supplementary information: Supplementary data are available at bioRxiv online.
- Published
- 2016
- Full Text
- View/download PDF
26. A cloud-based framework for applying metamorphic testing to a bioinformatics pipeline
- Author
-
Eleni Giannoulatou, Tsong Yueh Chen, Andrian Yang, Amir Hossein Kamali, Joshua W. K. Ho, and Michael Troup
- Subjects
0301 basic medicine ,Computer science ,business.industry ,020207 software engineering ,Genomics ,Cloud computing ,02 engineering and technology ,computer.software_genre ,Pipeline (software) ,Oracle ,Pipeline transport ,03 medical and health sciences ,030104 developmental biology ,Bioinformatics software ,0202 electrical engineering, electronic engineering, information engineering ,Data mining ,Metamorphic testing ,Software engineering ,business ,computer ,Testing software - Abstract
Testing of bioinformatics software often suffers from the oracle problem, especially when testing software that analyses human genome sequencing data. Metamorphic testing has been proposed to alleviate the oracle problem. Nonetheless, smaller research or clinical centres may be challenged by the complexity and resources required to implement a suitable metamorphic testing framework in practice. This paper presents a case study on how a cloud-based metamorphic testing framework can be applied to a widely used genomic sequencing pipeline, and discusses the future of implementing large-scale on-demand automated metamorphic testing using cloud-based resources.
- Published
- 2016
27. PBrowse: a web-based platform for real-time collaborative exploration of genomic data
- Author
-
Uwe Röhm, Xin Wang, Joshua W. K. Ho, Andrian Yang, Chirag Parsania, Peter S Szot, and Koon Ho Wong
- Subjects
0301 basic medicine ,Source code ,Computer science ,media_common.quotation_subject ,Data management ,0206 medical engineering ,Information Storage and Retrieval ,02 engineering and technology ,Genome browser ,Biology ,Web Browser ,Bioinformatics ,ENCODE ,Genome ,World Wide Web ,03 medical and health sciences ,Annotation ,Collaborative editing ,User-Computer Interface ,Data visualization ,File sharing ,Databases, Genetic ,Genetics ,Web application ,Humans ,Cooperative Behavior ,GeneralLiterature_REFERENCE(e.g.,dictionaries,encyclopedias,glossaries) ,media_common ,030304 developmental biology ,0303 health sciences ,Internet ,business.industry ,Genome, Human ,Gene Annotation ,Visualization ,030104 developmental biology ,Methods Online ,The Internet ,business ,020602 bioinformatics - Abstract
Summary: The central task of a genome browser is to enable easy visual exploration of large genomic data to gain biological insight. Most existing genome browsers were designed for data exploration by individual users, while a few allow some limited forms of collaboration among multiple users, such as file sharing and wiki-style collaborative editing of gene annotations. Our work’s premise is that allowing sharing of genome browser views instantaneously in real-time enables the exchange of ideas and insight in a collaborative project, thus harnessing the wisdom of the crowd. PBrowse is a parallel-access real-time collaborative web-based genome browser that provides both an integrated, real-time collaborative platform and a comprehensive file sharing system. PBrowse also allows real-time track comments and has integrated group chat to facilitate interactive discussion among multiple users. Through the Distributed Annotation Server protocol, PBrowse can easily access a wide range of publicly available genomic data, such as the ENCODE data sets. We argue that PBrowse, with the re-designed user management, data management and novel collaborative layer based on Biodalliance, represents a paradigm shift from seeing a genome browser merely as a tool of data visualisation to a tool that enables real-time human-human interaction and knowledge exchange in a collaborative setting. Availability: PBrowse is available at http://pbrowse.victorchang.edu.au, and its source code is available via the open source BSD 3 license at http://github.com/VCCRI/PBrowse. Contact: j.ho@victorchang.edu.au Supplementary Information: A supplementary video demonstrating the collaborative features of PBrowse is available at https://www.youtube.com/watch?v=ROvKXZoXiIc.
- Published
- 2016
28. Telomerase Reverse Transcriptase Over-Expression Enhances Human Cardiac Progenitor Cell Cardiac Regeneration after Myocardial Infarction
- Author
-
Andrian Yang, Joshua W. K. Ho, Melad Farraha, C.G. dos Remedios, Sile F. Yang, Hilda A. Pickett, James J.H. Chong, Sujitha Thavapalachandran, L. Le, and Eddy Kizana
- Subjects
Pulmonary and Respiratory Medicine ,Cardiac regeneration ,business.industry ,medicine ,Cancer research ,Over expression ,Cardiac Progenitor Cell ,Telomerase reverse transcriptase ,Myocardial infarction ,Cardiology and Cardiovascular Medicine ,medicine.disease ,business
- Published
- 2017
29. Proteomic Characterisation of Extracellular Vesicles Derived From Human Cardiac Progenitor Cells
- Author
-
Eddy Kizana, Joshua W. K. Ho, L. Le, R. Bao, Andrian Yang, James J.H. Chong, and C.G. dos Remedios
- Subjects
Pulmonary and Respiratory Medicine ,Cardiac progenitors ,business.industry ,Medicine ,Cardiology and Cardiovascular Medicine ,business ,Extracellular vesicles ,Cell biology
- Published
- 2018
30. Decoding the complex genetic causes of heart diseases using systems biology
- Author
-
Vinita Deshpande, Joshua W. K. Ho, Andrian Yang, David T. Humphreys, Tomasz Szczesnik, Eleni Giannoulatou, and Djordje Djordjevic
- Subjects
Whole genome sequencing ,Genetics ,Systems biology ,Biophysics ,Genomics ,Computational biology ,Disease ,Review ,Biology ,Structural Biology ,Epigenetics ,Molecular Biology ,Exome sequencing ,Epigenomics ,Genetic association - Abstract
The pace of disease gene discovery is still much slower than expected, even with the use of cost-effective DNA sequencing and genotyping technologies. It is increasingly clear that many inherited heart diseases have a more complex polygenic aetiology than previously thought. Understanding the role of gene–gene interactions, epigenetics, and non-coding regulatory regions is becoming increasingly critical in predicting the functional consequences of genetic mutations identified by genome-wide association studies and whole-genome or exome sequencing. A systems biology approach is now being widely employed to systematically discover genes that are involved in heart diseases in humans or relevant animal models through bioinformatics. The overarching premise is that the integration of high-quality causal gene regulatory networks (GRNs), genomics, epigenomics, transcriptomics and other genome-wide data will greatly accelerate the discovery of the complex genetic causes of congenital and complex heart diseases. This review summarises state-of-the-art genomic and bioinformatics techniques that are used in accelerating the pace of disease gene discovery in heart diseases. Accompanying this review, we provide an interactive web-resource for systems biology analysis of mammalian heart development and diseases, CardiacCode ( http://CardiacCode.victorchang.edu.au/ ). CardiacCode features a dataset of over 700 pieces of manually curated genetic or molecular perturbation data, which enables the inference of a cardiac-specific GRN of 280 regulatory relationships between 33 regulator genes and 129 target genes. We believe this growing resource will fill an urgent unmet need to fully realise the true potential of predictive and personalised genomic medicine in tackling human heart disease.
- Published
- 2014
31. A cloud-based framework for applying metamorphic testing to a bioinformatics pipeline.
- Author
-
Troup, Michael, Andrian Yang, Kamali, Amir Hossein, Giannoulatou, Eleni, Tsong Yueh Chen, and Ho, Joshua W. K.
- Published
- 2016
- Full Text
- View/download PDF
32. PBrowse: a web-based platform for real-time collaborative exploration of genomic data.
- Author
-
Szot, Peter S., Andrian Yang, Xin Wang, Parsania, Chirag, Röhm, Uwe, Koon Ho Wong, and Ho, Joshua W. K.
- Published
- 2017
- Full Text
- View/download PDF
33. Falco: a quick and flexible single-cell RNA-seq processing framework on the cloud.
- Author
-
Andrian Yang, Troup, Michael, Peijie Lin, and Ho, Joshua W. K.
- Subjects
*RNA sequencing , *MEDICAL research , *CLOUD computing - Abstract
Summary: Single-cell RNA-seq (scRNA-seq) is increasingly used in a range of biomedical studies. Nonetheless, current RNA-seq analysis tools are not specifically designed to efficiently process scRNA-seq data due to their limited scalability. Here we introduce Falco, a cloud-based framework to enable parallelization of existing RNA-seq processing pipelines using big data technologies of Apache Hadoop and Apache Spark for performing massively parallel analysis of large scale transcriptomic data. Using two public scRNA-seq datasets and two popular RNA-seq alignment/feature quantification pipelines, we show that the same processing pipeline runs 2.6–145.4 times faster using Falco than running on a highly optimized standalone computer. Falco also allows users to utilize low-cost spot instances of Amazon Web Services, providing a ~65% reduction in cost of analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF