21 results on '"Amin Allahyar"'
Search Results
2. A data-driven interactome of synergistic genes improves network-based cancer outcome prediction.
- Author
-
Amin Allahyar, Joske Ubels, and Jeroen de Ridder
- Subjects
Biology (General) ,QH301-705.5
- Abstract
Robustly predicting outcome for cancer patients from gene expression is an important challenge on the road to better personalized treatment. Network-based outcome predictors (NOPs), which consider the cellular wiring diagram in the classification, hold much promise to improve performance, stability and interpretability of identified marker genes. Problematically, reports on the efficacy of NOPs are conflicting and for instance suggest that utilizing random networks performs on par with networks that describe biologically relevant interactions. In this paper we turn the prediction problem around: instead of using a given biological network in the NOP, we aim to identify the network of genes that truly improves outcome prediction. To this end, we propose SyNet, a gene network constructed ab initio from synergistic gene pairs derived from survival-labelled gene expression data. To obtain SyNet, we evaluate synergy for all 69 million pairwise combinations of genes, resulting in a network that is specific to the dataset and phenotype under study and can be used in a NOP model. We evaluated SyNet and 11 other networks on a compendium dataset of >4000 survival-labelled breast cancer samples. For this purpose, we used cross-study validation, which more closely emulates real-world application of these outcome predictors. We find that SyNet is the only network that truly improves performance, stability and interpretability in several existing NOPs. We show that SyNet overlaps significantly with existing gene networks, and can be confidently predicted (~85% AUC) from graph-topological descriptions of these networks, in particular the breast tissue-specific network. Due to its data-driven nature, SyNet is not biased to well-studied genes and thus facilitates post-hoc interpretation. We find that SyNet is highly enriched for known breast cancer genes and genes related to e.g. histological grade and tamoxifen resistance, suggestive of a role in determining breast cancer outcome.
- Published
- 2019
- Full Text
- View/download PDF
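As a rough sanity check on the scale quoted in the abstract of result 2, the number of unordered gene pairs among n genes is n(n-1)/2. The sketch below inverts that formula for the rounded figure of 69 million pairs. The function names are illustrative, and the resulting gene count is an estimate inferred from the rounded number, not a value stated in this record.

```python
import math


def pair_count(n_genes: int) -> int:
    """Number of unordered gene pairs among n_genes genes: n(n-1)/2."""
    return math.comb(n_genes, 2)


def genes_for_pairs(n_pairs: float) -> int:
    """Smallest gene count whose pair count reaches n_pairs."""
    # Closed-form solution of n(n-1)/2 = n_pairs, rounded up.
    n = math.ceil((1 + math.sqrt(1 + 8 * n_pairs)) / 2)
    # Guard against floating-point rounding at the boundary.
    while pair_count(n) < n_pairs:
        n += 1
    while n > 2 and pair_count(n - 1) >= n_pairs:
        n -= 1
    return n


# Assumption: "69 million" is a rounded figure, so this is only an estimate.
approx_genes = genes_for_pairs(69e6)
print(approx_genes, pair_count(approx_genes))
```

Under that assumption, the exhaustive pairwise evaluation corresponds to a gene set on the order of twelve thousand genes.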
3. Gamma-Retrovirus Integration Marks Cell Type-Specific Cancer Genes: A Novel Profiling Tool in Cancer Genomics.
- Author
-
Kathryn L Gilroy, Anne Terry, Asif Naseer, Jeroen de Ridder, Amin Allahyar, Weiwei Wang, Eric Carpenter, Andrew Mason, Gane K-S Wong, Ewan R Cameron, Anna Kilbey, and James C Neil
- Subjects
Medicine ,Science
- Abstract
Retroviruses have been foundational in cancer research since early studies identified proto-oncogenes as targets for insertional mutagenesis. Integration of murine gamma-retroviruses into the host genome favours promoters and enhancers and entails interaction of viral integrase with host BET/bromodomain factors. We report that this integration pattern is conserved in feline leukaemia virus (FeLV), a gamma-retrovirus that infects many human cell types. Analysis of FeLV insertion sites in the MCF-7 mammary carcinoma cell line revealed strong bias towards active chromatin marks with no evidence of significant post-integration growth selection. The most prominent FeLV integration targets had little overlap with the most abundantly expressed transcripts, but were strongly enriched for annotated cancer genes. A meta-analysis based on several gamma-retrovirus integration profiling (GRIP) studies in human cells (CD34+, K562, HepG2) revealed a similar cancer gene bias but also remarkable cell-type specificity, with prominent exceptions including a universal integration hotspot at the long non-coding RNA MALAT1. Comparison of GRIP targets with databases of super-enhancers from the same cell lines showed that these have only limited overlap and that GRIP provides unique insights into the upstream drivers of cell growth. These observations elucidate the oncogenic potency of the gamma-retroviruses and support the wider application of GRIP to identify the genes and growth regulatory circuits that drive distinct cancer types.
- Published
- 2016
- Full Text
- View/download PDF
4. Targeted cohesin loading characterizes the entry and exit sites of loop extrusion trajectories
- Author
-
Ruiqi Han, Yike Huang, Iwan Vaandrager, Amin Allahyar, Mikhail Magnitov, Marjon J.A.M. Verstegen, Elzo de Wit, Peter H.L. Krijger, and Wouter de Laat
- Abstract
The cohesin complex (SMC1-SMC3-RAD21) shapes chromosomes by DNA loop extrusion, but individual extrusion trajectories were so far unappreciable in vivo. Here, we site-specifically induced dozens of extrusion trajectories anchored at artificial loading sites in living cells. Extruding cohesin transports loading proteins MAU2-NIPBL over megabase DNA distances to blocking CTCF sites that then loop back to the loading sequences, showing that CTCF-CTCF interactions are unnecessary for stabilized contacts between loop extrusion obstacles. When stalled, cohesin can block other extruding cohesin from either direction. Without RAD21, MAU2-NIPBL exclusively accumulate at loading sites, here genome-wide defined as enhancers. SMC1 now also selectively accumulates here, suggesting that cohesin may load modularly on chromatin. Genes inside high cohesin extrusion trajectories are collectively hindered in transcription. This work characterizes the impact, entry and exit sites of individual cohesin loop extrusion trajectories.
- Published
- 2023
5. FERAL: network-based classifier with application to breast cancer outcome prediction.
- Author
-
Amin Allahyar and Jeroen de Ridder
- Published
- 2015
- Full Text
- View/download PDF
6. Constrained Semi-Supervised Growing Self-Organizing Map.
- Author
-
Amin Allahyar, Hadi Sadoghi Yazdi, and Ahad Harati
- Published
- 2015
- Full Text
- View/download PDF
7. Fast Feature Reduction in intrusion detection datasets.
- Author
-
Shafigh Parsazad, Ehsan Saboori, and Amin Allahyar
- Published
- 2012
8. Online discriminative component analysis feature extraction from stream data with domain knowledge.
- Author
-
Amin Allahyar and Hadi Sadoghi Yazdi
- Published
- 2014
- Full Text
- View/download PDF
9. Robust detection of translocations in lymphoma FFPE samples using targeted locus capture-based sequencing
- Author
-
Joost Vermaat, Tom van Wezel, Paula J P de Vree, Wouter de Laat, Bauke Ylstra, Amin Allahyar, Marieke Simonis, Harma Feitsma, Adrien S. J. Melquiond, Max van Min, Agata Rakszewska, Erik Splinter, Daphne de Jong, Joost Swennenhuis, Milan Sharma, Mehmet Yilmaz, Arjan Diepstra, Roos J Leguit, Robert van der Geize, Phylicia Stathi, Karima Hajo, Nathalie J. Hijmering, Mark Pieterse, Marjon J.A.M. Verstegen, Peter H.L. Krijger, Ruud W J Meijers, G Tjitske Los-de Vries, Léon C van Kempen, Arjen H.G. Cleven, Pathology, VU University medical center, CCA - Imaging and biomarkers, Hubrecht Institute for Developmental Biology and Stem Cell Research, and Stem Cell Aging Leukemia and Lymphoma (SALL)
- Subjects
0301 basic medicine ,Tissue Fixation ,Lymphoma ,Non-Hodgkin/diagnosis ,Genes, myc ,General Physics and Astronomy ,Chromosomal translocation ,MYC ,Translocation, Genetic ,0302 clinical medicine ,Cancer genomics ,bcl-2/genetics ,B-Cell/diagnosis ,B-cell lymphoma ,In Situ Hybridization ,In Situ Hybridization, Fluorescence ,Gene Rearrangement ,High-Throughput Nucleotide Sequencing/methods ,Multidisciplinary ,Paraffin Embedding ,medicine.diagnostic_test ,Genes, bcl-2/genetics ,Lymphoma, Non-Hodgkin ,REARRANGEMENTS ,In Situ Hybridization, Fluorescence/methods ,High-Throughput Nucleotide Sequencing ,Proto-Oncogene Proteins c-bcl-6/genetics ,030220 oncology & carcinogenesis ,Proto-Oncogene Proteins c-bcl-6 ,Biomedical engineering ,EXPRESSION ,Lymphoma, B-Cell ,Lymphoma, Non-Hodgkin/diagnosis ,Paraffin Embedding/methods ,Science ,Translocation ,Locus (genetics) ,Computational biology ,Biology ,Fluorescence/methods ,Sensitivity and Specificity ,General Biochemistry, Genetics and Molecular Biology ,Article ,03 medical and health sciences ,Genetic ,medicine ,Humans ,Genes, myc/genetics ,Retrospective Studies ,business.industry ,Lymphoma, B-Cell/diagnosis ,Cancer ,B-CELL LYMPHOMA ,Computational Biology ,Reproducibility of Results ,General Chemistry ,Gene rearrangement ,Computational Biology/methods ,medicine.disease ,Personalized medicine ,Genes, bcl-2 ,030104 developmental biology ,Genes ,myc/genetics ,Tissue Fixation/methods ,business ,Fluorescence in situ hybridization
- Abstract
In routine diagnostic pathology, cancer biopsies are preserved by formalin-fixed, paraffin-embedding (FFPE) procedures for examination of (intra-) cellular morphology. Such procedures inadvertently induce DNA fragmentation, which compromises sequencing-based analyses of chromosomal rearrangements. Yet, rearrangements drive many types of hematolymphoid malignancies and solid tumors, and their manifestation is instructive for diagnosis, prognosis, and treatment. Here, we present FFPE-targeted locus capture (FFPE-TLC) for targeted sequencing of proximity-ligation products formed in FFPE tissue blocks, and PLIER, a computational framework that allows automated identification and characterization of rearrangements involving selected, clinically relevant, loci. FFPE-TLC, blindly applied to 149 lymphoma and control FFPE samples, identifies the known and previously uncharacterized rearrangement partners. It outperforms fluorescence in situ hybridization (FISH) in sensitivity and specificity, and shows clear advantages over standard capture-NGS methods, finding rearrangements involving repetitive sequences which they typically miss. FFPE-TLC is therefore a powerful clinical diagnostics tool for accurate targeted rearrangement detection in FFPE specimens.
Preservation of cancer biopsies by FFPE introduces DNA fragmentation, hindering analysis of rearrangements. Here the authors introduce FFPE Targeted Locus Capture for identification of translocations in preserved samples.
- Published
- 2021
10. Interplay between CTCF boundaries and a super enhancer controls cohesin extrusion trajectories and gene expression
- Author
-
Anna-Karina Felder, Floor van der Vegt, Christian Valdes-Quezada, Wouter de Laat, Esther C.H. Uijttewaal, Amin Allahyar, Peter H.L. Krijger, E.S. Vos, Yike Huang, Marjon J.A.M. Verstegen, and Hubrecht Institute for Developmental Biology and Stem Cell Research
- Subjects
CCCTC-Binding Factor ,Enhancer Elements ,Chromosomal Proteins, Non-Histone ,Transcription Factors/genetics ,Gene Expression ,Cell Cycle Proteins ,Biology ,CCCTC-Binding Factor/genetics ,Promoter Regions ,Chromosomal Proteins, Non-Histone/genetics ,03 medical and health sciences ,Nuclear Receptor Coactivator 2 ,Mice ,0302 clinical medicine ,Genetic ,Transcription (biology) ,Animals ,Enhancer ,Promoter Regions, Genetic ,Molecular Biology ,030304 developmental biology ,Nuclear Receptor Coactivator 2/genetics ,Regulation of gene expression ,Cohesin loading ,Cell Cycle Proteins/genetics ,0303 health sciences ,Cohesin ,Chromatin/genetics ,RNA-Binding Proteins ,Promoter ,Mouse Embryonic Stem Cells ,Cell Biology ,RNA-Binding Proteins/genetics ,Chromatin ,Cell biology ,DNA-Binding Proteins ,Chromosomal Proteins ,Enhancer Elements, Genetic ,CTCF ,Non-Histone/genetics ,030217 neurology & neurosurgery ,DNA-Binding Proteins/genetics ,Transcription Factors
- Abstract
To understand how chromatin domains coordinate gene expression, we dissected select genetic elements organizing topology and transcription around the Prdm14 super enhancer in mouse embryonic stem cells. Taking advantage of allelic polymorphisms, we developed methods to sensitively analyze changes in chromatin topology, gene expression, and protein recruitment. We show that enhancer insulation does not rely strictly on loop formation between its flanking boundaries, that the enhancer activates the Slco5a1 gene beyond its prominent domain boundary, and that it recruits cohesin for loop extrusion. Upon boundary inversion, we find that oppositely oriented CTCF terminates extrusion trajectories but does not stall cohesin, while deleted or mutated CTCF sites allow cohesin to extend its trajectory. Enhancer-mediated gene activation occurs independent of paused loop extrusion near the gene promoter. We expand upon the loop extrusion model to propose that cohesin loading and extrusion trajectories originating at an enhancer contribute to gene activation.
- Published
- 2021
11. Fast Feature Reduction in intrusion detection datasets
- Author
-
Shafigh Parsazad, Ehsan Saboori, and Amin Allahyar
- Published
- 2013
12. Data Selection for Semi-Supervised Learning
- Author
-
Shafigh Parsazad, Ehsan Saboori, and Amin Allahyar
- Published
- 2012
13. Multi-contact 4C: long-molecule sequencing of complex proximity ligation products to uncover local cooperative and competitive chromatin topologies
- Author
-
Marjon J.A.M. Verstegen, Jeroen de Ridder, Ivo Renkens, Carlo Vermeulen, Wigard P. Kloosterman, Wouter de Laat, Britta A M Bouwman, Roy Straver, Peter H.L. Krijger, Christian Valdes-Quezada, Geert Geeven, Amin Allahyar, and Hubrecht Institute for Developmental Biology and Stem Cell Research
- Subjects
Sequence analysis ,Concatemer ,Computer science ,Molecular Conformation ,Computational biology ,Biochemistry ,Chromatin structure ,General Biochemistry, Genetics and Molecular Biology ,DNA sequencing ,Chromosome conformation capture ,03 medical and health sciences ,chemistry.chemical_compound ,0302 clinical medicine ,Journal Article ,Humans ,030304 developmental biology ,0303 health sciences ,Biochemistry, Genetics and Molecular Biology(all) ,Inverse polymerase chain reaction ,Sequence Analysis, DNA ,Chromatin ,DNA/methods ,Data processing ,chemistry ,Sequence Analysis, DNA/methods ,Next-generation sequencing ,Nanopore sequencing ,Chromatin/chemistry ,K562 Cells ,Sequence Analysis ,030217 neurology & neurosurgery ,DNA ,Genetics and Molecular Biology(all)
- Abstract
We present the experimental protocol and data analysis toolbox for multi-contact 4C (MC-4C), a new proximity ligation method tailored to study the higher-order chromatin contact patterns of selected genomic sites. Conventional chromatin conformation capture (3C) methods fragment proximity ligation products for efficient analysis of pairwise DNA contacts. By contrast, MC-4C is designed to preserve and collect large concatemers of proximity ligated fragments for long-molecule sequencing on an Oxford Nanopore or Pacific Biosciences platform. Each concatemer of proximity ligation products represents a snapshot topology of a different individual allele, revealing its multi-way chromatin interactions. By inverse PCR with primers specific for a fragment of interest (the viewpoint) and DNA size selection, sequencing is selectively targeted to thousands of different complex interactions containing this viewpoint. A tailored statistical analysis toolbox is able to generate background models and three-way interaction profiles from the same dataset. These profiles can be used to distinguish whether contacts between more than two regulatory sequences are mutually exclusive or, conversely, simultaneously occurring at chromatin hubs. The entire procedure can be completed in 2 weeks, and requires standard molecular biology and data analysis skills and equipment, plus access to a third-generation sequencing platform.
- Published
- 2020
14. A lentiviral vector‐based insertional mutagenesis screen identifies mechanisms of resistance to MAPK inhibitors in melanoma
- Author
-
Ultan McDermott, Vivek Iyer, Marco Ranzani, David J. Adams, Stacey Price, Gemma Turner, Alistair G. Rust, Jeroen de Ridder, Constantine Alifrangis, Nicola A. Thompson, Peter R. Ellis, and Amin Allahyar
- Subjects
Genetic Vectors ,Dermatology ,Biology ,General Biochemistry, Genetics and Molecular Biology ,Viral vector ,Insertional mutagenesis ,03 medical and health sciences ,0302 clinical medicine ,Cell Line, Tumor ,medicine ,Humans ,Genetic Testing ,Melanoma ,Protein Kinase Inhibitors ,030304 developmental biology ,0303 health sciences ,Lentivirus ,medicine.disease ,Mutagenesis, Insertional ,Oncology ,MAPK Inhibitors ,Drug Resistance, Neoplasm ,030220 oncology & carcinogenesis ,Cancer research ,Mitogen-Activated Protein Kinases
- Published
- 2018
15. Enhancer hubs and loop collisions identified from single-allele topologies
- Author
-
Carlo Vermeulen, Marjon J.A.M. Verstegen, Wigard P. Kloosterman, Mark Pieterse, Hans Teunissen, Britta A M Bouwman, Judith H.I. Haarhuis, Amin Allahyar, Peter H.L. Krijger, Elzo de Wit, Jeroen de Ridder, Ivo Renkens, Roy Straver, Benjamin D. Rowland, Kees Jalink, Wouter de Laat, Geert Geeven, Melissa van Kranenburg, and Hubrecht Institute for Developmental Biology and Stem Cell Research
- Subjects
0301 basic medicine ,CCCTC-Binding Factor ,Cohesin ,Genomics ,beta-Globins ,Computational biology ,Regulatory Sequences, Nucleic Acid ,Biology ,Genome ,Chromatin ,Folding (chemistry) ,Mice ,03 medical and health sciences ,Enhancer Elements, Genetic ,030104 developmental biology ,Genetics ,Animals ,Nucleic Acid Conformation ,Nanopore sequencing ,Enhancer ,Gene ,Alleles
- Abstract
Chromatin folding contributes to the regulation of genomic processes such as gene activity. Existing conformation capture methods characterize genome topology through analysis of pairwise chromatin contacts in populations of cells but cannot discern whether individual interactions occur simultaneously or competitively. Here we present multi-contact 4C (MC-4C), which applies Nanopore sequencing to study multi-way DNA conformations of individual alleles. MC-4C distinguishes cooperative from random and competing interactions and identifies previously missed structures in subpopulations of cells. We show that individual elements of the β-globin superenhancer can aggregate into an enhancer hub that can simultaneously accommodate two genes. Neighboring chromatin domain loops can form rosette-like structures through collision of their CTCF-bound anchors, as seen most prominently in cells lacking the cohesin-unloading factor WAPL. Here, massive collision of CTCF-anchored chromatin loops is believed to reflect ‘cohesin traffic jams’. Single-allele topology studies thus help us understand the mechanisms underlying genome folding and functioning.
- Published
- 2018
16. Multi-Contact 4C (MC-4C): long molecule sequencing of complex proximity ligation products to uncover local cooperative and competitive chromatin topologies v1
- Author
-
Carlo Vermeulen, Amin Allahyar, Britta A.M. Bouwman, Peter H.L. Krijger, Marjon J.A.M. Verstegen, Geert Geeven, Christian Valdes-Quezada, Ivo Renkens, Roy Straver, Wigard P. Kloosterman, Jeroen de Ridder, and Wouter de Laat
- Abstract
We present the protocol and data analysis toolbox for Multi-Contact 4C (MC-4C), a new proximity ligation method tailored to study the higher-order chromatin contact patterns of selected genomic sites. Conventional chromosome conformation capture (3C) methods fragment proximity ligation products for efficient analysis of pairwise DNA contacts. In contrast, MC-4C is designed to preserve and collect large concatemers of proximity ligated fragments for long molecule sequencing on Oxford Nanopore or Pacific Biosciences platforms, thus allowing study of multi-way chromatin interactions. By inverse PCR with primers specific for a fragment of interest (the viewpoint) and DNA size selection, sequencing is selectively targeted to thousands of different complex interactions containing this viewpoint. A tailored statistical analysis toolbox employing data-intrinsic background models then discerns whether contacts between more than two regulatory sequences are mutually exclusive or, conversely, simultaneously happening at chromatin hubs. The entire procedure can be completed in two weeks and requires access to a third generation sequencing platform.
- Published
- 2019
17. Abstract PO-45: Robust detection of translocations in lymphoma FFPE samples using Targeted Locus Capture-based sequencing
- Author
-
Arjan Diepstra, Tom van Wezel, Joost Vermaat, Erik Splinter, Max van Min, Bauke Ylstra, Nathalie J. Hijmering, Daphne de Jong, Mark Pieterse, Karima Hajo, Roos J Leguit, Robert van der Geize, Wouter de Laat, Ruud W J Meijers, Léon C van Kempen, Arjen H.G. Cleven, Marieke Simonis, Tjitske Los-de Vries, Mehmet Yilmaz, Harma Feitsma, Joost Swennenhuis, and Amin Allahyar
- Subjects
medicine.diagnostic_test ,Breakpoint ,Chromosomal translocation ,Locus (genetics) ,General Medicine ,Computational biology ,Biology ,medicine.disease ,BCL6 ,Lymphoma ,Chromosome 3 ,medicine ,Gene ,Fluorescence in situ hybridization
- Abstract
Chromosomal translocations with immunoglobulin (IG) loci are the classic drivers in a large subset of B-cell lymphomas. Detection of these translocations is important for confirmation of diagnosis and for prognosis and therapy decisions. Currently, molecular diagnosis of translocations in lymphomas is not addressed well by next-generation sequencing (NGS). The standard method for detection of translocations is fluorescence in situ hybridization (FISH), which is labor intensive and can be difficult to interpret. There is a need for a robust technology that can be standardized. Targeted Locus Capture (TLC) selectively enriches and sequences entire genes based on the crosslinking of physically proximal sequences, and thereby enables complete sequencing of genes of interest, including detection of large structural variants. Because the technology is based on the crosslinking and fragmenting of DNA, it has particular advantages in the analysis of formalin-fixed, paraffin-embedded (FFPE) samples in which DNA is inherently crosslinked and fragmented. In order to validate the FFPE-TLC technology as a novel approach for translocation detection in lymphoma samples, we have developed a panel assay containing genes with frequent translocations (MYC, BCL2, BCL6, IG loci). With this assay we have analyzed >140 lymphoma and control FFPE samples of variable input amounts and qualities that had previously been analyzed with FISH, and a subset also with standard targeted NGS. Good concordance with FISH results was observed for both translocation-positive and -negative samples. In 10 cases for which FFPE-TLC analysis resulted in a different finding than FISH, discordance could be explained by higher sensitivity of FFPE-TLC or by inconclusive FISH results. In a specific case, FFPE-TLC detected a small-distance rearrangement on chromosome 3 that caused a BCL6 fusion but led to insufficient and therefore undetectable break-apart with FISH.
Secondly, the FFPE-TLC approach was tested on a set of 19 B-cell lymphoma FFPE samples that had previously been analyzed using standard targeted NGS and FISH and was enriched for discordant results between these methods. FFPE-TLC-based NGS enables more robust translocation calling as the detection relies on broad sequencing coverage across the translocation partner rather than on breakpoint sequences only. In 3 cases, FFPE-TLC could prove false negative calls in standard targeted NGS due to breakpoints located in regions difficult to capture or to sequence. In 1 case, standard targeted NGS had made a false positive call on a breakpoint sequence that was shown to be caused by a small insertion rather than a genuine translocation. This study shows that FFPE-TLC promises to be a robust alternative for FISH analysis and standard targeted NGS procedures in lymphoma diagnostics and in other cancers with frequent structural variants. The FFPE-TLC approach enables a single, DNA-based NGS test detecting both small mutations and translocations.
Citation Format: Amin Allahyar, Mark Pieterse, Joost Swennenhuis, Tjitske Los-de Vries, Mehmet Yilmaz, Roos Leguit, Ruud Meijers, Nathalie Hijmering, Daphne de Jong, Bauke Ylstra, Robert van der Geize, Joost Vermaat, Arjen Cleven, Tom van Wezel, Arjan Diepstra, Leon van Kempen, Karima Hajo, Harma Feitsma, Marieke Simonis, Max van Min, Erik Splinter, Wouter de Laat. Robust detection of translocations in lymphoma FFPE samples using Targeted Locus Capture-based sequencing [abstract]. In: Proceedings of the AACR Virtual Meeting: Advances in Malignant Lymphoma; 2020 Aug 17-19. Philadelphia (PA): AACR; Blood Cancer Discov 2020;1(3_Suppl):Abstract nr PO-45.
- Published
- 2020
18. A data-driven interactome of synergistic genes improves network based cancer outcome prediction
- Author
-
Joske Ubels, Jeroen de Ridder, and Amin Allahyar
- Subjects
Breast cancer ,Computer science ,Gene regulatory network ,Stability (learning theory) ,medicine ,Computational biology ,medicine.disease ,Interactome ,Biological network ,Data-driven
- Abstract
Robustly predicting outcome for cancer patients from gene expression is an important challenge on the road to better personalized treatment. Network-based outcome predictors (NOPs), which consider the cellular wiring diagram in the classification, hold much promise to improve performance, stability and interpretability of identified marker genes. Problematically, reports on the efficacy of NOPs are conflicting and for instance suggest that utilizing random networks performs on par with networks that describe biologically relevant interactions. In this paper we turn the prediction problem around: instead of using a given biological network in the NOP, we aim to identify the network of genes that truly improves outcome prediction. To this end, we propose SyNet, a gene network constructed ab initio from synergistic gene pairs derived from survival-labelled gene expression data. To obtain SyNet, we evaluate synergy for all 69 million pairwise combinations of genes, resulting in a network that is specific to the dataset and phenotype under study and can be used in a NOP model. We evaluated SyNet and 11 other networks on a compendium dataset of >4000 survival-labelled breast cancer samples. For this purpose, we used cross-study validation, which more closely emulates real-world application of these outcome predictors. We find that SyNet is the only network that truly improves performance, stability and interpretability in several existing NOPs. We show that SyNet overlaps significantly with existing gene networks, and can be confidently predicted (~85% AUC) from graph-topological descriptions of these networks, in particular the breast tissue-specific network. Due to its data-driven nature, SyNet is not biased to well-studied genes and thus facilitates post-hoc interpretation. We find that SyNet is highly enriched for known breast cancer genes and genes related to e.g. histological grade and tamoxifen resistance, suggestive of a role in determining breast cancer outcome.
Author Summary: Cancer is caused by disrupted activity of several pathways. Therefore, outcome predictors analyze patients' expression profiles from the perspective of gene groups collected from interactomes (e.g. protein interaction networks). These Network-based Outcome Predictors (NOPs) hold potential to facilitate identification of dysregulated pathways and to deliver improved prognosis. Nonetheless, recent studies revealed that, compared to classical models, neither performance nor consistency can be improved using NOPs. We argue that NOPs can only perform well under guidance of suitable networks. The commonly used networks may miss associations, especially for under-studied genes. Additionally, these networks are often generic, with low resemblance to perturbations that arise in cancer. To address this issue, we exploit ~4100 samples and infer a disease-specific network called SyNet, linking synergistic gene pairs that collectively show predictivity beyond the individual performance of genes. Using identical datasets, we show that a NOP yields superior performance merely by considering groups of genes in SyNet. Further, NOP performance is severely reduced if SyNet nodes are shuffled, confirming the relevance of SyNet links. Due to the simplicity of our approach, this framework can be used for any phenotype of interest. Our findings demonstrate the value of network-based models and the crucial role of the interactome in their performance.
- Published
- 2018
- Full Text
- View/download PDF
19. Constrained Semi-Supervised Growing Self-Organizing Map
- Author
-
Ahad Harati, Hadi Sadoghi Yazdi, and Amin Allahyar
- Subjects
Fuzzy clustering ,business.industry ,Computer science ,Cognitive Neuroscience ,Correlation clustering ,Constrained clustering ,Conceptual clustering ,Machine learning ,computer.software_genre ,Growing self-organizing map ,Computer Science Applications ,ComputingMethodologies_PATTERNRECOGNITION ,Data stream clustering ,Artificial Intelligence ,CURE data clustering algorithm ,Canopy clustering algorithm ,Artificial intelligence ,Data mining ,business ,Cluster analysis ,computer
- Abstract
Semi-supervised clustering tries to surpass the limits of unsupervised clustering using extra information contained in occasional labeled data points. However, providing such labeled samples is not always possible or easy in real-world applications. A weaker, yet still very useful, option is providing constraints on the unlabeled training samples, which is the focus of Constrained Semi-Supervised (CSS) clustering. On the other hand, online learning has gained a considerable amount of interest in real-world problems with massive sample size or streaming behavior, as lack of memory and computational resources seriously restricts the application of offline and batch methods. However, the existing algorithms for the online CSS clustering problem either assumed that the entire dataset is available and added constraints incrementally, or considered chunks of constrained data points and applied an offline CSS clustering algorithm. Thus, none of them can be categorized as a genuine online CSS clustering algorithm. In this paper, we propose CS2GS, an online CSS clustering algorithm. CS2GS is constructed by modifying the online learning process of the Semi-Supervised Growing Self-Organizing Map, and converting it to an iterative constrained metric learning problem that can be solved using Bregman's iterative projections. The proposed CS2GS is studied via a series of thorough tests using synthetic and real data, including selections from UCI datasets and FEP, a recent bilingual corpus used for the sentence aligning stage of machine translation. Experimental results show the effectiveness of CS2GS in online CSS clustering, and prove that indeed, the limits of the system accuracy may be pushed higher using unlabeled samples.
- Published
- 2015
20. Online discriminative component analysis feature extraction from stream data with domain knowledge
- Author
-
Hadi Sadoghi Yazdi and Amin Allahyar
- Subjects
Concept drift ,Computer science ,business.industry ,Feature extraction ,Pattern recognition ,Linear discriminant analysis ,Theoretical Computer Science ,ComputingMethodologies_PATTERNRECOGNITION ,Data point ,Component analysis ,Discriminative model ,Artificial Intelligence ,Scatter matrix ,Computer Vision and Pattern Recognition ,Artificial intelligence ,business ,Time complexity
- Abstract
In this paper, we introduce an incremental version of the recently proposed constrained Linear Discriminant Analysis (LDA). In addition to its application in constrained LDA problems, our algorithm, which we call Online Discriminative Component Analysis (ODCA), is usable in standard incremental LDA problems. ODCA incrementally computes the solution of LDA with a time complexity lower than that of most incremental algorithms for LDA, while keeping the accuracy of the final result as close as possible to that of the offline version. This is done using a special formulation for updating the scatter matrix along with the eigenspace calculation. By exploiting this formulation, the proposed algorithm is capable of updating the solution when a data point is added to or removed from the problem. It is also usable in problems whose data points exhibit concept drift. To show the efficiency of the proposed algorithm, its speed is compared to that of other existing incremental algorithms in terms of order of complexity. In addition, the classification accuracy of our approach is experimentally compared to that of other algorithms.
- Published
- 2014
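The abstract of result 20 mentions updating a scatter matrix as single data points are added or removed. As a minimal sketch of that general idea, the code below applies the textbook rank-one (Welford-style) update to a running mean and total scatter matrix. This is not ODCA's own formulation, which the record does not spell out; it omits the per-class scatter matrices and the eigenspace step, and the class and method names are hypothetical.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


class RunningScatter:
    """Tracks the mean and scatter matrix S = sum (x - mean)(x - mean)^T of a stream."""

    def __init__(self, dim: int) -> None:
        self.dim = dim
        self._reset()

    def _reset(self) -> None:
        self.n = 0
        self.mean: Vector = [0.0] * self.dim
        self.scatter: Matrix = [[0.0] * self.dim for _ in range(self.dim)]

    def _rank_one(self, a: Vector, b: Vector, sign: float) -> None:
        # scatter += sign * a b^T
        for i in range(self.dim):
            for j in range(self.dim):
                self.scatter[i][j] += sign * a[i] * b[j]

    def add(self, x: Vector) -> None:
        """Insert one point: S_n = S_{n-1} + (x - mean_{n-1})(x - mean_n)^T."""
        self.n += 1
        delta_old = [xi - mi for xi, mi in zip(x, self.mean)]
        self.mean = [mi + d / self.n for mi, d in zip(self.mean, delta_old)]
        delta_new = [xi - mi for xi, mi in zip(x, self.mean)]
        self._rank_one(delta_old, delta_new, sign=1.0)

    def remove(self, x: Vector) -> None:
        """Remove a previously added point by reversing the update above."""
        if self.n == 0:
            raise ValueError("no points to remove")
        if self.n == 1:
            self._reset()
            return
        # Deviation from the mean that still includes x.
        delta_new = [xi - mi for xi, mi in zip(x, self.mean)]
        self.n -= 1
        self.mean = [mi - d / self.n for mi, d in zip(self.mean, delta_new)]
        # Deviation from the mean that no longer includes x.
        delta_old = [xi - mi for xi, mi in zip(x, self.mean)]
        self._rank_one(delta_old, delta_new, sign=-1.0)


stats = RunningScatter(dim=2)
for point in ([1.0, 2.0], [3.0, 4.0], [5.0, 9.0]):
    stats.add(point)
stats.remove([5.0, 9.0])
print(stats.mean, stats.scatter)
```

Each update costs O(d^2) for d-dimensional data and never revisits earlier points, which is the property that makes this kind of update attractive for stream data.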
21. Artificial Immune Linear Discriminant Analysis
- Author
-
Hadi Sadoghi Yazdi and Amin Allahyar
- Subjects
symbols.namesake ,Mathematical optimization ,Fitness function ,Distribution (number theory) ,Chromosome (genetic algorithm) ,Fitness approximation ,Gaussian ,Evolutionary algorithm ,symbols ,Linear discriminant analysis ,Fuzzy logic ,Mathematics
- Abstract
In Linear Discriminant Analysis (LDA), it is assumed that each class has a Gaussian distribution. This assumption rarely holds in real-world problems. However, by removing this assumption, the problem becomes intractable and cannot be solved in analytic form. Quite recently, a group of evolutionary algorithms was introduced to solve this problem. These algorithms used a combination of the Fisher criterion and a fuzzy membership function as their fitness function. It is widely acknowledged that computing the fitness function in an evolutionary algorithm needs to be very fast. Unfortunately, calculating the Fisher criterion for each chromosome in the iterations of an evolutionary algorithm has a high computational cost. Furthermore, it is known that the fuzzy membership function assumes a Gaussian distribution, so using it as a fitness function carries the same assumption issue that LDA had. In this paper, we suggest a new Fisher criterion to incorporate in the fitness function and show that it is theoretically faster than the previously introduced criterion. In addition, we theoretically prove the equality of the proposed criterion. Next, in order to eliminate the Gaussian assumption, we offer a substitute for the fuzzy membership fitness function which does not make the Gaussian assumption. Moreover, the superior speed of the introduced fitness function is theoretically investigated. Finally, in order to confirm the effectiveness of the proposed fitness functions, comprehensive experiments using twelve UCI repository datasets and two real-world problems in face and object recognition are performed, and the results are compared in both speed and accuracy.
- Published
- 2013
Discovery Service for Jio Institute Digital Library