75 results on '"Çiçek, A. Ercüment"'
Search Results
2. Robust inference of kinase activity using functional networks
- Author
-
Yılmaz, Serhan, Ayati, Marzieh, Schlatzer, Daniela, Çiçek, A. Ercüment, Chance, Mark R., and Koyutürk, Mehmet
- Published
- 2021
- Full Text
- View/download PDF
3. ÉCOLE: Learning to call copy number variants on whole exome sequencing data
- Author
-
Mandiracioglu, Berk, primary, Özden, Furkan, additional, Alkan, Can, additional, and Çiçek, A. Ercüment, additional
- Published
- 2022
- Full Text
- View/download PDF
4. UnSplit
- Author
-
Erdoğan, Ege, primary, Küpçü, Alptekin, additional, and Çiçek, A. Ercüment, additional
- Published
- 2022
- Full Text
- View/download PDF
5. Polishing copy number variant calls on exome sequencing data via deep learning
- Author
-
Özden, Furkan, primary, Alkan, Can, additional, and Çiçek, A. Ercüment, additional
- Published
- 2022
- Full Text
- View/download PDF
6. Uncovering complementary sets of variants for predicting quantitative phenotypes
- Author
-
Yilmaz, Serhan, primary, Fakhouri, Mohamad, additional, Koyutürk, Mehmet, additional, Çiçek, A Ercüment, additional, and Tastan, Oznur, additional
- Published
- 2021
- Full Text
- View/download PDF
7. Genetic circuits combined with machine learning provides fast responding living sensors
- Author
-
Saltepe, Behide, primary, Bozkurt, Eray Ulaş, additional, Güngen, Murat Alp, additional, Çiçek, A. Ercüment, additional, and Şeker, Urartu Özgür Şafak, additional
- Published
- 2021
- Full Text
- View/download PDF
8. Potpourri : An Epistasis Test Prioritization Algorithm via Diverse SNP Selection
- Author
-
Çaylak, Gizem and Çiçek, A. Ercüment
- Abstract
Genome-wide association studies explain only a fraction of the underlying heritability of genetic diseases. Investigating epistatic interactions between two or more loci helps close this gap. Unfortunately, the sheer number of loci combinations to process and hypotheses to test makes the process prohibitive both computationally and statistically. Epistasis test prioritization algorithms rank likely-epistatic SNP pairs to limit the number of tests. Yet, they still suffer from very low precision. It was shown in the literature that selecting SNPs that are individually correlated with the phenotype and also diverse with respect to genomic location leads to better phenotype prediction due to genetic complementation. Here, we propose that an algorithm that pairs SNPs from such diverse regions and ranks them can improve prediction power. We propose an epistasis test prioritization algorithm which optimizes a submodular set function to select a diverse and complementary set of genomic regions that span the underlying genome. SNP pairs from these regions are then further ranked with respect to their co-coverage of the case cohort. We compare our algorithm with the state of the art on three GWAS and show that we (i) substantially improve precision (from 0.003 to 0.652) while maintaining the significance of selected pairs, (ii) decrease the number of tests 25-fold, and (iii) decrease the runtime 4-fold. We also show that promoting SNPs from regulatory/coding regions improves the performance (up to 0.8). Potpourri is available at http://ciceklab.cs.bilkent.edu.tr/potpourri. Part of proceedings ISBN 978-3-030-45256-8, 978-3-030-45257-5.
- Published
- 2020
- Full Text
- View/download PDF
9. Revisiting the complex architecture of ALS in Turkey: expanding genotypes, shared phenotypes, molecular networks, and a public variant database
- Author
-
Tunca, Ceren; Bayraktar, Elif; Palvadeau, Robin; Oflazer, Piraye; Başak, Ayşe Nazlı (ORCID 0000-0001-9257-3540 & YÖK ID 1512); Şeker, Tuncay; Akçimen, Fulya; Coşkun, Cemre; Zor, Seyit; Kocoğlu, Cemile; Kartal, Ece; Şen, Nesli Ece; Hamzeiy, Hamid; Erimiş, Aslıhan Özoğuz; Norman, Utku; Karakahya, Oğuzhan; Olgun, Gülden; Akgün, Tahsin; Durmuş, Hacer; Şahin, Erdi; Çakar, Arman; Gürsoy, Esra Başar; Yıldız, Gülşen Babacan; İsak, Barış; Uluç, Kayıhan; Hanağası, Haşmet; Bilgiç, Başar; Turgut, Nilda; Aysal, Fikret; Ertaş, Mustafa; Boz, Cavit; Kotan, Dilcan; İdrisoğlu, Halil; Soysal, Aysun; Adatepe, Nurten Uzun; Akalın, Mehmet Ali; Koç, Filiz; Tan, Ersin; Deymeer, Feza; Taştan, Öznur; Çiçek, A. Ercüment; Kavak, Erşen; Parman, Yeşim; Koç University Research Center for Translational Medicine (KUTTAM) / Koç Üniversitesi Translasyonel Tıp Araştırma Merkezi (KUTTAM); and School of Medicine
- Abstract
The last decade has proven that amyotrophic lateral sclerosis (ALS) is clinically and genetically heterogeneous, and that the genetic component in sporadic cases might be stronger than expected. This study investigates 1,200 patients to revisit ALS in the ethnically heterogeneous yet inbred Turkish population. Familial ALS (fALS) accounts for 20% of our cases. The rates of consanguinity are 30% in fALS and 23% in sporadic ALS (sALS). Major ALS genes explained the disease cause in only 35% of fALS, as compared with ~70% in Europe and North America. Whole exome sequencing resulted in a discovery rate of 42% (53/127). Whole genome analyses in 623 sALS cases and 142 population controls, sequenced within Project MinE, revealed well-established fALS gene variants, solidifying the concept of incomplete penetrance in ALS. Genome-wide association studies (GWAS) with whole genome sequencing data did not indicate a new risk locus. Coupling GWAS with a coexpression network of disease-associated candidates points to a significant enrichment for cell cycle- and division-related genes. Within this network, literature text-mining highlights DECR1, ATL1, HDAC2, GEMIN4, and HNRNPA3 as important genes. Finally, information on ALS-related gene variants in the Turkish cohort sequenced within Project MinE was compiled in the GeNDAL variant browser (www.gendal.org). Scientific and Technological Research Council of Turkey (TÜBİTAK); Bogazici University Research Funds; Suna and İnan Kıraç Foundation
- Published
- 2020
10. Uncovering complementary sets of variants for predicting quantitative phenotypes
- Author
-
Yılmaz, Serhan, primary, Fakhouri, Mohamad, additional, Koyutürk, Mehmet, additional, Çiçek, A. Ercüment, additional, and Taştan, Öznur, additional
- Published
- 2020
- Full Text
- View/download PDF
11. Genetic Circuits Combined with Machine Learning Provides Fast Responding Living Sensors
- Author
-
Saltepe, Behide, primary, Bozkurt, Eray Ulaş, additional, Güngen, Murat Alp, additional, Çiçek, A. Ercüment, additional, and Şeker, Urartu Özgür Şafak, additional
- Published
- 2020
- Full Text
- View/download PDF
12. Cover, Volume 41, Issue 8
- Author
-
Tunca, Ceren, primary, Şeker, Tuncay, additional, Akçimen, Fulya, additional, Coşkun, Cemre, additional, Bayraktar, Elif, additional, Palvadeau, Robin, additional, Zor, Seyit, additional, Koçoğlu, Cemile, additional, Kartal, Ece, additional, Şen, Nesli Ece, additional, Hamzeiy, Hamid, additional, Özoğuz Erimiş, Aslıhan, additional, Norman, Utku, additional, Karakahya, Oğuzhan, additional, Olgun, Gülden, additional, Akgün, Tahsin, additional, Durmuş, Hacer, additional, Şahin, Erdi, additional, Çakar, Arman, additional, Başar Gürsoy, Esra, additional, Babacan Yıldız, Gülsen, additional, İşak, Barış, additional, Uluç, Kayıhan, additional, Hanağası, Haşmet, additional, Bilgiç, Başar, additional, Turgut, Nilda, additional, Aysal, Fikret, additional, Ertaş, Mustafa, additional, Boz, Cavit, additional, Kotan, Dilcan, additional, İdrisoğlu, Halil, additional, Soysal, Aysun, additional, Uzun Adatepe, Nurten, additional, Akalın, Mehmet Ali, additional, Koç, Filiz, additional, Tan, Ersin, additional, Oflazer, Piraye, additional, Deymeer, Feza, additional, Taştan, Öznur, additional, Çiçek, A. Ercüment, additional, Kavak, Erşen, additional, Parman, Yeşim, additional, and Başak, A. Nazlı, additional
- Published
- 2020
- Full Text
- View/download PDF
13. Revisiting the complex architecture of ALS in Turkey: Expanding genotypes, shared phenotypes, molecular networks, and a public variant database
- Author
-
Tunca, Ceren, primary, Şeker, Tuncay, additional, Akçimen, Fulya, additional, Coşkun, Cemre, additional, Bayraktar, Elif, additional, Palvadeau, Robin, additional, Zor, Seyit, additional, Koçoğlu, Cemile, additional, Kartal, Ece, additional, Şen, Nesli Ece, additional, Hamzeiy, Hamid, additional, Özoğuz Erimiş, Aslıhan, additional, Norman, Utku, additional, Karakahya, Oğuzhan, additional, Olgun, Gülden, additional, Akgün, Tahsin, additional, Durmuş, Hacer, additional, Şahin, Erdi, additional, Çakar, Arman, additional, Başar Gürsoy, Esra, additional, Babacan Yıldız, Gülsen, additional, İşak, Barış, additional, Uluç, Kayıhan, additional, Hanağası, Haşmet, additional, Bilgiç, Başar, additional, Turgut, Nilda, additional, Aysal, Fikret, additional, Ertaş, Mustafa, additional, Boz, Cavit, additional, Kotan, Dilcan, additional, İdrisoğlu, Halil, additional, Soysal, Aysun, additional, Uzun Adatepe, Nurten, additional, Akalın, Mehmet Ali, additional, Koç, Filiz, additional, Tan, Ersin, additional, Oflazer, Piraye, additional, Deymeer, Feza, additional, Taştan, Öznur, additional, Çiçek, A. Ercüment, additional, Kavak, Erşen, additional, Parman, Yeşim, additional, and Başak, A. Nazlı, additional
- Published
- 2020
- Full Text
- View/download PDF
14. Polishing Copy Number Variant Calls on Exome Sequencing Data via Deep Learning
- Author
-
Özden, Furkan, primary, Alkan, Can, additional, and Çiçek, A. Ercüment, additional
- Published
- 2020
- Full Text
- View/download PDF
15. Robust Inference of Kinase Activity Using Functional Networks
- Author
-
Yılmaz, Serhan, primary, Ayati, Marzieh, additional, Schlatzer, Daniela, additional, Çiçek, A. Ercüment, additional, Chance, Mark R., additional, and Koyutürk, Mehmet, additional
- Published
- 2020
- Full Text
- View/download PDF
16. Uncovering complementary sets of variants for predicting quantitative phenotypes.
- Author
-
Yilmaz, Serhan, Fakhouri, Mohamad, Koyutürk, Mehmet, Çiçek, A Ercüment, and Tastan, Oznur
- Subjects
PHENOTYPES, GENOME-wide association studies, HERITABILITY, HUMAN genome, FEATURE selection, LINKAGE disequilibrium
- Abstract
Motivation: Genome-wide association studies show that variants in individual genomic loci alone are not sufficient to explain the heritability of complex, quantitative phenotypes. Many computational methods have been developed to address this issue by considering subsets of loci that can collectively predict the phenotype. This problem can be considered a challenging instance of feature selection in which the number of dimensions (loci that are screened) is much larger than the number of samples. While currently available methods can achieve decent phenotype prediction performance, they either do not scale to large datasets or have parameters that require extensive tuning. Results: We propose a fast and simple algorithm, Macarons, to select a small, complementary subset of variants by avoiding redundant pairs that are likely to be in linkage disequilibrium. Our method features two interpretable parameters that control the time/performance trade-off without requiring parameter tuning. In our computational experiments, we show that Macarons consistently achieves similar or better prediction performance than state-of-the-art selection methods while having a simpler premise and being at least two orders of magnitude faster. Overall, Macarons can seamlessly scale to the human genome with ∼10^7 variants in a matter of minutes while taking the dependencies between the variants into account. Availability and implementation: Macarons is available in Matlab and Python at https://github.com/serhan-yilmaz/macarons. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2022
- Full Text
- View/download PDF
17. Deep convolutional neural networks for PET super-resolution
- Author
-
Colliot, Olivier, Mitra, Jhimli, Özaltan, Kaan, Türkölmez, Emir, Namer, I. Jacques, Çiçek, A. Ercüment, and Aksoy, Selim
- Published
- 2024
- Full Text
- View/download PDF
18. k-Shell decomposition reveals structural properties of the gene coexpression network for neurodevelopment
- Author
-
ÇİÇEK, A. Ercüment, primary
- Published
- 2017
- Full Text
- View/download PDF
19. Ensuring location diversity in privacy preserving spatio-temporal data mining
- Author
-
Çiçek, Abdullah Ercüment, Saygın, Yücel, Nergiz, Mehmet Ercan, and Department of Computer Science and Engineering (Bilgisayar Bilimleri ve Mühendisliği Anabilim Dalı)
- Subjects
Data mining, Computer Engineering and Computer Science and Control
- Abstract
The rise of mobile technologies in the last decade has led to vast amounts of location information generated by individuals. From the knowledge discovery point of view, this data is quite valuable as it has commercial value, but the inherent personal information in the data raises privacy concerns. There exist many algorithms in the literature to satisfy the privacy requirements of individuals by generalizing, perturbing, and suppressing data. Algorithms that try to ensure a level of indistinguishability between trajectories in the dataset fail when there is not enough diversity among the sensitive locations visited by those users. We propose an approach that ensures location diversity, named (c,p)-confidentiality, which bounds the probability of visiting a sensitive location given the background knowledge of the adversary. Instead of grouping the trajectories, we anonymize the underlying map structure. We explain our algorithm and show the performance of our approach. We also compare the performance of our algorithm with an existing technique and show that location diversity can be satisfied efficiently.
- Published
- 2009
20. OPERATIONAL VARIABLE JOB SCHEDULING WITH ELIGIBILITY CONSTRAINTS: A RANDOMIZED CONSTRAINT‐GRAPH‐BASED APPROACH
- Author
-
Eliiyi, Deniz Türsel, primary, Korkmaz, Aslıhan Gizem, additional, and Çiçek, Abdullah Ercüment, additional
- Published
- 2009
- Full Text
- View/download PDF
21. Graph embeddings on protein interaction networks
- Author
-
Kuru, Halil İbrahim, Çiçek, A. Ercüment, Taştan Okan, Öznur, and Department of Computer Engineering (Bilgisayar Mühendisliği Anabilim Dalı)
- Subjects
Survival prediction, Protein-protein interaction network, Graph representations, Node embeddings, Computer Engineering and Computer Science and Control, Gene essentiality, Network topological features, Cancer
- Abstract
Protein-protein interaction (PPI) networks represent the possible set of interactions among proteins and thereby the genes that code for them. By integrating isolated signals on single genes, such as mutations or differential expression patterns, PPI networks have enabled various biological discoveries so far. Furthermore, even the connectivity patterns of proteins in such networks have been proven to be highly informative for various prediction tasks involving proteins or genes. These tasks, however, require task-specific feature engineering. Graph embedding techniques that learn a deep representation of the nodes on the network provide a powerful alternative and obviate the need for this extensive feature engineering on the network. In this study, we use graph embedding techniques on PPI networks in two independent machine learning tasks. The first part of the present work focuses on predicting gene essentiality. Using two different node embedding techniques, node2vec and DeepWalk, we present a classifier which only uses node embeddings as input and show that it can achieve up to 88% AUC score in predicting human gene essentiality. The second part of the thesis proposes a novel representation of patients based on the pairwise rank order of patient protein expression values and protein interactions, which we abbreviate as PRER. Specifically, we use protein expression values and generate a patient-specific gene embedding to represent the relative expression of a protein with respect to the other proteins in its neighborhood. The neighborhood is derived using a biased random-walk strategy. We first check whether a given protein is less or more expressed compared to the other proteins in its neighborhood for a specific tumor. Based on this, we generate a representation that not only captures the dysregulation patterns among the proteins but also accounts for the molecular interactions. To test the effectiveness of this representation, we use PRER for the problem of patient survival prediction. When compared against the representation of patients with their individual protein expression features, the PRER representation demonstrates significantly superior predictive performance in 8 out of 10 cancer types. Proteins that emerge as important in PRER, as opposed to individual expression values, provide a valuable set of biomarkers with high prognostic value. Additionally, they highlight other proteins that should be further investigated for their dysregulation patterns.
- Published
- 2019
22. Diverse SNP selection for epistasis test prioritization
- Author
-
Çaylak, Gizem, Çiçek, A. Ercüment, and Department of Computer Engineering (Bilgisayar Mühendisliği Anabilim Dalı)
- Subjects
SNP selection, GWAS, Epistasis test prioritization, Computer Engineering and Computer Science and Control
- Abstract
Genome-wide association studies explain only a fraction of the underlying heritability of genetic diseases. Epistatic interactions between two or more loci help close the gap, and identifying those complex interactions provides a promising road to a better understanding of complex traits. Unfortunately, the sheer number of loci combinations to consider and hypotheses to test makes the process prohibitive both computationally and statistically. This is true even if only pairs of loci are considered. Epistasis prioritization algorithms have proven useful for reducing the computational burden and limiting the number of tests to perform. While current methods aim at avoiding linkage disequilibrium and covering the case cohort, none aims at diversifying the topological layout of the selected SNPs, which can detect complementary variants. In this thesis, a two-stage pipeline to prioritize epistasis tests is proposed. In the first step, a submodular set function is optimized to select a diverse set of SNPs that span the underlying genome to (i) avoid linkage disequilibrium and (ii) pair SNPs that relate to complementary function. In the second step, the selected SNPs are used as seeds to a fast epistasis detection algorithm. The algorithm is compared with the state-of-the-art method LinDen on three datasets retrieved from the Wellcome Trust Case Control Consortium: type 2 diabetes, hypertension, and bipolar disorder. The results show that the pipeline drastically reduces the number of tests to perform while the number of statistically significant epistatic pairs discovered increases.
- Published
- 2019
23. SPADIS: Selecting predictive and diverse SNPS in GWAS
- Author
-
Yılmaz, Serhan, Çiçek, A. Ercüment, Taştan Okan, Öznur, and Department of Computer Engineering (Bilgisayar Mühendisliği Anabilim Dalı)
- Subjects
SNP Selection, Hi-C, Submodularity, GWAS, SNP-SNP Networks, Computer Engineering and Computer Science and Control
- Abstract
Phenotypic heritability of complex traits and diseases is seldom explained by individual genetic variants identified in genome-wide association studies (GWAS). Many methods have been developed to select a subset of variant loci which are associated with or predictive of the phenotype. Selecting connected Single Nucleotide Polymorphisms (SNPs) on SNP-SNP networks has proven successful in finding biologically interpretable and predictive SNPs. However, we argue that the connectedness constraint favors selecting redundant features that affect similar biological processes and therefore does not necessarily yield better predictive performance. To this end, we propose a novel method called SPADIS that favors the selection of remotely located SNPs in order to account for their complementary effects in explaining a phenotype. SPADIS selects a diverse set of loci on a SNP-SNP network. This is achieved by maximizing a submodular set function with a greedy algorithm that ensures a constant factor (1 - 1/e) approximation to the optimal solution. We compare SPADIS to the state-of-the-art method SConES on a dataset of Arabidopsis thaliana with continuous flowering time phenotypes. SPADIS has better average phenotype prediction performance in 15 out of 17 phenotypes when the same number of SNPs is selected, and provides consistent improvements across multiple networks and settings on average. Moreover, it identifies more candidate genes and runs faster. We also investigate, for the first time, the use of Hi-C data to construct a SNP-SNP network in the context of the SNP selection problem, which yields improvements in regression performance across all methods.
- Published
- 2018
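The greedy selection step mentioned in the SPADIS abstract above can be illustrated with a small sketch. This is not the SPADIS scoring function: the weighted-coverage objective, the function names, and the toy network below are assumptions chosen only because weighted coverage with nonnegative scores is monotone and submodular, so the classic greedy rule carries the (1 - 1/e) guarantee under a cardinality constraint. Overlapping neighborhoods add no gain, which discourages picking SNPs that sit close together on the network.

```python
from typing import Dict, List, Set


def coverage_value(selected: List[str],
                   neighbors: Dict[str, Set[str]],
                   score: Dict[str, float]) -> float:
    """Hypothetical objective: total score of all nodes covered by the
    selected SNPs, where a SNP covers itself and its network neighbors."""
    covered: Set[str] = set()
    for snp in selected:
        covered.add(snp)
        covered |= neighbors.get(snp, set())
    return sum(score[node] for node in covered)


def greedy_select(candidates: List[str],
                  neighbors: Dict[str, Set[str]],
                  score: Dict[str, float],
                  k: int) -> List[str]:
    """Pick up to k SNPs, each time adding the one with the largest
    marginal gain in coverage_value (ties go to the first in sorted order)."""
    selected: List[str] = []
    remaining = sorted(candidates)
    current = 0.0
    for _ in range(k):
        best, best_gain = None, 0.0
        for snp in remaining:
            gain = coverage_value(selected + [snp], neighbors, score) - current
            if gain > best_gain:
                best, best_gain = snp, gain
        if best is None:  # no candidate adds anything; stop early
            break
        selected.append(best)
        remaining.remove(best)
        current += best_gain
    return selected
```

On a five-node path network a-b-c-d-e, for instance, the second pick skips the immediate neighbors of the first pick whenever they add little new coverage, which is the diversity effect the abstract describes in words.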
24. Spatio-temporal gene discovery for autism spectrum disorder
- Author
-
Norman, Utku, Çiçek, A. Ercüment, and Department of Computer Engineering (Bilgisayar Mühendisliği Anabilim Dalı)
- Subjects
Gene discovery, Prize-collecting Steiner forest problem, Autism spectrum disorder, Spatio-temporal networks, Computer Engineering and Computer Science and Control
- Abstract
Whole Exome Sequencing (WES) studies for Autism Spectrum Disorder (ASD) have identified only around six dozen risk genes to date, because the genetic architecture of the disorder is highly complex. To speed up the gene discovery process, a few network-based ASD gene discovery algorithms were proposed. Although these methods use static gene interaction networks, functional clustering of genes is bound to evolve during neurodevelopment, and disruptions are likely to have a cascading effect on future associations. Thus, approaches that disregard the dynamic nature of neurodevelopment are limited. Here, we present a spatio-temporal gene discovery algorithm for ASD, which leverages information from evolving gene coexpression networks of neurodevelopment. The algorithm solves a prize-collecting Steiner forest based problem on coexpression networks, adapted to model neurodevelopment and transfer information from precursor neurodevelopmental windows. The decisions made by the algorithm can be traced back, adding interpretability to the results. We apply the algorithm to WES data of 3,871 samples and identify risk clusters using BrainSpan coexpression networks of early- and mid-fetal periods. On an independent dataset, we show that incorporating the temporal dimension increases the predictive power: predicted clusters are hit more (i.e., they contain genes with more disruptive mutations) and show higher enrichment in ASD-related functions compared to the state of the art.
- Published
- 2018
25. A unifying network modeling approach for codon optimization
- Author
-
Karaşan, Oya, Şen, Alper, Tiryaki, Banu, and Çiçek, A. Ercüment
- Subjects
Statistics and Probability; Computational Mathematics; Computational Theory and Mathematics; Genetic Code; Amino Acid Sequence; Amino Acids; Codon; Molecular Biology; Biochemistry; Computer Science Applications
- Abstract
Motivation: Synthesizing genes to be expressed in other organisms is an essential tool in biotechnology. While the many-to-one mapping from codons to amino acids makes the genetic code degenerate, codon usage in a particular organism is not random either. This bias in codon use may have a remarkable effect on the level of gene expression. A number of measures have been developed to quantify a given codon sequence’s strength to express a gene in a host organism. Codon optimization aims to find a codon sequence that will optimize one or more of these measures. Efficient computational approaches are needed since the possible number of codon sequences grows exponentially as the number of amino acids increases. Results: We develop a unifying modeling approach for codon optimization. With our mathematical formulations based on graph/network representations of amino acid sequences, any combination of measures can be optimized in the same framework by finding a path satisfying additional limitations in an acyclic layered network. We tested our approach on bi-objectives commonly used in the literature, namely, Codon Pair Bias versus Codon Adaptation Index and Relative Codon Pair Bias versus Relative Codon Bias. However, our framework is general enough to handle any number of objectives concurrently with certain restrictions or preferences on the use of specific nucleotide sequences. We implemented our models using Python’s Gurobi interface and showed the efficacy of our approach even for the largest proteins available. We also provided experimentation showing that highly expressed genes have objective values close to the optimized values in the bi-objective codon design problem. Availability and implementation: http://alpersen.bilkent.edu.tr/NetworkCodon.zip. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2022
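As a rough illustration of the path-finding idea in the abstract above, the sketch below treats each amino acid position as a layer whose nodes are synonymous codons, and picks the highest-scoring path through the layers by dynamic programming. The partial codon table, the `codon_weight` and `pair_weight` scoring functions, and the name `best_codon_path` are hypothetical placeholders for illustration; the authors' own formulation uses Gurobi and supports additional constraints that are not modeled here.

```python
from typing import Callable, Dict, List, Tuple

# Tiny, partial codon table used only for illustration (hypothetical subset).
CODONS: Dict[str, List[str]] = {
    "M": ["ATG"],
    "K": ["AAA", "AAG"],
    "F": ["TTT", "TTC"],
}


def best_codon_path(
    protein: str,
    codon_weight: Callable[[str], float],
    pair_weight: Callable[[str, str], float],
) -> Tuple[float, List[str]]:
    """Return the highest-scoring codon sequence for `protein`.

    Each amino acid is one layer of an acyclic network; edges connect
    every codon in a layer to every codon in the next layer.
    """
    # best[c] = (score of the best path ending at codon c, that path)
    best = {c: (codon_weight(c), [c]) for c in CODONS[protein[0]]}
    for aa in protein[1:]:
        nxt = {}
        for c in CODONS[aa]:
            # Extend the best predecessor path by codon c.
            score, path = max(
                (
                    (s + pair_weight(p, c) + codon_weight(c), pth)
                    for p, (s, pth) in best.items()
                ),
                key=lambda t: t[0],
            )
            nxt[c] = (score, path + [c])
        best = nxt
    return max(best.values(), key=lambda t: t[0])


# Hypothetical scores: per-codon usage weights and one penalized codon pair.
usage = {"ATG": 1.0, "AAA": 0.3, "AAG": 0.9, "TTT": 0.5, "TTC": 0.8}
score, path = best_codon_path(
    "MKF",
    codon_weight=lambda c: usage[c],
    pair_weight=lambda p, c: -1.0 if (p, c) == ("AAG", "TTC") else 0.0,
)
print(path, round(score, 3))
```

Because the network is layered and acyclic, the running time grows linearly with protein length even though the number of possible codon sequences grows exponentially.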
26. UnSplit: Data-Oblivious model inversion, model stealing, and label inference attacks against split learning
- Author
-
Erdoğan, Ege, Küpçü, Alptekin, and Çiçek, A. Ercüment
- Subjects
FOS: Computer and information sciences; Computer Science - Machine Learning; Computer Science - Cryptography and Security; Label leakage; Machine learning; Split learning; Model inversion; Cryptography and Security (cs.CR); Data privacy; Model stealing; Machine Learning (cs.LG)
- Abstract
Training deep neural networks often forces users to work in a distributed or outsourced setting, which raises privacy concerns. Split learning aims to address this concern by distributing the model among a client and a server. The scheme supposedly provides privacy, since the server cannot see the clients' models and inputs. We show that this is not true via two novel attacks. (1) We show that an honest-but-curious split learning server, equipped only with the knowledge of the client neural network architecture, can recover the input samples and obtain a functionally similar model to the client model, without being detected. (2) We show that if the client keeps hidden only the output layer of the model to "protect" the private labels, the honest-but-curious server can infer the labels with perfect accuracy. We test our attacks using various benchmark datasets and against proposed privacy-enhancing extensions to split learning. Our results show that plaintext split learning can pose serious risks, ranging from data (input) privacy to intellectual property (model parameters), and provides no more than a false sense of security. Comment: Proceedings of the 21st Workshop on Privacy in the Electronic Society (WPES '22), November 7, 2022, Los Angeles, CA, USA
- Published
- 2022
27. SplitGuard: Detecting and mitigating training-hijacking attacks in split learning
- Author
-
Erdoğan, Ege, Küpçü, Alptekin, and Çiçek, A. Ercüment
- Subjects
FOS: Computer and information sciences; Computer Science - Machine Learning; Computer Science - Cryptography and Security; Machine learning; Split learning; Model inversion; Cryptography and Security (cs.CR); Data privacy; Machine Learning (cs.LG)
- Abstract
Distributed deep learning frameworks such as split learning provide great benefits with regard to the computational cost of training deep neural networks and the privacy-aware utilization of the collective data of a group of data-holders. Split learning, in particular, achieves this goal by dividing a neural network between a client and a server so that the client computes the initial set of layers, and the server computes the rest. However, this method introduces a unique attack vector for a malicious server attempting to steal the client's private data: the server can direct the client model towards learning any task of its choice, e.g., towards outputting easily invertible values. With a concrete example already proposed (Pasquini et al., CCS '21), such training-hijacking attacks present a significant risk for the data privacy of split learning clients. In this paper, we propose SplitGuard, a method by which a split learning client can detect whether it is being targeted by a training-hijacking attack. We experimentally evaluate our method's effectiveness, compare it with potential alternatives, and discuss in detail various points related to its use. We conclude that SplitGuard can effectively detect training-hijacking attacks while minimizing the amount of information recovered by the adversaries. Comment: Proceedings of the 21st Workshop on Privacy in the Electronic Society (WPES '22), November 7, 2022, Los Angeles, CA, USA
- Published
- 2022
28. DeepND: Deep multitask learning of gene risk for comorbid neurodevelopmental disorders
- Author
-
Beyreli, İlayda, Karakahya, Oğuzhan, and Çiçek, A. Ercüment
- Subjects
Graph convolution; Genome-wide association; Node classification; Autism; Intellectual disability; General Decision Sciences; Deep learning; Development/Pre-production: Data science output has been rolled out/validated across multiple domains/problems [DSML3]; Semisupervised learning; Comorbidity
- Abstract
Autism spectrum disorder and intellectual disability are comorbid neurodevelopmental disorders with complex genetic architectures. Despite large-scale sequencing studies, only a fraction of the risk genes has been identified for both. We present a network-based gene risk prioritization algorithm, DeepND, that performs cross-disorder analysis to improve prediction by exploiting the comorbidity of autism spectrum disorder (ASD) and intellectual disability (ID) via multitask learning. Our model leverages information from human brain gene co-expression networks using graph convolutional networks, learning which spatiotemporal neurodevelopmental windows are important for disorder etiologies and improving the state-of-the-art prediction in single- and cross-disorder settings. DeepND identifies the prefrontal and motor-somatosensory cortex (PFC-MFC) brain region and periods from early- to mid-fetal and from early childhood to young adulthood as the highest neurodevelopmental risk windows for ASD and ID. We investigate ASD- and ID-associated copy-number variation (CNV) regions and report our findings for several susceptibility gene candidates. DeepND can be generalized to analyze any combination of comorbid disorders. © 2022 The Author(s)
- Published
- 2022
29. DeepSide: A Deep Learning Approach for Drug Side Effect Prediction
- Author
-
Üner, Onur Can, Kuru, Halil İbrahim, Cinbiş, R. Gökberk, Taştan, Öznur, and Çiçek, A. Ercüment
- Subjects
Applied Mathematics; Drug side effect prediction; LINCS; Genetics; Deep learning; Biotechnology
- Abstract
Drug failures due to unforeseen adverse effects at clinical trials pose health risks for the participants and lead to substantial financial losses. Side effect prediction algorithms have the potential to guide the drug design process. The LINCS L1000 dataset provides a vast resource of cell line gene expression data perturbed by different drugs and creates a knowledge base for context-specific features. The state-of-the-art approach that aims at using context-specific information relies on only the high-quality experiments in LINCS L1000 and discards a large portion of the experiments. In this study, our goal is to boost the prediction performance by utilizing this data to its full extent. We experiment with 5 deep learning architectures. We find that a multi-modal architecture produces the best predictive performance among multi-layer perceptron-based architectures when drug chemical structure (CS) and the full set of drug perturbed gene expression profiles (GEX) are used as modalities. Overall, we observe that the CS is more informative than the GEX. A convolutional neural network-based model that uses only the SMILES string representation of the drugs achieves the best results and provides 13.0% macro-AUC and 3.1% micro-AUC improvements over the state-of-the-art. We also show that the model is able to predict side effect-drug pairs that are reported in the literature but were missing in the ground truth side effect dataset. DeepSide is available at http://github.com/OnurUner/DeepSide.
- Published
- 2022
30. Targeted metabolomics analyses for brain tumor margin assessment during surgery
- Author
-
Çakmakçı, Doruk, Kaynar, Gün, Bund, Caroline, Piotto, Martial, Proust, François, Namer, Izzie Jacques, and Çiçek, A. Ercüment
- Subjects
Statistics and Probability; Computational Mathematics; Magnetic Resonance Spectroscopy; Computational Theory and Mathematics; Brain Neoplasms; Humans; Metabolomics; Glioma; Molecular Biology; Biochemistry; Magnetic Resonance Imaging; Computer Science Applications
- Abstract
Motivation: Identification and removal of micro-scale residual tumor tissue during brain tumor surgery are key for survival in glioma patients. For this goal, High-Resolution Magic Angle Spinning Nuclear Magnetic Resonance (HRMAS NMR) spectroscopy-based assessment of tumor margins during surgery has been an effective method. However, the time required for metabolite quantification and the need for human experts such as a pathologist to be present during surgery are major bottlenecks of this technique. While machine learning techniques that analyze the NMR spectrum in an untargeted manner (i.e., using the full raw signal) have been shown to effectively automate this feedback mechanism, the high-dimensional and noisy structure of the NMR signal limits the attained performance. Results: In this study, we show that identifying informative regions in the HRMAS NMR spectrum and using them for tumor margin assessment improves the prediction power. We use the spectra normalized with the ERETIC (electronic reference to access in vivo concentrations) method, which uses an external reference signal to calibrate the HRMAS NMR spectrum. We train models to predict quantities of metabolites from annotated regions of this spectrum. Using these predictions for tumor margin assessment provides performance improvements of up to 4.6% in the Area Under the ROC Curve (AUC-ROC) and 2.8% in the Area Under the Precision-Recall Curve (AUC-PR). We validate the importance of various tumor biomarkers and identify a novel region between 7.97 ppm and 8.09 ppm as a new candidate for a glioma biomarker. Availability and implementation: The code is released at https://github.com/ciceklab/targeted_brain_tumor_margin_assessment. The data underlying this article are available in Zenodo, at https://doi.org/10.5281/zenodo.5781769. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2021
31. AMULET: a novel read count-based method for effective multiplet detection from single nucleus ATAC-seq data
- Author
-
Lawlor, Nathan, Conrad, Daniel N., Nehar-Belaid, Djamel, Stitzel, Michael L., Thibodeau, Asa, Marches, Radu, Gartner, Zev J., Çiçek, A. Ercüment, Kuchel, George A., Kursawe, Romy, Eroglu, Alper, Banchereau, Jacques, Ucar, Duygu, and McGinnis, Christopher S.
- Subjects
snATAC-seq; QH301-705.5; Single nucleus ATAC-seq; Read depth; Method; Transposases; ATAC-seq; Biology; QH426-470; Multiplets; Multiplexing; Genetics; Humans; Biology (General); Multiplet; Aged; Likelihood Functions; Human blood; Dynamic range; Pattern recognition; DNA; Doublets; Leukocytes, Mononuclear; Chromatin Immunoprecipitation Sequencing; Artificial intelligence; Nucleus; Software
- Abstract
Detecting multiplets in single nucleus (sn)ATAC-seq data is challenging due to data sparsity and limited dynamic range. AMULET (ATAC-seq MULtiplet Estimation Tool) enumerates regions with greater than two uniquely aligned reads across the genome to effectively detect multiplets. We evaluate the method by generating snATAC-seq data from human blood and pancreatic islet samples. AMULET has high precision, estimated via donor-based multiplexing, and high recall, estimated via simulated multiplets, compared to alternatives, and identifies multiplets most effectively when a read depth of 25K median valid reads per nucleus is achieved. Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-021-02469-x.
- Published
- 2021
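The enumeration step mentioned in the AMULET abstract above (regions covered by more than two uniquely aligned reads) can be sketched with a simple sweep over read start and end positions. This is only an illustration under stated assumptions: the function name `count_regions_over_two`, the `(chrom, start, end)` half-open interval format, and the toy reads are hypothetical, and the tool's actual statistical test for calling a multiplet is not reproduced here.

```python
from collections import defaultdict
from typing import Iterable, Tuple

Read = Tuple[str, int, int]  # (chromosome, start, end), half-open [start, end)


def count_regions_over_two(reads: Iterable[Read]) -> int:
    """Count maximal regions in one nucleus where read depth exceeds two."""
    events = defaultdict(list)
    for chrom, start, end in reads:
        events[chrom].append((start, 1))   # a read opens at its start
        events[chrom].append((end, -1))    # and closes at its end
    regions = 0
    for chrom_events in events.values():
        # Tuples sort by position first; at equal positions -1 sorts before +1,
        # so a read ending at x does not overlap a read starting at x.
        chrom_events.sort()
        depth = 0
        in_region = False
        for _, delta in chrom_events:
            depth += delta
            if depth > 2 and not in_region:
                regions += 1
                in_region = True
            elif depth <= 2:
                in_region = False
    return regions


# Toy reads for a single nucleus: three reads overlap only on chr1:[8, 10).
toy_reads = [
    ("chr1", 0, 10),
    ("chr1", 5, 15),
    ("chr1", 8, 12),
    ("chr1", 100, 110),
    ("chr2", 0, 5),
]
print(count_regions_over_two(toy_reads))
```

The intuition is that a single diploid nucleus should rarely show more than two overlapping reads at a locus, so a nucleus with many such regions is a candidate multiplet.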
32. Genomik veri paylaşan beacon sistemlerine genom yeniden inşa saldırıları
- Author
-
Ayöz, Kerem and Çiçek, Abdullah Ercüment
- Subjects
Genomic data-sharing beacons; Privacy; Genome reconstruction attack; Genomics
- Abstract
Cataloged from PDF version of article. Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2021. Includes bibliographical references (leaves 42-50). Sharing genome data in a privacy-preserving way remains a major bottleneck for the scientific progress promised by the big data era in genomics. A community-driven protocol named the genomic data-sharing beacon protocol has been widely adopted for sharing genomic data. The system aims to provide a secure, easy to implement, and standardized interface for data sharing by only allowing yes/no queries on the presence of specific alleles in the dataset. However, the beacon protocol was recently shown to be vulnerable to membership inference attacks. In this thesis, we show that privacy threats against genomic data-sharing beacons are not limited to membership inference. We identify and analyze a novel vulnerability of genomic data-sharing beacons: genome reconstruction. We show that it is possible to successfully reconstruct a substantial part of the genome of a victim when the attacker knows the victim has been added to the beacon in a recent update. In particular, we show how an attacker can use the inherent correlations in the genome and clustering techniques to run such an attack in an efficient and accurate way. We also show that even if multiple individuals are added to the beacon during the same update, it is possible to identify the victim’s genome with high confidence using traits that are easily accessible by the attacker (e.g., eye color or hair type). Moreover, we show how a genome reconstructed using a beacon that is not associated with a sensitive phenotype can be used for membership inference attacks on beacons with sensitive phenotypes (e.g., HIV+). The outcome of this work will guide beacon operators on when and how to update the content of the beacon and help them (along with the beacon participants) make informed decisions. by Kerem Ayöz M.S.
- Published
- 2021
33. Beyin tümörü sınırlarının ameliyat sırasında HRMAS NMR spektroskopisi kullanılarak makine öğrenimi destekli değerlendirilmesi
- Author
-
Çakmakçı, Doruk and Çiçek, A. Ercüment
- Subjects
HRMAS NMR; Tumor margin detection; Machine learning; Feature importance
- Abstract
Cataloged from PDF version of article. Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2021. Includes bibliographical references (leaves 21-26). Complete resection of the tumor is important for survival in glioma patients. Even if the gross total resection was achieved, left-over micro-scale tissue in the excision cavity risks recurrence. High Resolution Magic Angle Spinning Nuclear Magnetic Resonance (HRMAS NMR) technique can distinguish healthy and malign tissue efficiently using peak intensities of biomarker metabolites. The method is fast, sensitive and can work with small and unprocessed samples, which makes it a good fit for real-time analysis during surgery. However, only a targeted analysis for the existence of known tumor biomarkers can be made and this requires a technician with chemistry background, and a pathologist with knowledge on tumor metabolism to be present during surgery. Here, we show that we can accurately perform this analysis in real-time and can analyze the full spectrum in an untargeted fashion using machine learning. We work on a new and large HRMAS NMR dataset of glioma and control samples (n = 565), which are also labeled with a quantitative pathology analysis. Our results show that a random forest based approach can distinguish samples with tumor cells and controls accurately and effectively with a median AUC of 85.6% and AUPR of 93.4%. We also show that we can further distinguish benign and malignant samples with a median AUC of 87.1% and AUPR of 96.1%. We analyze the feature (peak) importance for classification to interpret the results of the classifier and validate that known malignancy biomarkers such as creatine and 2-hydroxyglutarate play an important role in distinguishing tumor and normal cells and suggest new biomarker regions. by Doruk Çakmakçı M.S.
- Published
- 2021
34. Derin öğrenme ile ekzom dizileme verilerinde gen kopya sayısı analizlerinin geliştirilmesi
- Author
-
Özden, Furkan and Çiçek, A. Ercüment
- Subjects
Copy number variation; Whole exome sequencing; Deep learning
- Abstract
Cataloged from PDF version of article. Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2021. Includes bibliographical references (leaves 30-35). Accurate and efficient detection of copy number variants (CNVs) is of critical importance due to their significant association with complex genetic diseases. Although algorithms that use whole genome sequencing (WGS) data provide stable results with mostly-valid statistical assumptions, copy number detection on whole exome sequencing (WES) data shows comparatively lower accuracy. This is unfortunate as WES data is cost efficient, compact and is relatively ubiquitous. The bottleneck is primarily due to non-contiguous nature of the targeted capture: biases in targeted genomic hybridization, GC content, targeting probes, and sample batching during sequencing. Here, we present a novel deep learning model, DECoNT, which uses the matched WES and WGS data and learns to correct the copy number variations reported by any off-the-shelf WES-based germline CNV caller. We train DECoNT on the 1000 Genomes Project data, and we show that we can efficiently triple the duplication call precision and double the deletion call precision of the state-of-the-art algorithms. We also show that our model consistently improves the performance independent from (i) sequencing technology, (ii) exome capture kit and (iii) CNV caller. Using DECoNT as a universal exome CNV call polisher has the potential to improve the reliability of germline CNV detection on WES data sets. by Furkan Özden M.S.
- Published
- 2021
35. What Does Reduced FDG Uptake Mean in High-Grade Gliomas?
- Author
-
Ruhland, Elisa, Çiçek, A. Ercüment, Namer, Izzie Jacques, Bund, Caroline, Proust, François, and Lhermitte, Benoit
- Subjects
Male; Mitochondrion; Malignancy; Creatine; Serine; Fluorodeoxyglucose F18; Glioma; Metabolomics; Humans; Medicine; Radiology, Nuclear Medicine and imaging; Avidity; neoplasms; Brain Neoplasms; Glutamate receptor; Biological Transport; General Medicine; Middle Aged; HRMAS-NMR spectroscopy; Positron-Emission Tomography; Antifolate; FDG PET; Cancer research; Female
- Abstract
Purpose: As in many other cancers, FDG uptake is correlated with the degree of malignancy in gliomas; that is, high-grade gliomas commonly show high FDG uptake. However, in clinical practice, it is not uncommon to observe high-grade gliomas with low FDG uptake. Our aim was to explore the tumor metabolism in 2 populations of high-grade gliomas presenting high or low FDG uptake. Methods: High-resolution magic-angle spinning nuclear magnetic resonance spectroscopy was performed on tissue samples from 7 high-grade glioma patients with high FDG uptake and 5 high-grade glioma patients with low FDG uptake. Tumor metabolomics was evaluated from 42 quantified metabolites and compared by network analysis. Results: Whether originating from astrocytes or oligodendrocytes, the high-grade gliomas with low FDG avidity represent a subgroup of high-grade gliomas presenting common characteristics: low aspartate, glutamate, and creatine levels, which are probably related to the impaired electron transport chain in mitochondria; high serine/glycine metabolism and thus one-carbon metabolism; low glycerophosphocholine-phosphocholine ratio in membrane metabolism, which is associated with tumor aggressiveness; and finally negative MGMT methylation status. Conclusions: It seems imperative to identify this subgroup of high-grade gliomas with low FDG avidity, which is especially aggressive. Their early identification could be important for a possible personalized treatment, such as antifolate treatment.
- Published
- 2019
36. Detailed modeling of positive selection improves detection of cancer driver genes
- Author
-
Çiçek, A. Ercüment, He, Xin, Liu, Yi, He, Chuan, Nanga, Pranav, Stephens, Matthew, Knoblauch, Nicholas W., Zhao, Siming, and Liu, Jun
- Subjects
Mutation rate; Tumor suppressor gene; Carcinogenesis; Computer science; Science; General Physics and Astronomy; Computational biology; General Biochemistry, Genetics and Molecular Biology; Neoplasms; Cancer genomics; Humans; Gene; Selection (genetic algorithm); Mutation; Multidisciplinary; Models, Genetic; MRNA modification; Statistical model; Methyltransferases; Oncogenes; General Chemistry; Computational biology and bioinformatics
- Abstract
Identifying driver genes from somatic mutations is a central problem in cancer biology. Existing methods, however, either lack explicit statistical models, or use models based on simplistic assumptions. Here, we present driverMAPS (Model-based Analysis of Positive Selection), a model-based approach to driver gene identification. This method explicitly models positive selection at the single-base level, as well as highly heterogeneous background mutational processes. In particular, the selection model captures elevated mutation rates in functionally important sites using multiple external annotations, and spatial clustering of mutations. Simulations under realistic evolutionary models demonstrate the increased power of driverMAPS over current approaches. Applying driverMAPS to TCGA data of 20 tumor types, we identified 159 new potential driver genes, including the mRNA methyltransferase METTL3-METTL14. We experimentally validated METTL3 as a tumor suppressor gene in bladder cancer, providing support to the important role mRNA modification plays in tumorigenesis. Finding driver genes sheds light on the biological mechanisms propelling the development of a tumour, and can suggest therapeutic strategies. Here, the authors develop driverMAPS, a model-based approach to identify driver genes, and apply it to TCGA datasets.
- Published
- 2019
37. ST-Steiner: a spatio-temporal gene discovery algorithm
- Author
-
Norman, Utku and Çiçek, A. Ercüment
- Subjects
Statistics and Probability; Autism Spectrum Disorder; Computer science; Gene regulatory network; Biochemistry; Gene interaction; Code (cryptography); Cluster Analysis; Humans; Gene Regulatory Networks; Molecular Biology; Gene; Genetic Association Studies; Exome sequencing; Interpretability; Genetic architecture; Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; Algorithm; Algorithms; Software
- Abstract
Motivation: Whole exome sequencing (WES) studies for autism spectrum disorder (ASD) could identify only around six dozen risk genes to date because the genetic architecture of the disorder is highly complex. To speed up the gene discovery process, a few network-based ASD gene discovery algorithms were proposed. Although these methods use static gene interaction networks, functional clustering of genes is bound to evolve during neurodevelopment and disruptions are likely to have a cascading effect on the future associations. Thus, approaches that disregard the dynamic nature of neurodevelopment are limited. Results: Here, we present a spatio-temporal gene discovery algorithm, which leverages information from evolving gene co-expression networks of neurodevelopment. The algorithm solves a prize-collecting Steiner forest-based problem on co-expression networks, adapted to model neurodevelopment and transfer information from precursor neurodevelopmental windows. The decisions made by the algorithm can be traced back, adding interpretability to the results. We apply the algorithm on ASD WES data of 3871 samples and identify risk clusters using BrainSpan co-expression networks of early- and mid-fetal periods. On an independent dataset, we show that incorporation of the temporal dimension increases the predictive power: predicted clusters are hit more and show higher enrichment in ASD-related functions compared with the state-of-the-art. Availability and implementation: The code is available at http://ciceklab.cs.bilkent.edu.tr/st-steiner. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2019
38. SPADIS: An algorithm for selecting predictive and diverse SNPs in GWAS
- Author
-
Yılmaz, Serhan, Taştan, Öznur, and Çiçek, A. Ercüment
- Subjects
Candidate gene; Computer science; Arabidopsis; Genomics; Context (language use); Single-nucleotide polymorphism; Locus (genetics); Genome-wide association study; Computational biology; Genes, Plant; Polymorphism, Single Nucleotide; Submodular set function; Hi-C; SNP-SNP networks; Genetics; Phenotype prediction; GWAS; Submodular function; Greedy algorithm; Selection (genetic algorithm); Genetic association; SNP selection; Applied Mathematics; Sequence Analysis, DNA; Heritability; Biological network; Algorithms; Biotechnology
- Abstract
Phenotypic heritability of complex traits and diseases is seldom explained by individual genetic variants identified in genome-wide association studies (GWAS). Many methods have been developed to select a subset of variant loci that are associated with or predictive of the phenotype. Selecting SNPs that are close on a biological network, such as a SNP-SNP network, has proven successful in finding biologically interpretable and predictive SNPs. However, we argue that the closeness constraint favors selecting redundant features that affect similar biological processes and therefore does not necessarily yield better predictive performance. An approach that rewards diversity of the selected SNPs and affected functional processes would boost the predictive power without compromising biological interpretability. In this paper, we propose a novel method called SPADIS that selects a set of loci such that diverse regions in the underlying SNP-SNP network are covered. Instead of enforcing selections based on closeness in the network, SPADIS favors the selection of remotely located SNPs in order to account for the complementary additive effects of SNPs that are associated with the phenotype. This is achieved by maximizing a submodular set function with a greedy algorithm that ensures a constant factor (1 - 1/e) approximation to the optimal solution. We compare SPADIS to the state-of-the-art method SConES on a dataset of Arabidopsis thaliana genotype and continuous flowering time phenotypes. On average, SPADIS has better regression performance in 12 out of 17 phenotypes, identifies more candidate genes, and runs faster. We also investigate, for the first time, the use of Hi-C data to construct the SNP-SNP network in the context of the SNP selection problem, which yields slight improvements in regression performance. SPADIS is available at http://ciceklab.cs.bilkent.edu.tr/spadis.
- Published
- 2021
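The greedy strategy named in the SPADIS abstract above follows a well-known pattern: repeatedly add the element with the largest marginal gain, which gives the (1 - 1/e) guarantee for monotone submodular functions under a cardinality constraint. The sketch below shows that generic pattern only. The coverage objective, the `cover` dictionary, and the SNP names are hypothetical stand-ins, not the objective function SPADIS actually optimizes.

```python
from typing import Callable, Dict, Iterable, List, Set


def greedy_submodular(
    candidates: Iterable[str],
    k: int,
    f: Callable[[List[str]], float],
) -> List[str]:
    """Greedily pick up to k items, each time maximizing the marginal gain of f."""
    selected: List[str] = []
    remaining = set(candidates)
    for _ in range(min(k, len(remaining))):
        base = f(selected)
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=lambda x: f(selected + [x]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical example: each SNP "covers" a set of network regions, and the
# objective is the number of distinct regions covered (monotone submodular).
cover: Dict[str, Set[int]] = {
    "snp1": {1, 2, 3},
    "snp2": {3, 4},
    "snp3": {1, 2},
    "snp4": {5, 6, 7, 8},
}


def coverage(selection: List[str]) -> float:
    return float(len(set().union(*(cover[s] for s in selection))))


print(greedy_submodular(cover, k=2, f=coverage))
```

In this toy run the second pick is `snp1` rather than a SNP overlapping already-covered regions, which mirrors the abstract's point that rewarding diversity avoids redundant selections.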
39. Polishing copy number variant calls on exome sequencing data via deep learning
- Author
-
Özden, Furkan, Alkan, Can, and Çiçek, A. Ercüment
- Subjects
Whole genome sequencing; DNA Copy Number Variations; Computer science; Deep learning; High-Throughput Nucleotide Sequencing; Reproducibility of Results; Computational biology; Germline; Exome capture; Gene duplication; Exome Sequencing; Genetics; Exome; Artificial intelligence; Copy-number variation; 1000 Genomes Project; GC-content; Genetics (clinical); Algorithms
- Abstract
Accurate and efficient detection of copy number variants (CNVs) is of critical importance due to their significant association with complex genetic diseases. Although algorithms working on whole genome sequencing (WGS) data provide stable results with mostly-valid statistical assumptions, copy number detection on whole exome sequencing (WES) data has mostly been a losing game with extremely high false discovery rates. This is unfortunate as WES data is cost efficient, compact and is relatively ubiquitous. The bottleneck is primarily due to the non-contiguous nature of the targeted capture: biases in targeted genomic hybridization, GC content, targeting probes, and sample batching during sequencing. Here, we present a novel deep learning model, DECoNT, which uses the matched WES and WGS data and learns to correct the copy number variations reported by any off-the-shelf WES-based germline CNV caller. We train DECoNT on the 1000 Genomes Project data, and we show that we can efficiently triple the duplication call precision and double the deletion call precision of the state-of-the-art algorithms. We also show that the model consistently improves the performance independent of (i) sequencing technology, (ii) exome capture kit and (iii) CNV caller. Using DECoNT as a universal exome CNV call polisher has the potential to improve the reliability of germline CNV detection on WES data sets and broaden its application. The code and the models are available at https://github.com/ciceklab/DECoNT.
- Published
- 2020
40. PAMOGK-Web: a cancer subtype classification platform using gene copy number
- Author
-
Akdemir, Furkan Mustafa and Çiçek, A. Ercüment
- Subjects
Bioinformatics; Machine learning; Software
- Abstract
Cataloged from PDF version of article. Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2020. Includes bibliographical references (leaves 27-32). Detection of molecular sub-groups of cancer is important for developing cancer therapeutics and for understanding the underlying causes of the molecular differences in these groups. The cancer sequencing projects made multi-omics data available for large cancer cohorts. Multi-omics data provides multiple views into the cancer, which can be used to find underlying causes from different perspectives and to capture relations that are not visible with a single-view approach. Previously, we developed PAMOGK, a pipeline that uses multi-omics data to detect sub-groups of patients. PAMOGK forms multiple views of the patients using pathways and multi-omics data and assesses patient similarities under these views. PAMOGK was designed as a general framework that can map many different omics data types, but it was tested only with mutation, transcriptome, and proteome data. In this work, we extend PAMOGK with copy number variation data, which yields results comparable to experiments without it. As a second contribution, we provide PAMOGK-Web, a web framework designed to make PAMOGK more accessible to general users. This web-based framework abstracts the PAMOGK pipeline and provides a simple interface to run experiments and return results to the users. PAMOGK-Web will use the generic design of PAMOGK to provide ready-to-use experiments that include setups using different omics data. by Furkan Mustafa Akdemir, M.S.
- Published
- 2020
41. Genetic circuits combined with machine learning provides fast responding living sensors
- Author
-
Saltepe, Behide, Bozkurt, Eray Ulaş, Güngen, Murat Alp, Çiçek, A. Ercüment, and Şeker, Urartu Özgür Şafak
- Subjects
Accuracy and precision; Analyte; Computer science; Circuit design; Real-time computing; Biomedical Engineering; Biophysics; 02 engineering and technology; Biosensing Techniques; 01 natural sciences; Machine learning; Synthetic biology; Electrochemistry; Gene Regulatory Networks; Electronic circuit; Artificial neural network; 010401 analytical chemistry; Response time; Whole-cell biosensors; General Medicine; 021001 nanoscience & nanotechnology; 0104 chemical sciences; Concentration dependent; Living sensors; Assessment methods; Neural Networks, Computer; 0210 nano-technology; Biosensor; Neural networks; Biotechnology
- Abstract
Whole cell biosensors (WCBs) have become prominent in many fields, from environmental analysis to biomedical diagnostics, thanks to advanced genetic circuit design principles. Despite increasing demand for cost-effective and easy-to-use assessment methods, a considerable number of WCBs retain certain drawbacks such as long response time and low precision and accuracy. Furthermore, the output signal level does not correspond to a specific analyte concentration value but shows comparative quantification. Here, we utilized a neural network-based architecture to improve the aforementioned features of WCBs and engineered a gold-sensing WCB which has a long response time (18 h). Two Long Short-Term Memory (LSTM)-based networks were integrated to assess the ON/OFF and concentration-dependent states of the sensor output, respectively. We demonstrated that the binary (ON/OFF) network was able to distinguish between ON/OFF states as early as 30 min with 78% accuracy, and with over 98% accuracy in 3 h. Furthermore, when the output was analyzed in an analog manner, we demonstrated that the network can classify the raw fluorescence data into pre-defined analyte concentration groups with high precision (82%) in 3 h. This approach can be applied to a wide range of WCBs and can improve rapidness, simplicity and accuracy, which are the main challenges in synthetic-biology-enabled biosensing.
- Published
- 2020
42. Multitask learning of gene risk for autism spectrum disorder and intellectual disability
- Author
-
Beyreli, İlayda and Çiçek, Abdullah Ercüment
- Subjects
Multitask learning; mental disorders; Intellectual disability; Deep learning; Comorbidity; Autism spectrum disorder; Graph convolutional networks
- Abstract
Cataloged from PDF version of article. Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2020. Includes bibliographical references (leaves 45-57). Autism Spectrum Disorder (ASD) and Intellectual Disability (ID) are comorbid neurodevelopmental disorders with complex genetic architectures. Despite large-scale sequencing studies, only a fraction of the risk genes have been identified for both. Here, we present a novel network-based gene risk prioritization algorithm named DeepND that performs cross-disorder analysis to improve prediction power by exploiting the comorbidity of ASD and ID via multitask learning. Our model leverages information from gene co-expression networks that model human brain development using graph convolutional neural networks and learns which spatiotemporal neurodevelopmental windows are important for disorder etiologies. We show that our approach substantially improves the state-of-the-art prediction power. We observe that both disorders are enriched in transcription regulators. Despite tight regulatory links among ASD risk genes, such links are lacking across ASD and ID risk genes and within ID risk genes. Finally, we investigate frequent ASD- and ID-associated copy number variation regions and confident false findings to suggest several novel susceptibility gene candidates. DeepND can be generalized to analyze any combination of comorbid disorders. by İlayda Beyreli, M.S.
- Published
- 2020
43. Predicting informative spatio-temporal neurodevelopmental windows and gene risk for autism spectrum disorder
- Author
-
Karakahya, Oğuzhan and Çiçek, A. Ercüment
- Subjects
mental disorders; Deep learning; Autism spectrum disorder; Graph convolutional networks
- Abstract
Cataloged from PDF version of article. Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2020. Includes bibliographical references (leaves 47-59). Autism Spectrum Disorder (ASD) is a complex neurodevelopmental disorder with a strong genetic basis. Due to its intricate nature, only a fraction of the risk genes have been identified despite the effort spent on large-scale sequencing studies. To understand the underlying mechanisms of ASD and predict new risk genes, a deep learning architecture is designed which processes the mutational burden of genes and gene co-expression networks using graph convolutional networks. In addition, a mixture-of-experts model is employed to detect specific neurodevelopmental periods that are of particular importance for the etiology of the disorder. This end-to-end trainable model produces a posterior ASD risk probability for each gene and learns the importance of each network for this prediction. The results of our approach show that the ASD gene risk prediction power is improved compared to the state-of-the-art models. We identify the mediodorsal nucleus of the thalamus and cerebellum brain region, together with the neonatal and early infancy to middle and late childhood period (0 months to 12 years), as the most informative neurodevelopmental window for prediction. Top predicted risk genes are found to be highly enriched in ASD-associated pathways and transcription factor targets. We pinpoint several new candidate risk genes in CNV regions associated with ASD. We also investigate confident false positives and false negatives of the method and point to studies which support the predictions of our method. by Oğuzhan Karakahya, M.S.
- Published
- 2020
44. Predicting the carbon spectrum in heteronuclear single quantum coherence spectroscopy for real-time feedback during surgery
- Author
-
Karakaşlar, Emin Onur and Çiçek, A. Ercüment
- Subjects
HRMAS NMR; Metabolomics; HSQC NMR
- Abstract
Cataloged from PDF version of article. Thesis (M.S.): Bilkent University, Department of Computer Engineering, İhsan Doğramacı Bilkent University, 2020. Includes bibliographical references (leaves 26-29). 1H High-Resolution Magic Angle Spinning (HRMAS) Nuclear Magnetic Resonance (NMR) is a reliable technology used for detecting metabolites in solid tissues. Its fast response time enables guiding surgeons in real time by detecting tumor cells that are left over in the excision cavity. However, severe overlap of spectral resonances in the 1D signal often renders distinguishing metabolites impossible. In that case, Heteronuclear Single Quantum Coherence Spectroscopy (HSQC) NMR is applied, which can distinguish metabolites by generating 2D spectra (1H-13C). Unfortunately, this analysis takes much longer and prohibits real-time analysis. Thus, obtaining the 2D spectrum quickly has major implications in medicine. In this study, we show that using multiple multivariate regression and statistical total correlation spectroscopy, we can learn the relation between the 1H and 13C dimensions. Learning is possible with small sample sizes. Without the need to perform the HSQC analysis, we can predict the 13C dimension by performing only the 1H HRMAS NMR experiment. We show on a rat model of central nervous system tissues (80 samples, 5 tissues) that our methods achieve 0.971 and 0.957 mean R² values, respectively. Our tests on 15 human brain tumor samples show that we can predict 104 groups of 39 metabolites with 97% accuracy. Finally, we show that we can predict the presence of a drug-resistant tumor biomarker (creatine) despite an obstructed signal in the 1H dimension. In practice, this information can provide valuable feedback to the surgeon to further resect the cavity and avoid potential recurrence. by Emin Onur Karakaşlar, M.S.
- Published
- 2020
45. Revisiting the complex architecture of ALS in Turkey: Expanding genotypes, shared phenotypes, molecular networks, and a public variant database
- Author
-
Tunca, Ceren, Seker, Tuncay, Akçimen, Fulya, Coşkun, Cemre, Bayraktar, Elif, Palvadeau, Robin, Zor, Seyit, Kocoglu, Cemile, Kartal, Ece, Sen, Nesli Ece, Hamzeiy, Hamid, Erimis, Aslihan Ozoguz, Norman, Utku, Karakahya, Oğuzhan, Olgun, Gülden, Akgün, Tahsin, Durmus, Hacer, Şahin, Erdi, Çakar, Arman, Gürsoy, Esra Baar, Yildiz, Gulsen Babacan, Isak, Baris, Uluc, Kayihan, Hanagasi, Hasmet, Bilgiç, Başar, Turgut, Nilda, Aysal, Fikret, Ertas, Mustafa, Boz, Cavit, Kotan, Dilcan, Idrisoglu, Halil, Soysal, Aysun, Adatepe, Nurten Uzun, Akalin, Mehmet Ali, Koç, Filiz, Tan, Ersin, Oflazer, Piraye, Deymeer, Feza, Tastan, Oznur, Çiçek, A. Ercüment, Kavak, Ersen, Parman, Yesim, and Basak, A. Nazli
- Subjects
Turkey; Genome-wide association study; Gene mutation; Amyotrophic lateral sclerosis; ALS; ALS variant database; Cell-cycle regulators; Databases, Genetic; Motor neuron disease; Coexpression network; Coexpression network analysis; Genetics; Genetics (clinical); Exome sequencing; Clinical exome sequencing; Next generation sequencing; Whole Genome Sequencing; Risk; 0303 health sciences; Project MinE; 030305 genetics & heredity; Spinal muscular atrophy; Penetrance; 3. Good health; Phenotype; Turkish peninsula; Genotype; Population; Locus (genetics); Biology; 03 medical and health sciences; Sequence variation; Analyses Identify; Form; Humans; 030304 developmental biology; Genetic association; Internet
- Abstract
ORCID: Olgun, Gulden/0000-0002-4467-1610; Sahin, Erdi/0000-0002-5792-2888; Tastan, Oznur/0000-0001-7058-5372; Akcimen, Fulya/0000-0003-0931-5247; Kartal, Ece/0000-0002-7720-455X. WOS: 000542467300001. PubMed ID: 32579787. The last decade has proven that amyotrophic lateral sclerosis (ALS) is clinically and genetically heterogeneous, and that the genetic component in sporadic cases might be stronger than expected. This study investigates 1,200 patients to revisit ALS in the ethnically heterogeneous yet inbred Turkish population. Familial ALS (fALS) accounts for 20% of our cases. The rates of consanguinity are 30% in fALS and 23% in sporadic ALS (sALS). Major ALS genes explained the disease cause in only 35% of fALS, as compared with ~70% in Europe and North America. Whole exome sequencing resulted in a discovery rate of 42% (53/127). Whole genome analyses in 623 sALS cases and 142 population controls, sequenced within Project MinE, revealed well-established fALS gene variants, solidifying the concept of incomplete penetrance in ALS. Genome-wide association studies (GWAS) with whole genome sequencing data did not indicate a new risk locus. Coupling GWAS with a coexpression network of disease-associated candidates points to a significant enrichment for cell cycle- and division-related genes. Within this network, literature text-mining highlights DECR1, ATL1, HDAC2, GEMIN4, and HNRNPA3 as important genes. Finally, information on ALS-related gene variants in the Turkish cohort sequenced within Project MinE was compiled in the GeNDAL variant browser (www.gendal.org). Funding: TUBITAK, Grant/Award Number: 109S075; Bogazici University Research Funds, Grant/Award Number: 15B01P1; Suna and Inan Kirac Foundation, Grant/Award Number: 2005-2020.
- Published
- 2020
46. Robust Inference of Kinase Activity Using Functional Networks
- Author
-
Yılmaz, Serhan, Ayati, Marzieh, Çiçek, A. Ercüment, Schlatzer, Daniela, Koyutürk, Mehmet, and Chance, Mark R.
- Subjects
Proteomics; 0301 basic medicine; Cellular signalling networks; Computer science; Science; General Physics and Astronomy; Inference; Computational biology; Mass Spectrometry; Article; General Biochemistry, Genetics and Molecular Biology; Targeted therapy; Functional networks; 03 medical and health sciences; 0302 clinical medicine; Alzheimer Disease; Neoplasms; Humans; Computational models; Gene Regulatory Networks; Kinase activity; Phosphorylation; skin and connective tissue diseases; Data mining; Computational model; Multidisciplinary; Kinase; Systems Biology; Phosphotransferases; Reproducibility of Results; Parkinson Disease; General Chemistry; Phosphoproteins; Identification (information); 030104 developmental biology; A kinase; sense organs; Algorithms; Metabolic Networks and Pathways; 030217 neurology & neurosurgery; Software; Signal Transduction
- Abstract
Mass spectrometry enables high-throughput screening of phosphoproteins across a broad range of biological contexts. When complemented by computational algorithms, phospho-proteomic data allows the inference of kinase activity, facilitating the identification of dysregulated kinases in various diseases including cancer, Alzheimer’s disease and Parkinson’s disease. To enhance the reliability of kinase activity inference, we present a network-based framework, RoKAI, that integrates various sources of functional information to capture coordinated changes in signaling. Through computational experiments, we show that phosphorylation of sites in the functional neighborhood of a kinase is significantly predictive of its activity. The incorporation of this knowledge in RoKAI consistently enhances the accuracy of kinase activity inference methods while making them more robust to missing annotations and quantifications. This enables the identification of understudied kinases and will likely lead to the development of novel kinase inhibitors for targeted therapy of many diseases. RoKAI is available as a web-based tool at http://rokai.io. Kinases drive fundamental changes in cell state, but predicting kinase activity based on substrate-level changes can be challenging. Here the authors introduce a computational framework that utilizes similarities between substrates to robustly infer kinase activity.
- Published
- 2020
47. The Effect of Kinship in Re-identification Attacks Against Genomic Data Sharing Beacons
- Author
-
Ayoz, Kerem, Ayşen, Miray, Ayday, Erman, and Çiçek, A. Ercüment
- Subjects
Statistics and Probability; animal structures; Computer science; Interface (computing); 0206 medical engineering; Big data; Inference; 02 engineering and technology; Computer security; Biochemistry; 03 medical and health sciences; fluids and secretions; parasitic diseases; Kinship; Humans; Family; International HapMap Project; Molecular Biology; Protocol (object-oriented programming); 030304 developmental biology; 0303 health sciences; Data; Information Dissemination; Grandparent; Genomics; Computer Science Applications; Beacon; Computational Mathematics; Phenotype; Computational Theory and Mathematics; 020602 bioinformatics
- Abstract
Motivation: The big data era in genomics promises a breakthrough in medicine, but the need to share data in a private manner limits the pace of the field. The widely accepted ‘genomic data sharing beacon’ protocol provides a standardized and secure interface for querying genomic datasets. The data are only shared if the desired information (e.g. a certain variant) exists in the dataset. Various studies showed that beacons are vulnerable to re-identification (or membership inference) attacks. As beacons are generally associated with sensitive phenotype information, re-identification creates a significant risk for the participants. Unfortunately, proposed countermeasures against such attacks have failed to be effective, as they do not consider the utility of the beacon protocol. Results: In this study, for the first time, we analyze the mitigation effect of the kinship relationships among beacon participants against re-identification attacks. We argue that having multiple family members in a beacon can garble the information for attacks, since a substantial number of variants are shared among kin-related people. Using family genomes from HapMap and synthetically generated datasets, we show that having one of the parents of a victim in the beacon causes (i) a significant decrease in the power of attacks and (ii) a substantial increase in the number of queries needed to confirm an individual’s beacon membership. We also show how the protection effect attenuates when more distant relatives, such as grandparents, are included alongside the victim. Furthermore, we quantify the utility loss due to adding relatives and show that it is smaller than that of flipping-based techniques.
- Published
- 2020
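Note on entry 47: the abstract's central argument (relatives in a beacon "garble" an attacker's evidence because kin share variants) can be illustrated with a toy sketch. This is only an illustration under stated assumptions: the variant identifiers, the genomes, and the helper names `beacon_query` and `yes_count` are hypothetical, and the code does not reproduce the attack or the evaluation from the paper.

```python
def beacon_query(beacon, variant):
    """Answer 'yes' (True) if any genome in the beacon carries the variant."""
    return any(variant in genome for genome in beacon.values())


def yes_count(beacon, queried_variants):
    """Count how many of the queried variants the beacon reports as present."""
    return sum(beacon_query(beacon, v) for v in queried_variants)


# Hypothetical toy genomes, each represented as a set of variant identifiers.
victim = {"v1", "v2", "v3", "v4"}
parent = {"v1", "v2", "v9"}   # assumed to share half of the victim's variants
unrelated = {"v7", "v8"}

beacon_without_kin = {"p1": unrelated}
beacon_with_parent = {"p1": unrelated, "p2": parent}
beacon_with_victim = {"p1": unrelated, "p3": victim}

# An attacker queries the victim's variants and counts "yes" answers.
answers = {
    "without kin": yes_count(beacon_without_kin, victim),
    "with parent": yes_count(beacon_with_parent, victim),
    "with victim": yes_count(beacon_with_victim, victim),
}
```

In this toy setting a parent alone already produces "yes" answers for the victim's variants, so a "yes" is weaker evidence of the victim's own membership than it would be in a beacon with no relatives.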
48. Matrix Metalloproteinase-11 Promotes Early Mouse Mammary Gland Tumor Growth through Metabolic Reprogramming and Increased IGF1/AKT/FoxO1 Signaling Pathway, Enhanced ER Stress and Alteration in Mitochondrial UPR
- Author
-
M. P. Chenard, Hassiba Outilaft, Catherine Tomasetto, Fabien Alpy, Bing Tan, Izzie Jacques Namer, Corinne Wendling, A. Ercüment Çiçek, Nassim Dali-Youcef, Amélie Jaulin, and Caroline Bund
Affiliations: Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC); Hôpital de Hautepierre [Strasbourg]; Laboratoire des sciences de l'ingénieur, de l'informatique et de l'imagerie (ICube); Institut de Cancérologie de Strasbourg Europe (ICANS); Bilkent University [Ankara]; Carnegie Mellon University [Pittsburgh] (CMU); Les Hôpitaux Universitaires de Strasbourg (HUS); Nouvel Hôpital Civil de Strasbourg
- Subjects
0301 basic medicine; Cancer Research; [SDV]Life Sciences [q-bio]; FOXO1; [SDV.CAN]Life Sciences [q-bio]/Cancer; lcsh:RC254-282; Article; 03 medical and health sciences; Metabolic flexibility; 0302 clinical medicine; Breast cancer; Mammary tumor virus; UPRmt; Mitochondrial unfolded protein response; [SDV.BBM.GTP]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Genomics [q-bio.GN]; Metabolomics; [SDV.BBM]Life Sciences [q-bio]/Biochemistry, Molecular Biology; UPRER; Protein kinase B; Chemistry; [SDV.BA]Life Sciences [q-bio]/Animal biology; Proteolytic enzymes; lcsh:Neoplasms. Tumors. Oncology. Including cancer and carcinogens; Cell biology; 030104 developmental biology; Oncology; 030220 oncology & carcinogenesis; Unfolded protein response; Warburg effect; Signal transduction; Carcinogenesis
- Abstract
Matrix metalloproteinase 11 (MMP11) is an extracellular proteolytic enzyme belonging to the matrix metalloproteinase (MMP) family. These proteases are involved in extracellular matrix (ECM) remodeling and activation of latent factors. MMP11 is a negative regulator of adipose tissue development and controls energy metabolism in vivo. In cancer, MMP11 expression is associated with poorer survival, and preclinical studies in mice showed that MMP11 accelerates tumor growth. How the metabolic role of MMP11 contributes to cancer development is poorly understood. To address this issue, we developed a series of preclinical mouse mammary gland tumor models by genetic engineering. Tumor growth was studied in mice either deficient in MMP11 (Loss of Function, LOF) or overexpressing MMP11 (Gain of Function, GOF), crossed with a transgenic model of breast cancer induced by the polyoma middle T antigen (PyMT) driven by the murine mammary tumor virus promoter (MMTV) (MMTV-PyMT). Both GOF and LOF models support roles for MMP11 in favoring early tumor growth by increasing proliferation and reducing apoptosis. Of interest, MMP11 promotes Insulin-like Growth Factor-1 (IGF1)/protein kinase B (AKT)/Forkhead box protein O1 (FoxO1) signaling and is associated with a metabolic switch in the tumor, activation of the endoplasmic reticulum stress response, and an alteration in the mitochondrial unfolded protein response with decreased proteasome activity. In addition, high-resolution magic angle spinning (HRMAS) metabolomics analysis of tumors from both models established a metabolic signature that favors tumorigenesis when MMP11 is overexpressed. These data support the idea that MMP11 contributes to an adaptive metabolic response, named metabolic flexibility, promoting cancer growth.
- Published
- 2020
49. Apollo: A Sequencing-Technology-Independent, Scalable And Accurate Assembly Polishing Algorithm
- Author
-
Senol Cali, Damla, Çiçek, A. Ercüment, Kim, Jeremie S., Alkan, Can, Firtina, Can, Mutlu, Onur, and Alser, Mohammed
- Subjects
FOS: Computer and information sciences; Statistics and Probability; Technology; Computer Science - Machine Learning; Source code; Computer science; Base pair; Polishing; Biochemistry; Genome; Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE); 03 medical and health sciences; 0302 clinical medicine; Quantitative Biology - Genomics; Computer Science - Computational Engineering, Finance, and Science; Molecular Biology; 030304 developmental biology; Genomics (q-bio.GN); 0303 health sciences; High-Throughput Nucleotide Sequencing; Sequence Analysis, DNA; Construct (python library); Computer Science Applications; Computational Mathematics; Computational Theory and Mathematics; FOS: Biological sciences; Scalability; Poland; Algorithm; Algorithms; Software; 030217 neurology & neurosurgery
- Abstract
Long reads produced by third-generation sequencing technologies are used to construct an assembly (i.e., the subject's genome), which is further used in downstream genome analysis. Unfortunately, long reads have high sequencing error rates, and a large proportion of base pairs in these long reads are incorrectly identified. These errors propagate to the assembly and affect the accuracy of genome analysis. Assembly polishing algorithms minimize such error propagation by polishing or fixing errors in the assembly using information from alignments between reads and the assembly (i.e., read-to-assembly alignment information). However, existing assembly polishing algorithms can only polish an assembly using reads either from a certain sequencing technology or from a small assembly. Such technology dependency and assembly-size dependency require researchers to 1) run multiple polishing algorithms and 2) use small chunks of a large genome in order to use all available read sets and polish large genomes. We introduce Apollo, a universal assembly polishing algorithm that scales well to polish an assembly of any size (i.e., both large and small genomes) using reads from all sequencing technologies (i.e., second- and third-generation). Our goal is to provide a single algorithm that uses read sets from all available sequencing technologies to improve the accuracy of assembly polishing and that can polish large genomes. Apollo 1) models an assembly as a profile hidden Markov model (pHMM), 2) uses read-to-assembly alignment to train the pHMM with the Forward-Backward algorithm, and 3) decodes the trained model with the Viterbi algorithm to produce a polished assembly. Our experiments with real read sets demonstrate that Apollo is the only algorithm that 1) uses reads from any sequencing technology within a single run and 2) scales well to polish large assemblies without splitting the assembly into multiple parts. Comment: 9 pages, 1 figure. Accepted in Bioinformatics.
- Published
- 2020
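Note on entry 49: the abstract names Viterbi decoding of a trained hidden Markov model as the final step. The sketch below shows generic Viterbi decoding on a small two-state HMM, to make that step concrete. It is not Apollo's implementation and it is not a profile HMM; the states `S1`/`S2`, the symbols `x`/`y`/`z`, and all probabilities are made-up assumptions.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state path for the observations."""
    # best[t][s] = (log-probability of the best path ending in s at time t,
    #               state that preceded s on that path)
    first = observations[0]
    best = [{s: (math.log(start_p[s]) + math.log(emit_p[s][first]), None)
             for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            # Best predecessor of s, scored in log space to avoid underflow.
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            column[s] = (score + math.log(emit_p[s][obs]), prev)
        best.append(column)
    # Trace back from the best final state.
    state = max(states, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return path[::-1]


# Hypothetical toy model (all numbers are illustrative assumptions).
states = ["S1", "S2"]
start_p = {"S1": 0.6, "S2": 0.4}
trans_p = {"S1": {"S1": 0.7, "S2": 0.3}, "S2": {"S1": 0.4, "S2": 0.6}}
emit_p = {"S1": {"x": 0.5, "y": 0.4, "z": 0.1},
          "S2": {"x": 0.1, "y": 0.3, "z": 0.6}}

decoded = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
```

For this toy model the decoded path is `["S1", "S1", "S2"]`. In a polishing setting the hidden states would correspond to positions and edit operations of a profile HMM rather than two abstract labels.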
50. Potpourri: an epistasis test prioritization algorithm via diverse SNP selection
- Author
-
Çaylak, Gizem, Çiçek, A. Ercüment, and Schwartz, R.
- Subjects
0303 health sciences; Computer science; Genetic traits; 0206 medical engineering; Genome-wide association study; 02 engineering and technology; Computational biology; 03 medical and health sciences; Test prioritization; SNP; Epistasis; Gene; 020602 bioinformatics; Selection (genetic algorithm); 030304 developmental biology; Genetic association
- Abstract
Date of Conference: 10-13 May 2020. Conference Name: 24th Annual Conference on Research in Computational Molecular Biology, RECOMB 2020. Genome-wide association studies explain a fraction of the underlying heritability of genetic diseases. Investigating epistatic interactions between two or more loci helps close this gap. Unfortunately, the sheer number of loci combinations to process and hypotheses to test makes the process prohibitive both computationally and statistically. Epistasis test prioritization algorithms rank likely-epistatic SNP pairs to limit the number of tests. Yet, they still suffer from very low precision. It was shown in the literature that selecting SNPs that are individually correlated with the phenotype and also diverse with respect to genomic location leads to better phenotype prediction due to genetic complementation. Here, we hypothesize that an algorithm that pairs SNPs from such diverse regions and carefully ranks the pairs can detect statistically more meaningful pairs and can improve prediction power. We propose an epistasis test prioritization algorithm which optimizes a submodular set function to select a diverse and complementary set of genomic regions that span the underlying genome. SNP pairs from these regions are then further ranked w.r.t. their co-coverage of the case cohort. We compare our algorithm with the state-of-the-art on three GWAS and show that we (i) substantially improve precision (from 0.003 to 0.652) while maintaining the significance of selected pairs, (ii) decrease the number of tests by 25-fold, and (iii) decrease the runtime by 4-fold. We also show that promoting SNPs from regulatory/coding regions improves the precision (up to 0.8).
- Published
- 2020
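Note on entry 50: the abstract says the method optimizes a submodular set function to select a diverse, complementary set of regions, but it does not state the objective or the optimizer. A common way to maximize a monotone submodular function under a size limit is the greedy algorithm, sketched below for a simple coverage objective. Treat this as an assumption-laden illustration: the function name `greedy_max_coverage`, the region names, and the sample identifiers are hypothetical, and the objective is plain set coverage, not Potpourri's.

```python
def greedy_max_coverage(candidates, k):
    """Greedily pick up to k sets, maximizing the number of covered elements.

    candidates maps a name to the set of elements that candidate covers.
    Coverage is a monotone submodular objective, so each step adds the
    candidate with the largest marginal gain.
    """
    chosen, covered = [], set()
    remaining = dict(candidates)
    for _ in range(k):
        # Candidate whose uncovered elements are most numerous.
        name = max(remaining, key=lambda n: len(remaining[n] - covered),
                   default=None)
        if name is None or not (remaining[name] - covered):
            break  # nothing left that adds new coverage
        chosen.append(name)
        covered |= remaining.pop(name)
    return chosen, covered


# Hypothetical regions, each "covering" a set of case-sample identifiers.
regions = {
    "r1": {1, 2, 3},
    "r2": {3, 4},
    "r3": {4, 5, 6, 7},
    "r4": {1, 2},
}

chosen, covered = greedy_max_coverage(regions, k=2)
```

Here the greedy pass picks `r3` first (four new samples) and then `r1` (three new samples), which illustrates why complementary candidates are preferred over ones that overlap with what is already selected.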