Author: "Markus Heinonen" - Searchworks@Jio Institute Digital Library Search Results

Your search for "Markus Heinonen" returned 67 results.

1. Human-in-the-loop assisted de novo molecular design

Author: Iiris Sundin, Alexey Voronov, Haoping Xiao, Kostas Papadopoulos, Esben Jannik Bjerrum, Markus Heinonen, Atanas Patronov, Samuel Kaski, and Ola Engkvist
Subjects: Interactive algorithms, De novo molecular design, Human-in-the-loop, AI-assisted design, Goal-oriented molecule generation, Expert knowledge elicitation, Information technology, T58.5-58.64, Chemistry, QD1-999
Abstract: A de novo molecular design workflow can be used together with technologies such as reinforcement learning to navigate the chemical space. A bottleneck in the workflow that remains to be solved is how to integrate human feedback in the exploration of the chemical space to optimize molecules. A human drug designer still needs to design the goal, expressed as a scoring function for the molecules that captures the designer’s implicit knowledge about the optimization task. Little support for this task exists and, consequently, a chemist usually resorts to iteratively building the objective function of multi-parameter optimization (MPO) in de novo design. We propose a principled approach to use human-in-the-loop machine learning to help the chemist to adapt the MPO scoring function to better match their goal. An advantage is that the method can learn the scoring function directly from the user’s feedback while they browse the output of the molecule generator, instead of the current manual tuning of the scoring function with trial and error. The proposed method uses a probabilistic model that captures the user’s idea and uncertainty about the scoring function, and it uses active learning to interact with the user. We present two case studies for this: In the first use-case, the parameters of an MPO are learned, and in the second use-case a non-parametric component of the scoring function to capture human domain knowledge is developed. The results show the effectiveness of the methods in two simulated example cases with an oracle, achieving significant improvement in less than 200 feedback queries, for the goals of a high QED score and identifying potent molecules for the DRD2 receptor, respectively. We further demonstrate the performance gains with a medicinal chemist interacting with the system.
Published: 2022

2. S176: FUNCTIONAL AND ANTIGEN-SPECIFIC CHARACTERIZATION OF IMMUNE CELLS AT THE SINGLE-CELL LEVEL REVEALS CONVERGENCE OF ADAPTIVE AND INNATE IMMUNITY IN IMMUNE APLASTIC ANEMIA

Author: Jani Huuhtanen, Sofie Lundgren, Mikko Keränen, Xingim Feng, Alina Dulau-Florea, Bhavisha Patel, Yoshitaka Zaimoku, Cassandra Kerr, Emmi Jokinen, Markus Heinonen, Hanna Rajala, Sanna Siitonen, Freja Ebeling, Georgina Ryland, Lucy Fox, Piers Blombery, Eva Hellström-Lindberg, Jaroslaw P. Maciejewski, Neal S. Young, Harri Lähdesmäki, and Satu Mustjoki
Subjects: Diseases of the blood and blood-forming organs, RC633-647.5
Published: 2023

3. Modeling binding specificities of transcription factor pairs with random forests

Author: Anni A. Antikainen, Markus Heinonen, and Harri Lähdesmäki
Subjects: Transcription factor pair, Random forest, DNA binding site, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Transcription factors (TFs) bind regulatory DNA regions with sequence specificity, form complexes and regulate gene expression. In cooperative TF-TF binding, two transcription factors bind onto a shared DNA binding site as a pair. Previous work has demonstrated pairwise TF-TF-DNA interactions with position weight matrices (PWMs), which may however not sufficiently take into account the complexity and flexibility of pairwise binding. Results: We propose two random forest (RF) methods for joint TF-TF binding site prediction: ComBind and JointRF. We train models with previously published large-scale CAP-SELEX DNA libraries, which comprise DNA sequences enriched for binding of a selected TF pair. JointRF builds a random forest with sub-sequences selected from CAP-SELEX DNA reads with previously proposed pairwise PWM. JointRF outperforms (area under receiver operating characteristics curve, AUROC, 0.75) the current state-of-the-art method, i.e. orientation and spacing specific pairwise PWMs (AUROC 0.59). Thus, JointRF may be utilized to improve prediction accuracy for pre-determined binding preferences. However, pairwise TF binding is currently considered flexible; a pair may bind DNA with different orientations and amounts of dinucleotide gaps or overlap between the two motifs. Thus, we developed ComBind, which utilizes random forests by considering simultaneously multiple orientations and spacings of the two factors. Our approach outperforms (AUROC 0.78) PWMs, as well as JointRF (p
Published: 2022

4. Predicting recognition between T cell receptors and epitopes with TCRGP.

Author: Emmi Jokinen, Jani Huuhtanen, Satu Mustjoki, Markus Heinonen, and Harri Lähdesmäki
Subjects: Biology (General), QH301-705.5
Abstract: Adaptive immune system uses T cell receptors (TCRs) to recognize pathogens and to consequently initiate immune responses. TCRs can be sequenced from individuals and methods analyzing the specificity of the TCRs can help us better understand individuals' immune status in different disorders. For this task, we have developed TCRGP, a novel Gaussian process method that predicts if TCRs recognize specified epitopes. TCRGP can utilize the amino acid sequences of the complementarity determining regions (CDRs) from TCRα and TCRβ chains and learn which CDRs are important in recognizing different epitopes. Our comprehensive evaluation with epitope-specific TCR sequencing data shows that TCRGP achieves on average higher prediction accuracy in terms of AUROC score than existing state-of-the-art methods in epitope-specificity predictions. We also propose a novel analysis approach for combined single-cell RNA and TCRαβ (scRNA+TCRαβ) sequencing data by quantifying epitope-specific TCRs with TCRGP and identify HBV-epitope specific T cells and their transcriptomic states in hepatocellular carcinoma patients.
Published: 2021

5. Temporal clustering analysis of endothelial cell gene expression following exposure to a conventional radiotherapy dose fraction using Gaussian process clustering.

Author: Markus Heinonen, Fabien Milliat, Mohamed Amine Benadjaoud, Agnès François, Valérie Buard, Georges Tarlet, Florence d'Alché-Buc, and Olivier Guipaud
Subjects: Medicine, Science
Abstract: The vascular endothelium is considered as a key cell compartment for the response to ionizing radiation of normal tissues and tumors, and as a promising target to improve the differential effect of radiotherapy in the future. Following radiation exposure, the global endothelial cell response covers a wide range of gene, miRNA, protein and metabolite expression modifications. Changes occur at the transcriptional, translational and post-translational levels and impact cell phenotype as well as the microenvironment by the production and secretion of soluble factors such as reactive oxygen species, chemokines, cytokines and growth factors. These radiation-induced dynamic modifications of molecular networks may control the endothelial cell phenotype and govern recruitment of immune cells, stressing the importance of clearly understanding the mechanisms which underlie these temporal processes. A wide variety of time series data is commonly used in bioinformatics studies, including gene expression, protein concentrations and metabolomics data. The use of clustering of these data is still an unclear problem. Here, we introduce kernels between Gaussian processes modeling time series, and subsequently introduce a spectral clustering algorithm. We apply the methods to the study of human primary endothelial cells (HUVECs) exposed to a radiotherapy dose fraction (2 Gy). Time windows of differential expressions of 301 genes involved in key cellular processes such as angiogenesis, inflammation, apoptosis, immune response and protein kinase were determined from 12 hours to 3 weeks post-irradiation. Then, 43 temporal clusters corresponding to profiles of similar expressions, including 49 genes out of 301 initially measured, were generated according to the proposed method. Forty-seven transcription factors (TFs) responsible for the expression of clusters of genes were predicted from sequence regulatory elements using the MotifMap system. Their temporal profiles of occurrences were established and clustered. Dynamic network interactions and molecular pathways of TFs and differential genes were finally explored, revealing key node genes and putative important cellular processes involved in tissue infiltration by immune cells following exposure to a radiotherapy dose fraction.
Published: 2018

6. Metabolite Identification through Machine Learning— Tackling CASMI Challenge Using FingerID

Author: Huibin Shen, Nicola Zamboni, Markus Heinonen, and Juho Rousu
Subjects: metabolite identification, molecular fingerprints, machine learning, FingerID, Microbiology, QR1-502
Abstract: Metabolite identification is a major bottleneck in metabolomics due to the number and diversity of the molecules. To alleviate this bottleneck, computational methods and tools that reliably filter the set of candidates are needed for further analysis by human experts. Recent efforts in assembling large public mass spectral databases such as MassBank have opened the door for developing a new genre of metabolite identification methods that rely on machine learning as the primary vehicle for identification. In this paper we describe the machine learning approach used in FingerID, its application to the CASMI challenges and some results that were not part of our challenge submission. In short, FingerID learns to predict molecular fingerprints from a large collection of MS/MS spectra, and uses the predicted fingerprints to retrieve and rank candidate molecules from a given large molecular database. Furthermore, we introduce a web server for FingerID, which was applied for the first time to the CASMI challenges. The challenge results show that the new machine learning framework produces competitive results on those challenge molecules that were found within the relatively restricted KEGG compound database. Additional experiments on the PubChem database confirm the feasibility of the approach even on a much larger database, although room for improvement still remains.
Published: 2013

7. Towards Interpretable Models of Chemist Preferences for Human-in-the-Loop Assisted Drug Discovery.

Author: Yasmine Nahal, Markus Heinonen, Mikhail Kabeshov, Jon Paul Janet, Eva Nittinger, Ola Engkvist, and Samuel Kaski
Published: 2024

8. Balancing Imbalanced Toxicity Models: Using MolBERT with Focal Loss.

Author: Muhammad Arslan Masood, Samuel Kaski, Hugo Ceulemans, Dorota Herman, and Markus Heinonen
Published: 2024

9. ClimODE: Climate and Weather Forecasting with Physics-informed Neural ODEs.

Author: Yogesh Verma, Markus Heinonen, and Vikas Garg
Published: 2024

10. Input-gradient space particle inference for neural network ensembles.

Author: Trung Q. Trinh, Markus Heinonen, Luigi Acerbi, and Samuel Kaski
Published: 2024

11. Uncertainty-Aware Natural Language Inference with Stochastic Weight Averaging.

Author: Aarne Talman, Hande Çelikkanat, Sami Virpioja, Markus Heinonen, and Jörg Tiedemann
Published: 2023

12. Incorporating functional summary information in Bayesian neural networks using a Dirichlet process likelihood approach.

Author: Vishnu Raj, Tianyu Cui, Markus Heinonen, and Pekka Marttinen
Published: 2023

13. AbODE: Ab initio antibody design using conjoined ODEs.

Author: Yogesh Verma, Markus Heinonen, and Vikas Garg
Published: 2023

14. Variational multiple shooting for Bayesian ODEs with Gaussian processes.

Author: Pashupati Hegde, Çagatay Yildiz, Harri Lähdesmäki, Samuel Kaski, and Markus Heinonen
Published: 2022

15. Tackling covariate shift with node-based Bayesian neural networks.

Author: Trung Q. Trinh, Markus Heinonen, Luigi Acerbi, and Samuel Kaski
Published: 2022

16. Latent Neural ODEs with Sparse Bayesian Multiple Shooting.

Author: Valerii Iakovlev, Çagatay Yildiz, Markus Heinonen, and Harri Lähdesmäki
Published: 2023

17. Generative Modelling with Inverse Heat Dissipation.

Author: Severi Rissanen, Markus Heinonen, and Arno Solin
Published: 2023

18. Learning Space-Time Continuous Latent Neural PDEs from Partially Observed States.

Author: Valerii Iakovlev, Markus Heinonen, and Harri Lähdesmäki
Published: 2023

19. Continuous-Time Functional Diffusion Processes.

Author: Giulio Franzese, Giulio Corallo, Simone Rossi, Markus Heinonen, Maurizio Filippone, and Pietro Michiardi
Published: 2023

20. De-randomizing MCMC dynamics with the diffusion Stein operator.

Author: Zheyang Shen, Markus Heinonen, and Samuel Kaski
Published: 2021

21. Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations.

Author: Simone Rossi, Markus Heinonen, Edwin V. Bonilla, Zheyang Shen, and Maurizio Filippone
Published: 2021

22. Bayesian Inference for Optimal Transport with Stochastic Cost.

Author: Anton Mallasto, Markus Heinonen, and Samuel Kaski
Published: 2021

23. Continuous-time Model-based Reinforcement Learning.

Author: Çagatay Yildiz, Markus Heinonen, and Harri Lähdesmäki
Published: 2021

24. Learning spectrograms with convolutional spectral kernels.

Author: Zheyang Shen, Markus Heinonen, and Samuel Kaski
Published: 2020

25. Modular Flows: Differential Molecular Generation.

Author: Yogesh Verma, Samuel Kaski, Markus Heinonen, and Vikas Garg
Published: 2022

26. ODE2VAE: Deep generative second order ODEs with Bayesian neural networks.

Author: Çagatay Yildiz, Markus Heinonen, and Harri Lähdesmäki
Published: 2019

27. Harmonizable mixture kernels with variational Fourier features.

Author: Zheyang Shen, Markus Heinonen, and Samuel Kaski
Published: 2019

28. Deep learning with differential Gaussian process flows.

Author: Pashupati Hegde, Markus Heinonen, Harri Lähdesmäki, and Samuel Kaski
Published: 2019

29. Deep Convolutional Gaussian Processes.

Author: Kenneth Blomqvist, Samuel Kaski, and Markus Heinonen
Published: 2019

30. Learning continuous-time PDEs from sparse data with graph neural networks.

Author: Valerii Iakovlev, Markus Heinonen, and Harri Lähdesmäki
Published: 2021

31. Learning stochastic differential equations with Gaussian Processes without Gradient Matching.

Author: Çagatay Yildiz, Markus Heinonen, Jukka Intosalmi, Henrik Mannerström, and Harri Lähdesmäki
Published: 2018

32. Variational zero-inflated Gaussian processes with sparse kernels.

Author: Pashupati Hegde, Markus Heinonen, and Samuel Kaski
Published: 2018

33. Learning unknown ODE models with Gaussian processes.

Author: Markus Heinonen, Çagatay Yildiz, Henrik Mannerström, Jukka Intosalmi, and Harri Lähdesmäki
Published: 2018

34. Non-Stationary Spectral Kernels.

Author: Sami Remes, Markus Heinonen, and Samuel Kaski
Published: 2017

35. A Mutually-Dependent Hadamard Kernel for Modelling Latent Variable Couplings.

Author: Sami Remes, Markus Heinonen, and Samuel Kaski
Published: 2017

36. Non-Stationary Gaussian Process Regression with Hamiltonian Monte Carlo.

Author: Markus Heinonen, Henrik Mannerström, Juho Rousu, Samuel Kaski, and Harri Lähdesmäki
Published: 2016

37. Random Fourier Features For Operator-Valued Kernels.

Author: Romain Brault, Markus Heinonen, and Florence d'Alché-Buc
Published: 2016

38. Differential Equations and Continuous-Time Deep Learning (Dagstuhl Seminar 22332)

Author: David Duvenaud, Markus Heinonen, Michael Tiemann, and Max Welling
Abstract: This report documents the program and the outcomes of Dagstuhl Seminar 22332 "Differential Equations and Continuous-Time Deep Learning". Neural ordinary-differential equations and similar continuous model architectures have gained interest in recent years, due to the existence of a vast literature in calculus and numerical analysis. Thus, continuous models might lead to architectures with finer control over prior assumptions or theoretical understanding. In this seminar, we have sought to bring together researchers from traditionally disjoint areas - machine learning, numerical analysis, dynamical systems and their "consumers" - to try and develop a joint language about this novel modeling paradigm. Through talks & group discussions, we have identified common interests and we hope that this first seminar is but the first step on a joint journey.
Published: 2023

39. TCRconv: predicting recognition between T cell receptors and epitopes using contextualized motifs

Author: Emmi Jokinen, Alexandru Dumitrescu, Jani Huuhtanen, Vladimir Gligorijević, Satu Mustjoki, Richard Bonneau, Markus Heinonen, and Harri Lähdesmäki
Affiliations: Professorship Lähdesmäki Harri, Department of Computer Science, University of Helsinki, Simons Foundation, Flatiron Institute, Probabilistic Machine Learning, Computer Science Professors, Aalto University, Institute of Biotechnology, Helsinki Institute of Life Science HiLIFE, TRIMM - Translational Immunology Research Program, HUS Comprehensive Cancer Center, Hematology Unit, Department of Clinical Chemistry and Hematology, Clinicum, and Digital Precision Cancer Medicine (iCAN)
Subjects: Statistics and Probability, Computational Mathematics, Computational Theory and Mathematics, 1182 Biochemistry, cell and molecular biology, 11832 Microbiology and virology, Molecular Biology, Biochemistry, Computer Science Applications
Abstract: Motivation: T cells use T cell receptors (TCRs) to recognize small parts of antigens, called epitopes, presented by major histocompatibility complexes. Once an epitope is recognized, an immune response is initiated and T cell activation and proliferation by clonal expansion begin. Clonal populations of T cells with identical TCRs can remain in the body for years, thus forming immunological memory and potentially mappable immunological signatures, which could have implications in clinical applications including infectious diseases, autoimmunity and tumor immunology. Results: We introduce TCRconv, a deep learning model for predicting recognition between TCRs and epitopes. TCRconv uses a deep protein language model and convolutions to extract contextualized motifs and provides state-of-the-art TCR-epitope prediction accuracy. Using TCR repertoires from COVID-19 patients, we demonstrate that TCRconv can provide insight into T cell dynamics and phenotypes during the disease. Availability and implementation: TCRconv is available at https://github.com/emmijokinen/tcrconv. Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2023

40. Efficient Path Kernels for Reaction Function Prediction.

Author: Markus Heinonen, Niko Välimäki, Veli Mäkinen, and Juho Rousu
Published: 2012

41. Prediction and impact of personalized donation intervals

Author: Femmeke J. Prinsze, Jarkko Toivonen, Markus Heinonen, Esa Turkulainen, Pietro Della Briotta Parolo, Yrjö Koski, and Mikko Arvas
Affiliations: Department of Computer Science, Institute for Molecular Medicine Finland, Genomics of Neurological and Neuropsychiatric Disorders, and Kernel Machines, Pattern Analysis and Computational Biology research group / Juho Rousu
Subjects: donor health, Blood Donors, 030204 cardiovascular system & hematology, Hemoglobins, 03 medical and health sciences, 0302 clinical medicine, Humans, Cutoff, Medicine, Deferral, Actual use, haemoglobin measurement, Hematologic Tests, blood collection, Anemia, Hematology, General Medicine, Hematologic Diseases, Biobank, 3. Good health, 3121 General medicine, internal medicine and other clinical medicine, Donation, Low haemoglobin, Cohort, 1182 Biochemistry, cell and molecular biology, Female, 030215 immunology, Demography
Abstract: Publisher Copyright: © 2021 The Authors. Vox Sanguinis published by John Wiley & Sons Ltd on behalf of International Society of Blood Transfusion. Background and Objectives: Deferral of blood donors due to low haemoglobin (Hb) is demotivating to donors, can be a sign for developing anaemia and incurs costs for blood establishments. The prediction of Hb deferral has been shown to be feasible in a number of studies based on demographic, Hb measurement and donation history data. The aim of this paper is to evaluate how state-of-the-art computational prediction tools can facilitate nationwide personalized donation intervals. Materials and Methods: Using donation history data from the last 20 years in Finland, FinDonor blood donor cohort data and blood service Biobank genotyping data, we built linear and non-linear predictors of Hb deferral. Based on financial data from the Finnish Red Cross Blood Service, we then estimated the economic impacts of deploying such predictors. Results: We discovered that while linear predictors generally predict Hb relatively well, they have difficulties in predicting low Hb values. Overall, we found that non-linear or linear predictors with or without genetic data performed only slightly better than a simple cutoff based on previous Hb. However, if any of our deferral prediction methods are used to assign temporary prolongations of donation intervals for females, then our calculations indicate cost savings while maintaining the blood supply. Conclusion: We find that even though the prediction accuracy is not very high, the actual use of any of our predictors in blood collection is still likely to bring benefits to blood donors and blood establishments alike.
Published: 2021

42. Structured Output Prediction of Anti-cancer Drug Activity.

Author: Hongyu Su, Markus Heinonen, and Juho Rousu
Published: 2010

43. Ab Initio Prediction of Molecular Fragments from Tandem Mass Spectrometry Data.

Author: Markus Heinonen, Ari Rantanen, Taneli Mielikäinen, Esa Pitkänen, Juha Kokkonen, and Juho Rousu
Published: 2006

44. Predicting recognition between T cell receptors and epitopes using contextualized motifs

Author: Emmi Jokinen, Alexandru Dumitrescu, Jani Huuhtanen, Vladimir Gligorijević, Satu Mustjoki, Richard Bonneau, Markus Heinonen, and Harri Lähdesmäki
Abstract: We introduce TCRconv, a deep learning model for predicting recognition between T-cell receptors and epitopes. TCRconv uses a deep protein language model and convolutions to extract contextualized motifs and provides state-of-the-art TCR-epitope prediction accuracy. Using TCR repertoires from COVID-19 patients, we demonstrate that TCRconv can provide insight into T-cell dynamics and phenotypes during the disease.
Published: 2022

45. Substrate specificity of 2-deoxy-D-ribose 5-phosphate aldolase (DERA) assessed by different protein engineering and machine learning methods

Author: Emmi Jokinen, Sanni Voutilainen, Markus Heinonen, Hannu Maaheimo, Anu Koivula, Martina Andberg, Juho Rousu, Juha Rouvinen, Merja Penttilä, Harri Lähdesmäki, Samuel Kaski, Johan Pääkkönen, and Nina Hakulinen
Affiliations: VTT Technical Research Centre of Finland, Centre of Excellence in Molecular Systems Immunology and Physiology Research Group, SyMMys, University of Eastern Finland, Helsinki Institute for Information Technology (HIIT), Department of Computer Science, and Aalto University
Subjects: DERA, Protein Engineering, Crystal structure determination, Machine learning, 01 natural sciences, Applied Microbiology and Biotechnology, Aldehyde, Substrate Specificity, 03 medical and health sciences, Aldol reaction, Fructose-Bisphosphate Aldolase, Escherichia coli, Aldolase, Biotechnologically Relevant Enzymes and Proteins, Aldehyde-Lyases, 030304 developmental biology, 0303 health sciences, C–C bond formation, 010405 organic chemistry, Chemistry, Aldolase A, Active site, General Medicine, 0104 chemical sciences, Amino acid, Enzyme, Biocatalysis, Artificial intelligence, Biotechnology
Abstract: In this work, deoxyribose-5-phosphate aldolase (Ec DERA, EC 4.1.2.4) from Escherichia coli was chosen as the protein engineering target for improving the substrate preference towards smaller, non-phosphorylated aldehyde donor substrates, in particular towards acetaldehyde. The initial broad set of mutations was directed to 24 amino acid positions in the active site or in the close vicinity, based on the 3D complex structure of the E. coli DERA wild-type aldolase. The specific activity of the DERA variants containing one to three amino acid mutations was characterised using three different substrates. A novel machine learning (ML) model utilising Gaussian processes and feature learning was applied for the 3rd mutagenesis round to predict new beneficial mutant combinations. This led to the most clear-cut (two- to threefold) improvement in acetaldehyde (C2) addition capability with the concomitant abolishment of the activity towards the natural donor molecule glyceraldehyde-3-phosphate (C3P) as well as the non-phosphorylated equivalent (C3). The Ec DERA variants were also tested on aldol reaction utilising formaldehyde (C1) as the donor. Ec DERA wild-type was shown to be able to carry out this reaction, and furthermore, some of the improved variants on acetaldehyde addition reaction turned out to have also improved activity on formaldehyde. Key points: • DERA aldolases are promiscuous enzymes. • Synthetic utility of DERA aldolase was improved by protein engineering approaches. • Machine learning methods aid the protein engineering of DERA.
Published: 2020

46. Evolving-Graph Gaussian Processes

Author: David Blanco Mulero, Markus Heinonen, and Ville Kyrki
Affiliations: Intelligent Robotics, Probabilistic Machine Learning, Department of Electrical Engineering and Automation, Department of Computer Science, and Aalto University
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Statistics - Other Statistics, Time series, Graph-based learning, probabilistic models, Statistics - Machine Learning, Other Statistics (stat.OT), gaussian process, Machine Learning (stat.ML), Machine Learning (cs.LG), MathematicsofComputing_DISCRETEMATHEMATICS
Abstract: Graph Gaussian Processes (GGPs) provide a data-efficient solution on graph structured domains. Existing approaches have focused on static structures, whereas many real graph data represent a dynamic structure, limiting the applications of GGPs. To overcome this we propose evolving-Graph Gaussian Processes (e-GGPs). The proposed method is capable of learning the transition function of graph vertices over time with a neighbourhood kernel to model the connectivity and interaction changes between vertices. We assess the performance of our method on time-series regression problems where graphs evolve over time. We demonstrate the benefits of e-GGPs over static graph Gaussian Process approaches. Comment: 12 pages, 5 figures. Accepted for publication at ICML 2021 Time Series Workshop (TSW).
Published: 2021

47. Deep Convolutional Gaussian Processes

Author: Kenneth Blomqvist, Markus Heinonen, and Samuel Kaski
Subjects: Current (mathematics), Contextual image classification, Computer science, Structure (category theory), Pattern recognition, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Computer Science::Computer Vision and Pattern Recognition, 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Bayesian framework, Artificial intelligence, Gaussian process, MNIST database, 0105 earth and related environmental sciences
Abstract: We propose deep convolutional Gaussian processes, a deep Gaussian process architecture with convolutional structure. The model is a principled Bayesian framework for detecting hierarchical combinations of local features for image classification. We demonstrate greatly improved image classification performance compared to current convolutional Gaussian process approaches on the MNIST and CIFAR-10 datasets. In particular, we improve state-of-the-art CIFAR-10 accuracy by over 10 percentage points.
Published: 2020

48. mGPfusion: predicting protein stability changes with Gaussian process kernel learning and data fusion

Author: Harri Lähdesmäki, Emmi Jokinen, and Markus Heinonen
Subjects: FOS: Computer and information sciences, 0301 basic medicine, Statistics and Probability, Computer science, Protein design, Stability (learning theory), Machine Learning (stat.ML), Machine learning, Quantitative Biology - Quantitative Methods, Biochemistry, 03 medical and health sciences, Protein stability, Statistics - Machine Learning, Molecular Biology, Gaussian process, Quantitative Methods (q-bio.QM), ta113, ISMB 2018–Intelligent Systems for Molecular Biology Proceedings, 030102 biochemistry & molecular biology, Computational Biology, Proteins, Biomolecules (q-bio.BM), Bayes Theorem, Macromolecular Sequence, Structure, and Function, Sensor fusion, Computer Science Applications, Computational Mathematics, 030104 developmental biology, Quantitative Biology - Biomolecules, Computational Theory and Mathematics, FOS: Biological sciences, Kernel (statistics), Mutation, Graph (abstract data type), Artificial intelligence, Algorithms, Software
Abstract: Motivation: Proteins are commonly used by biochemical industry for numerous processes. Refining these proteins’ properties via mutations causes stability effects as well. Accurate computational method to predict how mutations affect protein stability is necessary to facilitate efficient protein design. However, accuracy of predictive models is ultimately constrained by the limited availability of experimental data. Results: We have developed mGPfusion, a novel Gaussian process (GP) method for predicting protein’s stability changes upon single and multiple mutations. This method complements the limited experimental data with large amounts of molecular simulation data. We introduce a Bayesian data fusion model that re-calibrates the experimental and in silico data sources and then learns a predictive GP model from the combined data. Our protein-specific model requires experimental data only regarding the protein of interest and performs well even with few experimental measurements. The mGPfusion models proteins by contact maps and infers the stability effects caused by mutations with a mixture of graph kernels. Our results show that mGPfusion outperforms state-of-the-art methods in predicting protein stability on a dataset of 15 different proteins and that incorporating molecular simulation data improves the model learning and prediction accuracy. Availability and implementation: Software implementation and datasets are available at github.com/emmijokinen/mgpfusion. Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2018

49. Learning with multiple pairwise kernels for drug bioactivity prediction

Author: Heli Julkunen, Tapio Pahikkala, Tero Aittokallio, Anna Cichonska, Juho Rousu, Antti Airola, Sandor Szedmak, Markus Heinonen, Institute for Molecular Medicine Finland, Doctoral Programme in Integrative Life Science, Doctoral Programme in Drug Research, Tero Aittokallio / Principal Investigator, Bioinformatics, Department of Computer Science, University of Turku, Aalto University, and Aalto-yliopisto
Subjects: 0301 basic medicine, Support Vector Machine, Computer science, Inference, 02 engineering and technology, computer.software_genre, Biochemistry, DATA INTEGRATION, Neoplasms, Drug Discovery, 111 Mathematics, METABOLITE IDENTIFICATION, ALGORITHMS, Computer Science Applications, Computational Mathematics, ALIGNMENT, Treatment Outcome, Computational Theory and Mathematics, Kernel (statistics), Systems Biology and Networks, SENSITIVITY, Signal Transduction, Statistics and Probability, DATABASE, 0206 medical engineering, Antineoplastic Agents, Machine learning, ta3111, Bottleneck, 03 medical and health sciences, Kernel (linear algebra), Cell Line, Tumor, REGRESSION, Humans, Molecular Biology, Ismb 2018–Intelligent Systems for Molecular Biology Proceedings, ta113, Multiple kernel learning, business.industry, ta111, ta1182, Computational Biology, Function (mathematics), 113 Computer and information sciences, Support vector machine, 030104 developmental biology, ComputingMethodologies_PATTERNRECOGNITION, DISCOVERY, 1182 Biochemistry, cell and molecular biology, Pairwise comparison, Artificial intelligence, business, computer, Protein Kinases, 020602 bioinformatics, Software
Abstract: Motivation: Many inference problems in bioinformatics, including drug bioactivity prediction, can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g. drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and especially multiple kernel learning (MKL) offers promising benefits as it enables integrating various types of complex biomedical information sources in the form of kernels, along with learning their importance for the prediction task. However, the immense size of pairwise kernel spaces remains a major bottleneck, making the existing MKL algorithms computationally infeasible even for small number of input pairs. Results: We introduce pairwiseMKL, the first method for time- and memory-efficient learning with multiple pairwise kernels. pairwiseMKL first determines the mixture weights of the input pairwise kernels, and then learns the pairwise prediction function. Both steps are performed efficiently without explicit computation of the massive pairwise matrices, therefore making the method applicable to solving large pairwise learning problems. We demonstrate the performance of pairwiseMKL in two related tasks of quantitative drug bioactivity prediction using up to 167 995 bioactivity measurements and 3120 pairwise kernels: (i) prediction of anticancer efficacy of drug compounds across a large panel of cancer cell lines; and (ii) prediction of target profiles of anticancer compounds across their kinome-wide target spaces. We show that pairwiseMKL provides accurate predictions using sparse solutions in terms of selected kernels, and therefore it automatically identifies also data sources relevant for the prediction problem. Availability and implementation: Code is available at https://github.com/aalto-ics-kepaco. Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2018

50. Determining epitope specificity of T cell receptors with TCRGP

Author: Jani Huuhtanen, Harri Lähdesmäki, Markus Heinonen, Emmi Jokinen, and Satu Mustjoki
Subjects: 0303 health sciences, Immune status, T-cell receptor, RNA, hemic and immune systems, chemical and pharmacologic phenomena, Computational biology, Biology, Phenotype, Epitope, 3. Good health, Transcriptome, 03 medical and health sciences, 0302 clinical medicine, Immune system, 030220 oncology & carcinogenesis, Epitope specificity, 030304 developmental biology
Abstract: T cell receptors (TCRs) can recognize various pathogens and consequently start immune responses. TCRs can be sequenced from individuals and methods analyzing the specificity of the TCRs can help us better understand individuals’ immune status in different diseases. We have developed TCRGP, a novel Gaussian process method to predict if TCRs recognize certain epitopes. This method can utilize CDR sequences from TCRα and TCRβ chains and learn which CDRs are important in recognizing different epitopes. We have experimented with epitope-specific data against 29 epitopes and performed a comprehensive evaluation with existing prediction methods. On this data, TCRGP outperforms other state-of-the-art methods in epitope-specificity predictions. We also propose a novel analysis approach for combined single-cell RNA and TCRαβ (scRNA+TCRαβ) sequencing data by quantifying epitope-specific TCRs with TCRGP in phenotypes identified from scRNA-seq data. With this approach, we find HBV-epitope specific T cells and their transcriptomic states in hepatocellular carcinoma patients.
Published: 2019
