Author: "Matthias Samwald" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Matthias Samwald"' showing total 127 results

1. A comparison of chain-of-thought reasoning strategies across datasets and models

Author: Konstantin Hebenstreit, Robert Praas, Louis P. Kiesewetter, and Matthias Samwald
Subjects: Chain-of-thought reasoning, Large language models, Externalized reasoning, Zero-shot prompting, Question-answering datasets, Electronic computers. Computer science, QA75.5-76.95
Abstract: Emergent chain-of-thought (CoT) reasoning capabilities promise to improve the performance and explainability of large language models (LLMs). However, uncertainties remain about how reasoning strategies formulated for previous model generations generalize to new model generations and different datasets. In this small-scale study, we compare different reasoning strategies induced by zero-shot prompting across six recently released LLMs (davinci-002, davinci-003, GPT-3.5-turbo, GPT-4, Flan-T5-xxl and Cohere command-xlarge). We test them on six question-answering datasets that require real-world knowledge application and logical verbal reasoning, including datasets from scientific and medical domains. Our findings demonstrate that while some variations in effectiveness occur, gains from CoT reasoning strategies remain robust across different models and datasets. GPT-4 benefits the most from current state-of-the-art reasoning strategies and performs best by applying a prompt previously discovered through automated discovery.
Published: 2024
Full Text: View/download PDF

2. ThoughtSource: A central hub for large language model reasoning data

Author: Simon Ott, Konstantin Hebenstreit, Valentin Liévin, Christoffer Egeberg Hother, Milad Moradi, Maximilian Mayrhauser, Robert Praas, Ole Winther, and Matthias Samwald
Subjects: Science
Abstract: Large language models (LLMs) such as GPT-4 have recently demonstrated impressive results across a wide range of tasks. LLMs are still limited, however, in that they frequently fail at complex reasoning, their reasoning processes are opaque, they are prone to ‘hallucinate’ facts, and there are concerns about their underlying biases. Letting models verbalize reasoning steps as natural language, a technique known as chain-of-thought prompting, has recently been proposed as a way to address some of these issues. Here we present ThoughtSource, a meta-dataset and software library for chain-of-thought (CoT) reasoning. The goal of ThoughtSource is to improve future artificial intelligence systems by facilitating qualitative understanding of CoTs, enabling empirical evaluations, and providing training data. This first release of ThoughtSource integrates seven scientific/medical, three general-domain and five math word question answering datasets.
Published: 2023
Full Text: View/download PDF

3. Mapping global dynamics of benchmark creation and saturation in artificial intelligence

Author: Simon Ott, Adriano Barbosa-Silva, Kathrin Blagec, Jan Brauner, and Matthias Samwald
Subjects: Science
Abstract: Recent studies raised concerns over the state of AI benchmarking, reporting issues such as benchmark overfitting, benchmark saturation and increasing centralization of benchmark dataset creation. To facilitate monitoring of the health of the AI benchmarking ecosystem, the authors introduce methodologies for creating condensed maps of the global dynamics of benchmark.
Published: 2022
Full Text: View/download PDF

4. A curated, ontology-based, large-scale knowledge graph of artificial intelligence tasks and benchmarks

Author: Kathrin Blagec, Adriano Barbosa-Silva, Simon Ott, and Matthias Samwald
Subjects: Science
Abstract: Measurement(s): Artificial Intelligence • Benchmark. Technology Type(s): digital curation. Sample Characteristic - Location: Globally.
Published: 2022
Full Text: View/download PDF

5. The roles of predictors in cardiovascular risk models - a question of modeling culture?

Author: Christine Wallisch, Asan Agibetov, Daniela Dunkler, Maria Haller, Matthias Samwald, Georg Dorffner, and Georg Heinze
Subjects: Cardiovascular risk, Prediction model, Predictors, Non-linear effect, Partial dependence plots, Medicine (General), R5-920
Abstract: Background While machine learning (ML) algorithms may predict cardiovascular outcomes more accurately than statistical models, their result is usually not representable by a transparent formula. Hence, it is often unclear how specific values of predictors lead to the predictions. We aimed to demonstrate with graphical tools how predictor-risk relations in cardiovascular risk prediction models fitted by ML algorithms and by statistical approaches may differ, and how sample size affects the stability of the estimated relations. Methods We reanalyzed data from a large registry of 1.5 million participants in a national health screening program. Three data analysts developed analytical strategies to predict cardiovascular events within 1 year from health screening. This was done for the full data set and with gradually reduced sample sizes, and each data analyst followed their favorite modeling approach. Predictor-risk relations were visualized by partial dependence and individual conditional expectation plots. Results When comparing the modeling algorithms, we found some similarities between these visualizations but also occasional divergence. The smaller the sample size, the more the predictor-risk relation depended on the modeling algorithm used, and also sampling variability played an increased role. Predictive performance was similar if the models were derived on the full data set, whereas smaller sample sizes favored simpler models. Conclusion Predictor-risk relations from ML models may differ from those obtained by statistical models, even with large sample sizes. Hence, predictors may assume different roles in risk prediction models. As long as sample size is sufficient, predictive accuracy is not largely affected by the choice of algorithm.
Published: 2021
Full Text: View/download PDF

6. Pharmacogenomics decision support in the U-PGx project: Results and advice from clinical implementation across seven European countries

Author: Kathrin Blagec, Jesse J. Swen, Rudolf Koopmann, Ka-Chun Cheung, Mandy Crommentuijn-van Rhenen, Inge Holsappel, Lidija Konta, Simon Ott, Daniela Steinberger, Hong Xu, Erika Cecchin, Vita Dolžan, Cristina Lucía Dávila-Fajardo, George P. Patrinos, Gere Sunder-Plassmann, Richard M. Turner, Munir Pirmohamed, Henk-Jan Guchelaar, Matthias Samwald, and Ubiquitous Pharmacogenomics Consortium
Subjects: Medicine, Science
Abstract: Background The clinical implementation of pharmacogenomics (PGx) could be one of the first milestones towards realizing personalized medicine in routine care. However, its widespread adoption requires the availability of suitable clinical decision support (CDS) systems, which is often impeded by the fragmentation or absence of adequate health IT infrastructures. We report results of CDS implementation in the large-scale European research project Ubiquitous Pharmacogenomics (U-PGx), in which PGx CDS was rolled out and evaluated across more than 15 clinical sites in the Netherlands, Spain, Slovenia, Italy, Greece, United Kingdom and Austria, covering a wide variety of healthcare settings. Methods We evaluated the CDS implementation process through qualitative and quantitative process indicators. Quantitative indicators included statistics on generated PGx reports, median time from sampled upload until report delivery and statistics on report retrievals via the mobile-based CDS tool. Adoption of different CDS tools, uptake and usability were further investigated through a user survey among healthcare providers. Results of a risk assessment conducted prior to the implementation process were retrospectively analyzed and compared to actual encountered difficulties and their impact. Results As of March 2021, personalized PGx reports were produced from 6884 genotyped samples with a median delivery time of twenty minutes. Out of 131 invited healthcare providers, 65 completed the questionnaire (response rate: 49.6%). Overall satisfaction rates with the different CDS tools varied between 63.6% and 85.2% per tool. Delays in implementation were caused by challenges including institutional factors and complexities in the development of required tools and reference data resources, such as genotype-phenotype mappings. 
Conclusions We demonstrated the feasibility of implementing a standardized PGx decision support solution in a multinational, multi-language and multi-center setting. Remaining challenges for future wide-scale roll-out include the harmonization of existing PGx information in guidelines and drug labels, the need for strategies to lower the barrier of PGx CDS adoption for healthcare institutions and providers, and easier compliance with regulatory and legal frameworks.
Published: 2022

7. Neural sentence embedding models for semantic similarity estimation in the biomedical domain

Author: Kathrin Blagec, Hong Xu, Asan Agibetov, and Matthias Samwald
Subjects: Natural language processing, Semantics, Neural embedding models, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background Neural network-based embedding models are receiving significant attention in the field of natural language processing due to their capability to effectively capture semantic information representing words, sentences or even larger text elements in low-dimensional vector space. While current state-of-the-art models for assessing the semantic similarity of textual statements from biomedical publications depend on the availability of laboriously curated ontologies, unsupervised neural embedding models only require large text corpora as input and do not need manual curation. In this study, we investigated the efficacy of current state-of-the-art neural sentence embedding models for semantic similarity estimation of sentences from biomedical literature. We trained different neural embedding models on 1.7 million articles from the PubMed Open Access dataset, and evaluated them based on a biomedical benchmark set containing 100 sentence pairs annotated by human experts and a smaller contradiction subset derived from the original benchmark set. Results Experimental results showed that, with a Pearson correlation of 0.819, our best unsupervised model based on the Paragraph Vector Distributed Memory algorithm outperforms previous state-of-the-art results achieved on the BIOSSES biomedical benchmark set. Moreover, our proposed supervised model that combines different string-based similarity metrics with a neural embedding model surpasses previous ontology-dependent supervised state-of-the-art approaches in terms of Pearson’s r (r = 0.871) on the biomedical benchmark set. In contrast to the promising results for the original benchmark, we found our best models’ performance on the smaller contradiction subset to be poor.
Conclusions In this study, we have highlighted the value of neural network-based models for semantic similarity estimation in the biomedical domain by showing that they can keep up with and even surpass previous state-of-the-art approaches for semantic similarity estimation that depend on the availability of laboriously curated ontologies, when evaluated on a biomedical benchmark set. Capturing contradictions and negations in biomedical sentences, however, emerged as an essential area for further work.
Published: 2019
Full Text: View/download PDF

8. Fast and scalable neural embedding models for biomedical sentence classification

Author: Asan Agibetov, Kathrin Blagec, Hong Xu, and Matthias Samwald
Subjects: Natural language processing, Text classification, Neural networks, Word vector models, FastText, Scientific abstracts, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background Biomedical literature is expanding rapidly, and tools that help locate information of interest are needed. To this end, a multitude of different approaches for classifying sentences in biomedical publications according to their coarse semantic and rhetoric categories (e.g., Background, Methods, Results, Conclusions) have been devised, with recent state-of-the-art results reported for a complex deep learning model. Recent evidence showed that shallow and wide neural models such as fastText can provide results that are competitive or superior to complex deep learning models while requiring drastically lower training times and having better scalability. We analyze the efficacy of the fastText model in the classification of biomedical sentences in the PubMed 200k RCT benchmark, and introduce a simple pre-processing step that enables the application of fastText on sentence sequences. Furthermore, we explore the utility of two unsupervised pre-training approaches in scenarios where labeled training data are limited. Results Our fastText-based methodology yields a state-of-the-art F1 score of .917 on the PubMed 200k benchmark when sentence ordering is taken into account, with a training time of only 73 s on standard hardware. Applying fastText on single sentences, without taking sentence ordering into account, yielded an F1 score of .852 (training time 13 s). Unsupervised pre-training of N-gram vectors greatly improved the results for small training set sizes, with an increase of F1 score of .21 to .74 when trained on only 1000 randomly picked sentences without taking sentence ordering into account. Conclusions Because of its ease of use and performance, fastText should be among the first choices of tools when tackling biomedical text classification problems with large corpora. Unsupervised pre-training of N-gram vectors on domain-specific corpora also makes it possible to apply fastText when labeled training data are limited.
Published: 2018
Full Text: View/download PDF

9. Crowdsourced assessment of common genetic contribution to predicting anti-TNF treatment response in rheumatoid arthritis

Author: Solveig K. Sieberts, Fan Zhu, Javier García-García, Eli Stahl, Abhishek Pratap, Gaurav Pandey, Dimitrios Pappas, Daniel Aguilar, Bernat Anton, Jaume Bonet, Ridvan Eksi, Oriol Fornés, Emre Guney, Hongdong Li, Manuel Alejandro Marín, Bharat Panwar, Joan Planas-Iglesias, Daniel Poglayen, Jing Cui, Andre O. Falcao, Christine Suver, Bruce Hoff, Venkat S. K. Balagurusamy, Donna Dillenberger, Elias Chaibub Neto, Thea Norman, Tero Aittokallio, Muhammad Ammad-ud-din, Chloe-Agathe Azencott, Víctor Bellón, Valentina Boeva, Kerstin Bunte, Himanshu Chheda, Lu Cheng, Jukka Corander, Michel Dumontier, Anna Goldenberg, Peddinti Gopalacharyulu, Mohsen Hajiloo, Daniel Hidru, Alok Jaiswal, Samuel Kaski, Beyrem Khalfaoui, Suleiman Ali Khan, Eric R. Kramer, Pekka Marttinen, Aziz M. Mezlini, Bhuvan Molparia, Matti Pirinen, Janna Saarela, Matthias Samwald, Véronique Stoven, Hao Tang, Jing Tang, Ali Torkamani, Jean-Phillipe Vert, Bo Wang, Tao Wang, Krister Wennerberg, Nathan E. Wineinger, Guanghua Xiao, Yang Xie, Rae Yeung, Xiaowei Zhan, Cheng Zhao, Members of the Rheumatoid Arthritis Challenge Consortium, Jeff Greenberg, Joel Kremer, Kaleb Michaud, Anne Barton, Marieke Coenen, Xavier Mariette, Corinne Miceli, Nancy Shadick, Michael Weinblatt, Niek de Vries, Paul P. Tak, Danielle Gerlag, Tom W. J. Huizinga, Fina Kurreeman, Cornelia F. Allaart, S. Louis Bridges Jr., Lindsey Criswell, Larry Moreland, Lars Klareskog, Saedis Saevarsdottir, Leonid Padyukov, Peter K. Gregersen, Stephen Friend, Robert Plenge, Gustavo Stolovitzky, Baldo Oliva, Yuanfang Guan, and Lara M. Mangravite
Subjects: Science
Abstract: Rheumatoid arthritis patients respond differently to anti-TNF treatment. Using a community-based challenge, the authors show that currently available data do not reveal meaningful genetic predictors of response to anti-TNF therapy, thus confirming clinical observations.
Published: 2016
Full Text: View/download PDF

10. Examining perceptions of the usefulness and usability of a mobile-based system for pharmacogenomics clinical decision support: a mixed methods study

Author: Kathrin Blagec, Katrina M. Romagnoli, Richard D. Boyce, and Matthias Samwald
Subjects: Pharmacogenetics, Individualized medicine, Decision support systems, User studies, Clinical, Mixed methods studies, Medicine, Biology (General), QH301-705.5
Abstract: Background. Pharmacogenomic testing has the potential to improve the safety and efficacy of pharmacotherapy, but clinical application of pharmacogenetic knowledge has remained uncommon. Clinical Decision Support (CDS) systems could help overcome some of the barriers to clinical implementation. The aim of this study was to evaluate the perception and usability of a web- and mobile-enabled CDS system for pharmacogenetics-guided drug therapy–the Medication Safety Code (MSC) system–among potential users (i.e., physicians and pharmacists). Furthermore, this study sought to collect data on the practicability and comprehensibility of potential layouts of a proposed personalized pocket card that is intended to not only contain the machine-readable data for use with the MSC system but also human-readable data on the patient’s pharmacogenomic profile. Methods. We deployed an emergent mixed methods design encompassing (1) qualitative interviews with pharmacists and pharmacy students, (2) a survey among pharmacogenomics experts that included both qualitative and quantitative elements and (3) a quantitative survey among physicians and pharmacists. The interviews followed a semi-structured guide including a hypothetical patient scenario that had to be solved by using the MSC system. The survey among pharmacogenomics experts focused on what information should be printed on the card and how this information should be arranged. Furthermore, the MSC system was evaluated based on two hypothetical patient scenarios and four follow-up questions on the perceived usability. The second survey assessed physicians’ and pharmacists’ attitude towards the MSC system. Results. In total, 101 physicians, pharmacists and PGx experts coming from various relevant fields evaluated the MSC system. Overall, the reaction to the MSC system was positive across all investigated parameters and among all user groups. 
The majority of participants were able to solve the patient scenarios based on the recommendations displayed on the MSC interface. A frequent request among participants was to provide specific listings of alternative drugs and concrete dosage instructions. Negligence of other patient-specific factors for choosing the right treatment such as renal function and co-medication was a common concern related to the MSC system, while data privacy and cost-benefit considerations emerged as the participants’ major concerns regarding pharmacogenetic testing in general. The results of the card layout evaluation indicate that a gene-centered and tabulated presentation of the patient’s pharmacogenomic profile is helpful and well-accepted. Conclusions. We found that the MSC system was well-received among the physicians and pharmacists included in this study. A personalized pocket card that lists a patient’s metabolizer status along with critically affected drugs can alert physicians and pharmacists to the availability of essential therapy modifications.
Published: 2016
Full Text: View/download PDF

11. Incidence of Exposure of Patients in the United States to Multiple Drugs for Which Pharmacogenomic Guidelines Are Available.

Author: Matthias Samwald, Hong Xu, Kathrin Blagec, Philip E Empey, Daniel C Malone, Seid Mussa Ahmed, Patrick Ryan, Sebastian Hofer, and Richard D Boyce
Subjects: Medicine, Science
Abstract: Pre-emptive pharmacogenomic (PGx) testing of a panel of genes may be easier to implement and more cost-effective than reactive pharmacogenomic testing if a sufficient number of medications are covered by a single test and future medication exposure can be anticipated. We analysed the incidence of exposure of individual patients in the United States to multiple drugs for which pharmacogenomic guidelines are available (PGx drugs) within a selected four-year period (2009-2012) in order to identify and quantify the incidence of pharmacotherapy in a nation-wide patient population that could be impacted by pre-emptive PGx testing based on currently available clinical guidelines. In total, 73 024 095 patient records from private insurance, Medicare Supplemental and Medicaid were included. Patients enrolled in Medicare Supplemental age >= 65 or Medicaid age 40-64 had the highest incidence of PGx drug use, with approximately half of the patients receiving at least one PGx drug during the four-year period and one fourth to one third of patients receiving two or more PGx drugs. These data suggest that exposure to multiple PGx drugs is common and that it may be beneficial to implement wide-scale pre-emptive genomic testing. Future work should therefore concentrate on investigating the cost-effectiveness of multiplexed pre-emptive testing strategies.
Published: 2016
Full Text: View/download PDF

12. Structured digital tables on the Semantic Web: toward a structured digital literature

Author: Kei‐Hoi Cheung, Matthias Samwald, Raymond K Auerbach, and Mark B Gerstein
Subjects: bioinformatics, data integration, semantic publishing, Semantic Web, triplification, Biology (General), QH301-705.5, Medicine (General), R5-920
Abstract: In parallel to the growth in bioscience databases, biomedical publications have increased exponentially in the past decade. However, the extraction of high‐quality information from the corpus of scientific literature has been hampered by the lack of machine‐interpretable content, despite text‐mining advances. To address this, we propose creating a structured digital table as part of an overall effort in developing machine‐readable, structured digital literature. In particular, we envision transforming publication tables into standardized triples using Semantic Web approaches. We identify three canonical types of tables (conveying information about properties, networks, and concept hierarchies) and show how more complex tables can be built from these basic types. We envision that authors would create tables initially using the structured triples for canonical types and then have them visually rendered for publication, and we present examples for converting representative tables into triples. Finally, we discuss how ‘stub’ versions of structured digital tables could be a useful bridge for connecting together the literature with databases, allowing the former to more precisely document the latter.
Published: 2010
Full Text: View/download PDF

13. An ontology-based, mobile-optimized system for pharmacogenomic decision support at the point-of-care.

Author: Jose Antonio Miñarro-Giménez, Kathrin Blagec, Richard D Boyce, Klaus-Peter Adlassnig, and Matthias Samwald
Subjects: Medicine, Science
Abstract: BACKGROUND: The development of genotyping and genetic sequencing techniques and their evolution towards low costs and quick turnaround have encouraged a wide range of applications. One of the most promising applications is pharmacogenomics, where genetic profiles are used to predict the most suitable drugs and drug dosages for the individual patient. This approach aims to ensure appropriate medical treatment and avoid, or properly manage, undesired side effects. RESULTS: We developed the Medicine Safety Code (MSC) service, a novel pharmacogenomics decision support system, to provide physicians and patients with the ability to represent pharmacogenomic data in computable form and to provide pharmacogenomic guidance at the point-of-care. Pharmacogenomic data of individual patients are encoded as Quick Response (QR) codes and can be decoded and interpreted with common mobile devices without requiring a centralized repository for storing genetic patient data. In this paper, we present the first fully functional release of this system and describe its architecture, which utilizes Web Ontology Language 2 (OWL 2) ontologies to formalize pharmacogenomic knowledge and to provide clinical decision support functionalities. CONCLUSIONS: The MSC system provides a novel approach for enabling the implementation of personalized medicine in clinical routine.
Published: 2014
Full Text: View/download PDF

14. Erratum: Crowdsourced assessment of common genetic contribution to predicting anti-TNF treatment response in rheumatoid arthritis

Author: Solveig K. Sieberts, Fan Zhu, Javier García-García, Eli Stahl, Abhishek Pratap, Gaurav Pandey, Dimitrios Pappas, Daniel Aguilar, Bernat Anton, Jaume Bonet, Ridvan Eksi, Oriol Fornés, Emre Guney, Hongdong Li, Manuel Alejandro Marín, Bharat Panwar, Joan Planas-Iglesias, Daniel Poglayen, Jing Cui, Andre O. Falcao, Christine Suver, Bruce Hoff, Venkat S. K. Balagurusamy, Donna Dillenberger, Elias Chaibub Neto, Thea Norman, Tero Aittokallio, Muhammad Ammad-ud-din, Chloe-Agathe Azencott, Víctor Bellón, Valentina Boeva, Kerstin Bunte, Himanshu Chheda, Lu Cheng, Jukka Corander, Michel Dumontier, Anna Goldenberg, Peddinti Gopalacharyulu, Mohsen Hajiloo, Daniel Hidru, Alok Jaiswal, Samuel Kaski, Beyrem Khalfaoui, Suleiman Ali Khan, Eric R. Kramer, Pekka Marttinen, Aziz M. Mezlini, Bhuvan Molparia, Matti Pirinen, Janna Saarela, Matthias Samwald, Véronique Stoven, Hao Tang, Jing Tang, Ali Torkamani, Jean-Phillipe Vert, Bo Wang, Tao Wang, Krister Wennerberg, Nathan E. Wineinger, Guanghua Xiao, Yang Xie, Rae Yeung, Xiaowei Zhan, Cheng Zhao, Members of the Rheumatoid Arthritis Challenge Consortium, Jeff Greenberg, Joel Kremer, Kaleb Michaud, Anne Barton, Marieke Coenen, Xavier Mariette, Corinne Miceli, Nancy Shadick, Michael Weinblatt, Niek de Vries, Paul P. Tak, Danielle Gerlag, Tom W. J. Huizinga, Fina Kurreeman, Cornelia F. Allaart, S. Louis Bridges Jr., Lindsey Criswell, Larry Moreland, Lars Klareskog, Saedis Saevarsdottir, Leonid Padyukov, Peter K. Gregersen, Stephen Friend, Robert Plenge, Gustavo Stolovitzky, Baldo Oliva, Yuanfang Guan, and Lara M. Mangravite
Subjects: Science
Abstract: Nature Communications 7: Article number: 12460 (2016); Published: 23 August 2016; Updated: 10 October 2016. The HTML version of this Article incorrectly duplicated the authors S. Louis Bridges, Lindsey Criswell, Larry Moreland, Lars Klareskog, Saedis Saevarsdottir, Leonid Padyukov, Peter K. Gregersen, Stephen Friend, Robert Plenge, Gustavo Stolovitzky, Baldo Oliva, Yuanfang Guan and Lara M.
Published: 2016
Full Text: View/download PDF

15. CSMeD: Bridging the Dataset Gap in Automated Citation Screening for Systematic Literature Reviews.

Author: Wojciech Kusa, Óscar E. Mendoza, Matthias Samwald, Petr Knoth, and Allan Hanbury
Published: 2023

16. Evaluating the Robustness of Neural Language Models to Input Perturbations.

Author: Milad Moradi and Matthias Samwald
Published: 2021
Full Text: View/download PDF

17. Dividing the Ontology Alignment Task with Semantic Embeddings and Logic-Based Modules.

Author: Ernesto Jiménez-Ruiz, Asan Agibetov, Jiaoyan Chen, Matthias Samwald, and Valerie Cross
Published: 2020
Full Text: View/download PDF

18. BigBio: A Framework for Data-Centric Biomedical Natural Language Processing.

Author: Jason A. Fries, Leon Weber, Natasha Seelam, Gabriel Altay, Debajyoti Datta, Samuele Garda, Sunny Kang, Rosaline Su, Wojciech Kusa, Samuel Cahyawijaya, Fabio Barth, Simon Ott, Matthias Samwald, Stephen H. Bach, Stella Biderman, Mario Sänger, Bo Wang, Alison Callahan, Daniel León Periñán, Théo Gigant, Patrick Haller, Jenny Chim, José D. Posada, John M. Giorgi, Karthik Rangasai Sivaraman, Marc Pàmies, Marianna Nezhurina, Robert Martin, Michael Cullan, Moritz Freidank, Nathan Dahlberg, Shubhanshu Mishra, Shamik Bose, Nicholas Broad, Yanis Labrak, Shlok Deshmukh, Sid Kiblawi, Ayush Singh, Minh Chien Vu, Trishala Neeraj, Jonas Golde, Albert Villanova del Moral, and Benjamin Beilharz
Published: 2022

19. Using hyperbolic large-margin classifiers for biological link prediction.

Author: Asan Agibetov, Georg Dorffner, and Matthias Samwald
Published: 2019

20. SAFRAN: An interpretable, rule-based link prediction method outperforming embedding models.

Author: Simon Ott, Christian Meilicke, and Matthias Samwald
Published: 2021
Full Text: View/download PDF

21. Effects of Medical Device Regulations on the Development of Stand-Alone Medical Software: A Pilot Study.

Author: Kathrin Blagec, David Jungwirth, Daniela Haluza, and Matthias Samwald
Published: 2018
Full Text: View/download PDF

22. Fast and Scalable Learning of Neuro-Symbolic Representations of Biomedical Knowledge.

Author: Asan Agibetov and Matthias Samwald
Published: 2018

23. Global and Local Evaluation of Link Prediction Tasks with Neural Embeddings.

Author: Asan Agibetov and Matthias Samwald
Published: 2018

24. We divide, you conquer: from large-scale ontology alignment to manageable subtasks with a lexical index and neural embeddings.

Author: Ernesto Jiménez-Ruiz, Asan Agibetov, Matthias Samwald, and Valerie Cross
Published: 2018

25. The Importance of Gene-Drug-Drug-Interactions in Pharmacogenomics Decision Support: An Analysis Based on Austrian Claims Data.

Author: Kathrin Blagec, Wolfgang Kuch, and Matthias Samwald
Published: 2017
Full Text: View/download PDF

26. How Many Patients Could Benefit From Pre-emptive Pharmacogenomic Testing and Decision Support? A Retrospective Study Based on Nationwide Austrian Claims Data.

Author: Wolfgang Kuch, Christoph Rinner, Walter Gall, and Matthias Samwald
Published: 2016
Full Text: View/download PDF

27. Towards a Global IT System for Personalized Medicine: the Medicine Safety Code Initiative.

Author: Matthias Samwald, José Antonio Miñarro-Giménez, Kathrin Blagec, and Klaus-Peter Adlassnig
Published: 2014
Full Text: View/download PDF

28. An Update on Genomic CDS, a Complex Ontology for Pharmacogenomics and Clinical Decision Support.

Author: José Antonio Miñarro-Giménez and Matthias Samwald
Published: 2014

29. An Open-Source, Mobile-Friendly Search Engine for Public Medical Knowledge.

Author: Matthias Samwald and Allan Hanbury
Published: 2014
Full Text: View/download PDF

30. Exploring the Application of Deep Learning Techniques on Medical Text Corpora.

Author: José Antonio Miñarro-Giménez, Oscar Marín-Alonso, and Matthias Samwald
Published: 2014
Full Text: View/download PDF

31. Towards Unified Objectives for Self-Reflective AI

Author: Matthias Samwald, Robert Praas, and Konstantin Hebenstreit
Subjects: History, Polymers and Plastics, Business and International Management, Industrial and Manufacturing Engineering
Published: 2023
Full Text: View/download PDF

32. A Formative Evaluation of a Comprehensive Search System for Medical Professionals.

Author: Veronika Stefanov, Alexander Sachs, Marlene Kritz, Matthias Samwald, Manfred Gschwandtner, and Allan Hanbury
Published: 2013
Full Text: View/download PDF

33. Genomic CDS: an Example of a Complex Ontology for Pharmacogenetics and Clinical Decision Support.

Author: Matthias Samwald
Published: 2013

34. An RDF/OWL Knowledge Base for Query Answering and Decision Support in Clinical Pharmacogenetics.

Author: Matthias Samwald, Robert R. Freimuth, Joanne S. Luciano, Simon M. Lin, Robert L. Powers, M. Scott Marshall, Klaus-Peter Adlassnig, Michel Dumontier, and Richard D. Boyce
Published: 2013
Full Text: View/download PDF

35. Towards an Interoperable Information Infrastructure Providing Decision Support for Genomic Medicine.

Author: Matthias Samwald, Holger Stenzhorn, Michel Dumontier, M. Scott Marshall, Joanne Luciano, and Klaus-Peter Adlassnig
Published: 2011
Full Text: View/download PDF

36. Entrez Neuron RDFa: A Pragmatic Semantic Web Application for Data Integration in Neuroscience Research.

Author: Matthias Samwald, Ernest Lim, Peter Masiar, Luis N. Marenco, Huajun Chen, Thomas M. Morse, Pradeep Mutalik, Gordon M. Shepherd, Perry L. Miller, and Kei-Hoi Cheung
Published: 2009
Full Text: View/download PDF

37. Das Semantic Web als Werkzeug in der biomedizinischen Forschung.

Author: Holger Stenzhorn and Matthias Samwald
Published: 2009
Full Text: View/download PDF

38. Previewing Semantic Web Pipes.

Author: Christian Morbidoni, Danh Le Phuoc, Axel Polleres, Matthias Samwald, and Giovanni Tummarello
Published: 2008
Full Text: View/download PDF

39. Simplifying Access to Large-Scale Health Care and Life Sciences Datasets.

Author: Holger Stenzhorn, Kavitha Srinivas, Matthias Samwald, and Alan Ruttenberg
Published: 2008
Full Text: View/download PDF

40. Explaining Black-Box Models for Biomedical Text Classification

Author: Milad Moradi and Matthias Samwald
Subjects: FOS: Computer and information sciences, Source code, Computer Science - Artificial Intelligence, Computer science, Fidelity, Semantics, Data modeling, Machine Learning, Set (abstract data type), Health Information Management, Data Mining, Humans, Electrical and Electronic Engineering, Interpretability, Class (biology), Computer Science Applications, Artificial Intelligence (cs.AI), Domain knowledge, Artificial intelligence, Software, Natural language processing, Biotechnology
Abstract: In this paper, we propose a novel method named Biomedical Confident Itemsets Explanation (BioCIE), aiming at post-hoc explanation of black-box machine learning models for biomedical text classification. Using sources of domain knowledge and a confident itemset mining method, BioCIE discretizes the decision space of a black-box into smaller subspaces and extracts semantic relationships between the input text and class labels in different subspaces. Confident itemsets discover how biomedical concepts are related to class labels in the black-box's decision space. BioCIE uses the itemsets to approximate the black-box's behavior for individual predictions. Optimizing fidelity, interpretability, and coverage measures, BioCIE produces class-wise explanations that represent decision boundaries of the black-box. Results of evaluations on various biomedical text classification tasks and black-box models demonstrated that BioCIE can outperform perturbation-based and decision set methods in terms of producing concise, accurate, and interpretable explanations. BioCIE improved the fidelity of instance-wise and class-wise explanations by 11.6% and 7.5%, respectively. It also improved the interpretability of explanations by 8%. BioCIE can be effectively used to explain how a black-box biomedical text classification model semantically relates input texts to class labels. The source code and supplementary material are available at https://github.com/mmoradi-iut/BioCIE.
Published: 2021
Full Text: View/download PDF

41. Ontology-Based Data Integration For Biomedical Research.

Author: Vipul Kashyap, Kei-Hoi Cheung, Donald Doherty, Matthias Samwald, M. Scott Marshall, Joanne Luciano, Susie Stephens, Ivan Herman, and Raymond Hookway
Published: 2007
Full Text: View/download PDF

42. Bringing Neuroscience to the Semantic Web: The Semantic Synapse Project.

Author: Matthias Samwald and Klaus-Peter Adlassnig
Published: 2005
Full Text: View/download PDF

43. Dataset debt in biomedical language modeling

Author: Jason Fries, Natasha Seelam, Gabriel Altay, Leon Weber, Myungsun Kang, Debajyoti Datta, Ruisi Su, Samuele Garda, Bo Wang, Simon Ott, Matthias Samwald, and Wojciech Kusa
Subjects: Cancer Research
Abstract: Large-scale language modeling and natural language prompting have demonstrated exciting capabilities for few and zero shot learning in NLP. However, translating these successes to specialized domains such as biomedicine remains challenging, due in part to biomedical NLP's significant dataset debt - the technical costs associated with data that are not consistently documented or easily incorporated into popular machine learning frameworks at scale. To assess this debt, we crowdsourced curation of datasheets for 167 biomedical datasets. We find that only 13% of datasets are available via programmatic access and 30% lack any documentation on licensing and permitted reuse. Our dataset catalog is available at: https://tinyurl.com/bigbio22.
Published: 2022

44. Benchmark datasets driving artificial intelligence development fail to capture the needs of medical professionals

Author: Kathrin Blagec, Jakob Kraiger, Wolfgang Frühwirt, and Matthias Samwald
Subjects: FOS: Computer and information sciences, Artificial Intelligence (cs.AI), ComputingMethodologies_PATTERNRECOGNITION, Computer Science - Artificial Intelligence, Health Informatics, Computer Science Applications
Abstract: Publicly accessible benchmarks that allow for assessing and comparing model performances are important drivers of progress in artificial intelligence (AI). While recent advances in AI capabilities hold the potential to transform medical practice by assisting and augmenting the cognitive processes of healthcare professionals, the coverage of clinically relevant tasks by AI benchmarks is largely unclear. Furthermore, there is a lack of systematized meta-information that allows clinical AI researchers to quickly determine accessibility, scope, content and other characteristics of datasets and benchmark datasets relevant to the clinical domain. To address these issues, we curated and released a comprehensive catalogue of datasets and benchmarks pertaining to the broad domain of clinical and biomedical natural language processing (NLP), based on a systematic review of literature and online resources. A total of 450 NLP datasets were manually systematized and annotated with rich metadata, such as targeted tasks, clinical applicability, data types, performance metrics, accessibility and licensing information, and availability of data splits. We then compared tasks covered by AI benchmark datasets with relevant tasks that medical practitioners reported as highly desirable targets for automation in a previous empirical study. Our analysis indicates that AI benchmarks of direct clinical relevance are scarce and fail to cover most work activities that clinicians want to see addressed. In particular, tasks associated with routine documentation and patient data administration workflows are not represented despite significant associated workloads. Thus, currently available AI benchmarks are improperly aligned with desired targets for AI automation in clinical settings, and novel benchmarks should be created to fill these gaps. (This version extends the literature references.)
Published: 2022

45. Improving the Robustness and Accuracy of Intelligent Clinical Text Processing: Producing Noise for Data Augmentation

Author: Milad Moradi, Kathrin Blagec, and Matthias Samwald
Subjects: History, Polymers and Plastics, Business and International Management, Industrial and Manufacturing Engineering
Published: 2022
Full Text: View/download PDF

46. LinkExplorer: Predicting, explaining and exploring links in large biomedical knowledge graphs

Author: Simon Ott, Adriano Barbosa-Silva, and Matthias Samwald
Subjects: Statistics and Probability, Computational Mathematics, Computational Theory and Mathematics, Molecular Biology, Biochemistry, Computer Science Applications
Abstract: Summary: Machine learning algorithms for link prediction can be valuable tools for hypothesis generation. However, many current algorithms are black boxes or lack good user interfaces that could facilitate insight into why predictions are made. We present LinkExplorer, a software suite for predicting, explaining and exploring links in large biomedical knowledge graphs. LinkExplorer integrates our novel, rule-based link prediction engine SAFRAN, which was recently shown to outcompete other explainable algorithms and established black box algorithms. Here, we demonstrate highly competitive evaluation results of our algorithm on multiple large biomedical knowledge graphs, and release a web interface that allows for interactive and intuitive exploration of predicted links and their explanations. Availability and Implementation: A publicly hosted instance, source code and further documentation can be found at https://github.com/OpenBioLink/Explorer. Contact: matthias.samwald@meduniwien.ac.at. Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2021

47. Relational local electroencephalography representations for sleep scoring

Author: Georg Brandmayr, Manfred Hartmann, Franz Fürbass, Gerald Matz, Matthias Samwald, Tilmann Kluge, and Georg Dorffner
Subjects: Artificial Intelligence, Cognitive Neuroscience, Polysomnography, Sleep, REM, Electroencephalography, Sleep Stages, Sleep
Abstract: Computational sleep scoring from multimodal neurophysiological time-series (polysomnography, PSG) has achieved impressive clinical success. Models that use only a single electroencephalographic (EEG) channel from PSG have not yet received the same clinical recognition, since they lack Rapid Eye Movement (REM) scoring quality. The question whether this lack can be remedied at all remains an important one. We conjecture that predominant Long Short-Term Memory (LSTM) models do not adequately represent distant REM EEG segments (termed epochs), since LSTMs compress these to a fixed-size vector from separate past and future sequences. To this end, we introduce the EEG representation model ENGELBERT (electroEncephaloGraphic Epoch Local Bidirectional Encoder Representations from Transformer). It jointly attends to multiple EEG epochs from both past and future. Compared to typical token sequences in language, for which attention models have originally been conceived, overnight EEG sequences easily span more than 1000 30 s epochs. Local attention on overlapping windows reduces the critical quadratic computational complexity to linear, enabling versatile sub-one-hour to all-day scoring. ENGELBERT is at least one order of magnitude smaller than established LSTM models and is easy to train from scratch in a single phase. It surpassed state-of-the-art macro F1-scores in 3 single-EEG sleep scoring experiments. REM F1-scores were pushed to at least 86%. ENGELBERT virtually closed the gap to PSG-based methods from 4-5 percentage points (pp) to less than 1 pp F1-score.
Published: 2021

48. Convolutional Neural Networks for Fully Automated Diagnosis of Cardiac Amyloidosis by Cardiac Magnetic Resonance Imaging

Author: Renate Kain, Asan Agibetov, Diana Bonderman, Franz Duca, Julia Mascherbauer, Matthias Koschutnik, Andreas A. Kammerlander, Hermine Agis, Lore Schrutka, Georg Dorffner, Christian Nitsche, Theresa-Marie Dachs, Christian Hengstenberg, Johannes Kastner, René Rettl, Michaela Auer-Grumbach, Matthias Samwald, Alessa Stria, Carolina Donà, and Christina Binder
Subjects: heart failure, cardiac amyloidosis, artificial intelligence, diagnostic ability, Deep learning, Medicine (miscellaneous), Convolutional neural network, Article, Low volume, Fully automated, Cardiac magnetic resonance imaging, Medicine, Radiology, Cardiac magnetic resonance
Abstract: Aims: We tested the hypothesis that artificial intelligence (AI)-powered algorithms applied to cardiac magnetic resonance (CMR) images could be able to detect the potential patterns of cardiac amyloidosis (CA). Readers in CMR centers with a low volume of referrals for the detection of myocardial storage diseases or a low volume of CMRs, in general, may overlook CA. In light of the growing prevalence of the disease and emerging therapeutic options, there is an urgent need to avoid misdiagnoses. Methods and Results: Using CMR data from 502 patients (CA: n = 82), we trained convolutional neural networks (CNNs) to automatically diagnose patients with CA. We compared the diagnostic accuracy of different state-of-the-art deep learning techniques on common CMR imaging protocols in detecting imaging patterns associated with CA. As a result of a 10-fold cross-validated evaluation, the best-performing fine-tuned CNN achieved an average ROC AUC score of 0.96, resulting in a diagnostic accuracy of 94% sensitivity and 90% specificity. Conclusions: Applying AI to CMR to diagnose CA may set a remarkable milestone in an attempt to establish a fully computational diagnostic path for the diagnosis of CA, in order to support the complex diagnostic work-up requiring a profound knowledge of experts from different disciplines.
Published: 2021

49. Provenance of Microarray Experiments for a Better Understanding of Experiment Results.

Author: Helena F. Deus, Jun Zhao, Satya S. Sahoo, Matthias Samwald, Eric Prud'hommeaux, Michael Miller, M. Scott Marshall, and Kei-Hoi Cheung
Published: 2010

50. Linking Open Drug Data.

Author: Anja Jentzsch, Jun Zhao, Oktie Hassanzadeh, Kei-Hoi Cheung, Matthias Samwald, and Bo Andersson
Published: 2009
