1. mlphys101 - Exploring the performance of Large-Language Models in multilingual undergraduate physics education
- Author
- Völschow, M., Buczek, P., Carreno-Mosquera, P., Mousavias, C., Reganova, S., Roldan-Rodriguez, E., Steinbach, P. (0000-0002-4974-230X), and Strube, A.
- Abstract
Large-Language Models such as ChatGPT have the potential to revolutionize academic teaching in physics, much as the electronic calculator, the home computer, and the internet did. AI models are patient, produce answers tailored to a student's needs, and are accessible whenever needed. Those involved in academic teaching face a number of questions: How reliable are publicly accessible models in answering questions, how does a question's language affect a model's performance, and how well do the models handle more difficult tasks beyond retrieval? To address these questions, we benchmark a number of publicly available models on the mlphys101 dataset, a new set of 823 university-level MC5 (five-option multiple-choice) questions and answers released alongside this work. While the original questions are in English, we employ GPT-4 to translate them into various other languages, followed by revision and refinement by native speakers. Our findings indicate that state-of-the-art models perform well on questions involving the recall of facts, definitions, and basic concepts, but struggle with multi-step quantitative reasoning. This aligns with existing literature highlighting the challenges LLMs face in mathematical and logical reasoning tasks. We conclude that the most advanced current LLMs are a valuable addition to the academic curriculum and that LLM-powered translation is a viable method to increase the accessibility of materials, but their utility for more difficult quantitative tasks remains limited. The dataset is currently available here in English only and will be removed once the mlphys101 publication has been accepted and released to the public.
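To illustrate the kind of evaluation the abstract describes, below is a minimal sketch of scoring a model on a single MC5 item. The question dict, field names, prompt format, and model identifier are illustrative assumptions, not the authors' actual evaluation harness or the mlphys101 data schema; it assumes the OpenAI Python client and an `OPENAI_API_KEY` in the environment.

```python
# Sketch: score a model on one five-option multiple-choice (MC5) physics item.
# The item below is a made-up example in the spirit of the dataset.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = {
    "stem": "A ball is dropped from rest. Neglecting air resistance, "
            "how far does it fall in 2 s? (g = 9.8 m/s^2)",
    "options": {"A": "4.9 m", "B": "9.8 m", "C": "19.6 m",
                "D": "39.2 m", "E": "78.4 m"},
    "answer": "C",  # s = (1/2) g t^2 = 0.5 * 9.8 * 2^2 = 19.6 m
}

# Build a plain-text prompt listing the stem and the five options.
prompt = (
    question["stem"] + "\n"
    + "\n".join(f"{key}) {text}" for key, text in question["options"].items())
    + "\nAnswer with a single letter (A-E)."
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # deterministic decoding for reproducible scoring
)

# Take the first character of the reply as the predicted option letter.
predicted = response.choices[0].message.content.strip()[:1].upper()
print("correct" if predicted == question["answer"] else "incorrect", predicted)
```

Aggregating this check over all 823 items, per language and per model, yields the accuracy comparisons the abstract summarizes; translated variants would be evaluated with the same loop, swapping in the translated stem and options.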
- Published
- 2024