Author: "Oberski, Daniel L" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Oberski, Daniel L"' showing total 346 results

1. PATCH! Psychometrics-AssisTed benCHmarking of Large Language Models: A Case Study of Proficiency in 8th Grade Mathematics

Author: Fang, Qixiang, Oberski, Daniel L., and Nguyen, Dong
Subjects: Computer Science - Computation and Language, Computer Science - Computers and Society
Abstract: Many existing benchmarks of large (multimodal) language models (LLMs) focus on measuring LLMs' academic proficiency, often with also an interest in comparing model performance with human test takers. While these benchmarks have proven key to the development of LLMs, they suffer from several limitations, including questionable measurement quality (e.g., Do they measure what they are supposed to in a reliable way?), lack of quality assessment on the item level (e.g., Are some items more important or difficult than others?) and unclear human population reference (e.g., To whom can the model be compared?). In response to these challenges, we propose leveraging knowledge from psychometrics - a field dedicated to the measurement of latent variables like academic proficiency - into LLM benchmarking. We make three primary contributions. First, we introduce PATCH: a novel framework for Psychometrics-AssisTed benCHmarking of LLMs. PATCH addresses the aforementioned limitations, presenting a new direction for LLM benchmark research. Second, we implement PATCH by measuring GPT-4 and Gemini-Pro-Vision's proficiency in 8th grade mathematics against 56 human populations. We show that adopting a psychometrics-based approach yields evaluation outcomes that diverge from those based on existing benchmarking practices. Third, we release 4 high-quality datasets to support measuring and comparing LLM proficiency in grade school mathematics and science against human populations.
Published: 2024

2. General-Purpose User Modeling with Behavioral Logs: A Snapchat Case Study

Author: Fang, Qixiang, Zhou, Zhihan, Barbieri, Francesco, Liu, Yozen, Neves, Leonardo, Nguyen, Dong, Oberski, Daniel L., Bos, Maarten W., and Dotsch, Ron
Subjects: Computer Science - Human-Computer Interaction, Computer Science - Information Retrieval
Abstract: Learning general-purpose user representations based on user behavioral logs is an increasingly popular user modeling approach. It benefits from easily available, privacy-friendly yet expressive data, and does not require extensive re-tuning of the upstream user model for different downstream tasks. While this approach has shown promise in search engines and e-commerce applications, its fit for instant messaging platforms, a cornerstone of modern digital communication, remains largely uncharted. We explore this research gap using Snapchat data as a case study. Specifically, we implement a Transformer-based user model with customized training objectives and show that the model can produce high-quality user representations across a broad range of evaluation tasks, among which we introduce three new downstream tasks that concern pivotal topics in user research: user safety, engagement and churn. We also tackle the challenge of efficient extrapolation of long sequences at inference time, by applying a novel positional encoding method.
Comment: SIGIR 2024
Published: 2023

3. On Text-based Personality Computing: Challenges and Future Directions

Author: Fang, Qixiang, Giachanou, Anastasia, Bagheri, Ayoub, Boeschoten, Laura, van Kesteren, Erik-Jan, Kamalabad, Mahdi Shafiee, and Oberski, Daniel L
Subjects: Computer Science - Computation and Language, Computer Science - Computers and Society
Abstract: Text-based personality computing (TPC) has gained many research interests in NLP. In this paper, we describe 15 challenges that we consider deserving the attention of the research community. These challenges are organized by the following topics: personality taxonomies, measurement quality, datasets, performance evaluation, modelling choices, as well as ethics and fairness. When addressing each challenge, not only do we combine perspectives from both NLP and social sciences, but also offer concrete suggestions. We hope to inspire more valid and reliable TPC research.
Comment: Findings of ACL 2023. Long paper
Published: 2022

4. Evaluating the Construct Validity of Text Embeddings with Application to Survey Questions

Author: Fang, Qixiang, Nguyen, Dong, and Oberski, Daniel L
Subjects: Computer Science - Computers and Society, Computer Science - Computation and Language, Statistics - Applications, Statistics - Methodology
Abstract: Text embedding models from Natural Language Processing can map text data (e.g. words, sentences, documents) to supposedly meaningful numerical representations (a.k.a. text embeddings). While such models are increasingly applied in social science research, one important issue is often not addressed: the extent to which these embeddings are valid representations of constructs relevant for social science research. We therefore propose the use of the classic construct validity framework to evaluate the validity of text embeddings. We show how this framework can be adapted to the opaque and high-dimensional nature of text embeddings, with application to survey questions. We include several popular text embedding methods (e.g. fastText, GloVe, BERT, Sentence-BERT, Universal Sentence Encoder) in our construct validity analyses. We find evidence of convergent and discriminant validity in some cases. We also show that embeddings can be used to predict respondents' answers to completely new survey questions. Furthermore, BERT-based embedding techniques and the Universal Sentence Encoder provide more valid representations of survey questions than do others. Our results thus highlight the necessity to examine the construct validity of text embeddings before deploying them in social science research.
Comment: Under review
Published: 2022

5. Clinical use-cases and implementation guidelines for the development of valuable AI

Author: Borja Jiménez, Karina C., Kemmeren, Patrick, van den Heuvel-Ebrink, Marry, de Krijger, Ronald, Grootenhuis, Martha, Partanen, Marita, Graf, Norbert, Wen, Shuping, Leemans, Alexander, Oberski, Daniel L., Schoot, Reineke A., and Merks, Johannes H.M.
Published: 2024

6. Statistical Analysis—Measurement Error

Author: Brakenhoff, Timo B., van Smeden, Maarten, Oberski, Daniel L., Asselbergs, Folkert W., editor, Denaxas, Spiros, editor, Oberski, Daniel L., editor, and Moore, Jason H., editor
Published: 2023

7. Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records

Author: Ferdinands, Gerbrich, Schram, Raoul, de Bruin, Jonathan, Bagheri, Ayoub, Oberski, Daniel L., Tummers, Lars, Teijema, Jelle Jasper, and van de Schoot, Rens
Published: 2023

8. A systematic literature review of time series methods applied to epidemic prediction

Author: Batoure Bamana, Apollinaire, Shafiee Kamalabad, Mahdi, and Oberski, Daniel L.
Published: 2024

9. Digital trace data collection through data donation

Author: Boeschoten, Laura, Ausloos, Jef, Moeller, Judith, Araujo, Theo, and Oberski, Daniel L.
Subjects: Computer Science - Computers and Society, Statistics - Other Statistics
Abstract: A potentially powerful method of social-scientific data collection and investigation has been created by an unexpected institution: the law. Article 15 of the EU's 2018 General Data Protection Regulation (GDPR) mandates that individuals have electronic access to a copy of their personal data, and all major digital platforms now comply with this law by providing users with "data download packages" (DDPs). Through voluntary donation of DDPs, all data collected by public and private entities during the course of citizens' digital life can be obtained and analyzed to answer social-scientific questions - with consent. Thus, consented DDPs open the way for vast new research opportunities. However, while this entirely new method of data collection will undoubtedly gain popularity in the coming years, it also comes with its own questions of representativeness and measurement quality, which are often evaluated systematically by means of an error framework. Therefore, in this paper we provide a blueprint for digital trace data collection using DDPs, and devise a "total error framework" for such projects. Our error framework for digital trace data collection through data donation is intended to facilitate high quality social-scientific investigations using DDPs while critically reflecting its unique methodological challenges and sources of error. In addition, we provide a quality control checklist to guide researchers in leveraging the vast opportunities afforded by this new mode of investigation.
Published: 2020

10. Multimodal Learning for Cardiovascular Risk Prediction using EHR Data

Author: Bagheri, Ayoub, Groenhof, T. Katrien J., Veldhuis, Wouter B., de Jong, Pim A., Asselbergs, Folkert W., and Oberski, Daniel L.
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing, Statistics - Machine Learning
Abstract: Electronic health records (EHRs) contain structured and unstructured data of significant clinical and research value. Various machine learning approaches have been developed to employ information in EHRs for risk prediction. The majority of these attempts, however, focus on structured EHR fields and lose the vast amount of information in the unstructured texts. To exploit the potential information captured in EHRs, in this study we propose a multimodal recurrent neural network model for cardiovascular risk prediction that integrates both medical texts and structured clinical information. The proposed multimodal bidirectional long short-term memory (BiLSTM) model concatenates word embeddings to classical clinical predictors before applying them to a final fully connected neural network. In the experiments, we compare performance of different deep neural network (DNN) architectures including convolutional neural network and long short-term memory in scenarios of using clinical variables and chest X-ray radiology reports. Evaluated on a data set of real world patients with manifest vascular disease or at high-risk for cardiovascular disease, the proposed BiLSTM model demonstrates state-of-the-art performance and outperforms other DNN baseline architectures.
Comment: 8 pages, 2 figures
Published: 2020

11. The effect of measurement error on clustering algorithms

Author: Pankowska, Paulina and Oberski, Daniel L.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Clustering consists of a popular set of techniques used to separate data into interesting groups for further analysis. Many data sources on which clustering is performed are well-known to contain random and systematic measurement errors. Such errors may adversely affect clustering. While several techniques have been developed to deal with this problem, little is known about the effectiveness of these solutions. Moreover, no work to-date has examined the effect of systematic errors on clustering solutions. In this paper, we perform a Monte Carlo study to investigate the sensitivity of two common clustering algorithms, GMMs with merging and DBSCAN, to random and systematic error. We find that measurement error is particularly problematic when it is systematic and when it affects all variables in the dataset. For the conditions considered here, we also find that the partition-based GMM with merged components is less sensitive to measurement error than the density-based DBSCAN procedure.
Published: 2020

12. Fair inference on error-prone outcomes

Author: Boeschoten, Laura, van Kesteren, Erik-Jan, Bagheri, Ayoub, and Oberski, Daniel L.
Subjects: Statistics - Machine Learning, Computer Science - Computers and Society, Computer Science - Machine Learning
Abstract: Fair inference in supervised learning is an important and active area of research, yielding a range of useful methods to assess and account for fairness criteria when predicting ground truth targets. As shown in recent work, however, when target labels are error-prone, potential prediction unfairness can arise from measurement error. In this paper, we show that, when an error-prone proxy target is used, existing methods to assess and calibrate fairness criteria do not extend to the true target variable of interest. To remedy this problem, we suggest a framework resulting from the combination of two existing literatures: fair ML methods, such as those found in the counterfactual fairness literature on the one hand, and, on the other, measurement models found in the statistical literature. We discuss these approaches and their connection resulting in our framework. In a healthcare decision problem, we find that using a latent variable model to account for measurement error removes the unfairness detected previously.
Comment: Online supplementary code is available at https://dx.doi.org/10.5281/zenodo.3708150
Published: 2020

13. Privacy-Preserving Generalized Linear Models using Distributed Block Coordinate Descent

Author: van Kesteren, Erik-Jan, Sun, Chang, Oberski, Daniel L., Dumontier, Michel, and Ippel, Lianne
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Computer Science - Distributed, Parallel, and Cluster Computing, Statistics - Machine Learning
Abstract: Combining data from varied sources has considerable potential for knowledge discovery: collaborating data parties can mine data in an expanded feature space, allowing them to explore a larger range of scientific questions. However, data sharing among different parties is highly restricted by legal conditions, ethical concerns, and / or data volume. Fueled by these concerns, the fields of cryptography and distributed learning have made great progress towards privacy-preserving and distributed data mining. However, practical implementations have been hampered by the limited scope or computational complexity of these methods. In this paper, we greatly extend the range of analyses available for vertically partitioned data, i.e., data collected by separate parties with different features on the same subjects. To this end, we present a novel approach for privacy-preserving generalized linear models, a fundamental and powerful framework underlying many prediction and classification procedures. We base our method on a distributed block coordinate descent algorithm to obtain parameter estimates, and we develop an extension to compute accurate standard errors without additional communication cost. We critically evaluate the information transfer for semi-honest collaborators and show that our protocol is secure against data reconstruction. Through both simulated and real-world examples we illustrate the functionality of our proposed algorithm. Without leaking information, our method performs as well on vertically partitioned data as existing methods on combined data -- all within mere minutes of computation time. We conclude that our method is a viable approach for vertically partitioned data analysis with a wide range of real-world applications.
Comment: Fully reproducible code for all results and images can be found at https://github.com/vankesteren/privacy-preserving-glm, and the software package can be found at https://github.com/vankesteren/privreg
Published: 2019

14. Rank-deficiencies in a reduced information latent variable model

Author: Oberski, Daniel L.
Subjects: Statistics - Methodology, Mathematics - Statistics Theory
Abstract: Latent variable models are well-known to suffer from rank deficiencies, causing problems with convergence and stability. Such problems are compounded in the "reduced-group split-ballot multitrait-multimethod model", which omits a set of moments from the estimation through a planned missing data design. This paper demonstrates the existence of rank deficiencies in this model and gives the explicit null space. It also demonstrates that sample size and distance from the rank-deficient point interact in their effects on convergence, causing convergence to improve or worsen depending on both factors simultaneously. Furthermore, it notes that the latent variable correlations in the uncorrelated methods SB-MTMM model remain unaffected by the rank deficiency. I conclude that methodological experiments should be careful to manipulate both distance to known rank-deficiencies and sample size, and report all results, not only the apparently converged ones. Practitioners may consider that, even in the presence of nonconvergence or so-called "inadmissible" estimates, a subset of parameter estimates may still be consistent and stable.
Published: 2019

15. Structural Equation Models as Computation Graphs

Author: van Kesteren, Erik-Jan and Oberski, Daniel L.
Subjects: Statistics - Methodology, Statistics - Computation
Abstract: Structural equation modeling (SEM) is a popular tool in the social and behavioural sciences, where it is being applied to ever more complex data types. The high-dimensional data produced by modern sensors, brain images, or (epi)genetic measurements require variable selection using parameter penalization; experimental models combining disparate data sources benefit from regularization to obtain a stable result; and genomic SEM or network models lead to alternative objective functions. With each proposed extension, researchers currently have to completely reformulate SEM and its optimization algorithm -- a challenging and time-consuming task. In this paper, we consider each SEM as a computation graph, a flexible method of specifying objective functions borrowed from the field of deep learning. When combined with state-of-the-art optimizers, our computation graph approach can extend SEM without the need for bespoke software development. We show that both existing and novel SEM improvements follow naturally from our approach. To demonstrate, we discuss least absolute deviation estimation and penalized regression models. We also introduce spike-and-slab SEM, which may perform better when shrinkage of large factor loadings is not desired. By applying computation graphs to SEM, we hope to greatly accelerate the process of developing SEM techniques, paving the way for new applications. We provide an accompanying R package tensorsem.
Comment: R code and package are available online as supplementary material at https://github.com/vankesteren/sem-computationgraphs and https://github.com/vankesteren/tensorsem/tree/computationgraph, respectively
Published: 2019

16. Exploratory Mediation Analysis with Many Potential Mediators

Author: van Kesteren, Erik-Jan and Oberski, Daniel L.
Subjects: Statistics - Methodology
Abstract: Social and behavioral scientists are increasingly employing technologies such as fMRI, smartphones, and gene sequencing, which yield 'high-dimensional' datasets with more columns than rows. There is increasing interest, but little substantive theory, in the role the variables in these data play in known processes. This necessitates exploratory mediation analysis, for which structural equation modeling is the benchmark method. However, this method cannot perform mediation analysis with more variables than observations. One option is to run a series of univariate mediation models, which incorrectly assumes independence of the mediators. Another option is regularization, but the available implementations may lead to high false positive rates. In this paper, we develop a hybrid approach which uses components of both filter and regularization: the 'Coordinate-wise Mediation Filter'. It performs filtering conditional on the other selected mediators. We show through simulation that it improves performance over existing methods. Finally, we provide an empirical example, showing how our method may be used for epigenetic research.
Comment: R code and package are available online as supplementary material at https://github.com/vankesteren/cmfilter and https://github.com/vankesteren/ema_simulations
Published: 2018

17. The Expected Parameter Change (EPC) for Local Dependence Assessment in Binary Data Latent Class Models

Author: Oberski, Daniel L. and Vermunt, Jeroen K.
Subjects: Statistics - Methodology
Abstract: Binary data latent class models crucially assume local independence, violations of which can seriously bias the results. We present two tools for monitoring local dependence in binary data latent class models: the "Expected Parameter Change" (EPC) and a generalized EPC, estimating the substantive size and direction of possible local dependencies. The asymptotic and finite sample behavior of the measures is studied, and two applications to the U.S. Census estimation of Hispanic ethnicity and medical experts' ratings of x-rays demonstrate its value in arriving at a model that balances realism and parsimony.
Comment: R code implementing our proposal and including both example datasets is available online as supplementary material
Published: 2018

18. Privacy-preserving local analysis of digital trace data: A proof-of-concept

Author: Boeschoten, Laura, Mendrik, Adriënne, van der Veen, Emiel, Vloothuis, Jeroen, Hu, Haili, Voorvaart, Roos, and Oberski, Daniel L.
Published: 2022

19. A test for monitoring under- and overtreatment in Dutch hospitals

Author: Lenz, Oliver Urs and Oberski, Daniel L
Subjects: Statistics - Applications
Abstract: Over- and undertreatment harm patients and society and confound other healthcare quality measures. Despite a growing body of research covering specific conditions, we lack tools to systematically detect and measure over- and undertreatment in hospitals. We demonstrate a test used to monitor over- and undertreatment in Dutch hospitals, and illustrate its results applied to the aggregated administrative treatment data of 1,836,349 patients at 89 hospitals in 2013. We employ a random effects model to create risk-adjusted funnel plots that account for natural variation among hospitals, allowing us to estimate a measure of overtreatment and undertreatment when hospitals fall outside the control limits. The results of this test are not definitive; findings were discussed with hospitals to improve the model and to enable the hospitals to make informed treatment decisions.
Published: 2017

20. PATCH -- Psychometrics-AssisTed benCHmarking of Large Language Models: A Case Study of Mathematics Proficiency

Author: Fang, Qixiang, Oberski, Daniel L., and Nguyen, Dong
Abstract: Many existing benchmarks of large (multimodal) language models (LLMs) focus on measuring LLMs' academic proficiency, often with also an interest in comparing model performance with human test takers. While these benchmarks have proven key to the development of LLMs, they suffer from several limitations, including questionable measurement quality (e.g., Do they measure what they are supposed to in a reliable way?), lack of quality assessment on the item level (e.g., Are some items more important or difficult than others?) and unclear human population reference (e.g., To whom can the model be compared?). In response to these challenges, we propose leveraging knowledge from psychometrics - a field dedicated to the measurement of latent variables like academic proficiency - into LLM benchmarking. We make three primary contributions. First, we introduce PATCH: a novel framework for Psychometrics-AssisTed benCHmarking of LLMs. PATCH addresses the aforementioned limitations, presenting a new direction for LLM benchmark research. Second, we implement PATCH by measuring GPT-4 and Gemini-Pro-Vision's proficiency in 8th grade mathematics against 56 human populations. We show that adopting a psychometrics-based approach yields evaluation outcomes that diverge from those based on existing benchmarking practices. Third, we release 4 datasets to support measuring and comparing LLM proficiency in grade school mathematics and science against human populations.
Published: 2024

21. An open source machine learning framework for efficient and transparent systematic reviews

Author: van de Schoot, Rens, de Bruin, Jonathan, Schram, Raoul, Zahedi, Parisa, de Boer, Jan, Weijdema, Felix, Kramer, Bianca, Huijts, Martijn, Hoogerwerf, Maarten, Ferdinands, Gerbrich, Harkema, Albert, Willemsen, Joukje, Ma, Yongchao, Fang, Qixiang, Hindriks, Sybren, Tummers, Lars, and Oberski, Daniel L.
Published: 2021

22. Rank-deficiencies in a reduced information latent variable model

Author: Oberski, Daniel L., primary
Published: 2021

23. ETM: Enrichment by topic modeling for automated clinical sentence classification to detect patients’ disease history

Author: Bagheri, Ayoub, Sammani, Arjan, van der Heijden, Peter G. M., Asselbergs, Folkert W., and Oberski, Daniel L.
Published: 2020

24. Shrinkage priors for Bayesian penalized regression

Author: van Erp, Sara, Oberski, Daniel L., and Mulder, Joris
Published: 2019

25. Bayesian multivariate control charts for multivariate profiles monitoring.

Author: Ahmadi Yazdi, Ahmad, Shafiee Kamalabad, Mahdi, Oberski, Daniel L., and Grzegorczyk, Marco
Subjects: QUALITY control charts, TOPICAL drug administration, REGRESSION analysis, PRODUCT quality
Abstract: In many topical applications, the product's quality can be well described in terms of statistical regression relationships between one or more responses and a set of explanatory variables. In the literature, various types of regression models have been proposed for profile monitoring applications, and each of those regression models can be implemented and applied in its standard frequentist and its Bayesian variant. We formulate two popular Phase II multivariate cumulative sum control charts for monitoring multivariate linear profiles in terms of Bayesian regression models, and we show empirically that the resulting new Bayesian control charts perform better than the corresponding non-Bayesian control charts. For the comparative evaluation of the control charts we employ the average run length criterion. Moreover, we propose a new Bayesian approach, which we refer to as the informative prior generation method. The key idea of this method is to make use of historical datasets to generate informative prior distributions. The advantage of this method is that we do not ignore the historical data from Phase I. Instead we re-use it to construct informative prior distributions for Phase II monitoring. The applicability and the superiority of the proposed Bayesian control charts are illustrated through extensive simulation studies.
Published: 2024

26. Understanding financial distress by using Markov random fields on linked administrative data

Author: Fonville, Floris, primary, Heijden, Peter G.M. van der, additional, Siebes, Arno P.J.M., additional, and Oberski, Daniel L., additional
Published: 2023

27. A robust mRNA signature obtained via recursive ensemble feature selection predicts the responsiveness of omalizumab in moderate‐to‐severe asthma

Author: Kidwai, Sarah, primary, Barbiero, Pietro, additional, Meijerman, Irma, additional, Tonda, Alberto, additional, Perez‐Pardo, Paula, additional, Liò, Pietro, additional, van der Maitland‐Zee, Anke H., additional, Oberski, Daniel L., additional, Kraneveld, Aletta D., additional, and Lopez‐Rincon, Alejandro, additional
Published: 2023

28. Statistical challenges of administrative and transaction data

Author: Hand, David J., Babb, Penny, Zhang, Li-Chun, Allin, Paul, Wallgren, Anders, Wallgren, Britt, Blunt, Gordon, Garrett, Andrew, Murtagh, Fionn, Smith, Peter W. F., Elliott, Duncan, Nason, Guy, Powell, Ben, Moore, Jamie C., Durrant, Gabriele B., Smith, Paul A., Chambers, Raymond L., Herzberg, Agnes M., Pilling, Mark, Appleby, Wendy, Barnett, Arthur, Bhansali, Rajendra, Bharadwaj, Neeraj, Dong, Yuexiao, van den Brakel, J. A., Budd, Lisa, Doidge, James, Gilbert, Ruth, Francis, Brian, Frisoli, Kayla, Nugent, Rebecca, Perez, Francisco Javier García, Lara, Libia, Porcu, Emilio, Henry, Sarah, Hunt, Ian, Ieva, Francesca, Gasperoni, Francesca, Jansson, Ingegerd, Kumar, Kuldeep, Longford, Nick, Manninen, Asta, Mateu, Jorge, McNicholas, Paul D., McNicholas, Sharon M., Tait, Peter A., Mehew, Jenny, Oberski, Daniel L., Ruiz, Marcelo, Yohai, Victor J., Zamar, Ruben, Stehlík, Milan, Stehlíková, Silvia, Soza, Ludy Núñez, Towers, Jude, and Wijayatunga, Priyantha
Published: 2018

29. Developing a personalized remote patient monitoring algorithm: a proof-of-concept in heart failure

Author: Moazeni, Mehran, primary, Numan, Lieke, additional, Brons, Maaike, additional, Houtgraaf, Jaco, additional, Rutten, Frans H, additional, Oberski, Daniel L, additional, van Laake, Linda W, additional, Asselbergs, Folkert W, additional, and Aarts, Emmeke, additional
Published: 2023

30. Identifying multivariate disease trajectories and potential phenotypes of early knee osteoarthritis in the CHECK cohort

Author: Altamirano, Sara, primary, Jansen, Mylène P., additional, Oberski, Daniel L., additional, Eijkemans, Marinus J. C., additional, Mastbergen, Simon C., additional, Lafeber, Floris P. J. G., additional, van Spil, Willem E., additional, and Welsing, Paco M. J., additional
Published: 2023

31. Life-threatening ventricular arrhythmia prediction in patients with dilated cardiomyopathy using explainable electrocardiogram-based deep neural networks

Author: Sammani, Arjan, van de Leur, Rutger R., Henkens, Michiel T. H. M., Meine, Mathias, Loh, Peter, Hassink, Rutger J., Oberski, Daniel L., Heymans, Stephane R. B., Doevendans, Pieter A., Asselbergs, Folkert W., te Riele, Anneline S. J. M., van Es, Rene, Leerstoel Oberski, Methodology and statistics for the behavioural and social sciences, Cardiologie, RS: Carim - H02 Cardiomyopathy, and MUMC+: MA Med Staf Spec Cardiologie (9)
Subjects: Cardiomyopathy, Dilated, Male, Science & Technology, Cardiac & Cardiovascular Systems, Dilated cardiomyopathy, Arrhythmias, Cardiac, Stroke Volume, Middle Aged, Deep neural network, Prognosis, Ventricular Function, Left, Implantable cardioverter-defibrillator, Defibrillators, Implantable, Electrocardiography, Sudden cardiac death, Death, Sudden, Cardiac, Risk Factors, Physiology (medical), Cardiovascular System & Cardiology, Humans, HEART, Female, Neural Networks, Computer, Cardiology and Cardiovascular Medicine, Life Sciences & Biomedicine
Abstract: Aims: While electrocardiogram (ECG) characteristics have been associated with life-threatening ventricular arrhythmias (LTVA) in dilated cardiomyopathy (DCM), they typically rely on human-derived parameters. Deep neural networks (DNNs) can discover complex ECG patterns, but the interpretation is hampered by their ‘black-box’ characteristics. We aimed to detect DCM patients at risk of LTVA using an inherently explainable DNN. Methods and results: In this two-phase study, we first developed a variational autoencoder DNN on more than 1 million 12-lead median beat ECGs, compressing the ECG into 21 different factors (F): FactorECG. Next, we used two cohorts with a combined total of 695 DCM patients and entered these factors in a Cox regression for the composite LTVA outcome, which was defined as sudden cardiac arrest, spontaneous sustained ventricular tachycardia, or implantable cardioverter-defibrillator treated ventricular arrhythmia. Most patients were male (n = 442, 64%) with a median age of 54 years [interquartile range (IQR) 44–62], and median left ventricular ejection fraction of 30% (IQR 23–39). A total of 115 patients (16.5%) reached the study outcome. Factors F8 (prolonged PR-interval and P-wave duration, P Conclusion: Inherently explainable DNNs can detect patients at risk of LTVA, which is mainly driven by P-wave abnormalities.
Published: 2022

32. Data science: een waanzinnige wetenschap

Author: Oberski, Daniel L.
Abstract: Inaugural lecture given on June 16th, 2023 by the author by way of acceptance of the title of Professor in Social & Health Data Science at Utrecht University, The Netherlands. The current document is the Dutch-language version. A translation into English is forthcoming.
Published: 2023
Full Text: View/download PDF

33. Machine Learning—Evaluation (Cross-validation, Metrics, Importance Scores...)

Author: Sub Data Intensive Systems, Data Intensive Systems, Asselbergs, Folkert W., Denaxas, Spiros, Oberski, Daniel L., Moore, Jason H., and Qahtan, Hakim
Published: 2023

34. On Text-based Personality Computing: Challenges and Future Directions

Author: Leerstoel Oberski, Methodology and statistics for the behavioural and social sciences, Fang, Qixiang, Giachanou, Anastasia, Bagheri, Ayoub, Boeschoten, Laura, van Kesteren, Erik Jan, Kamalabad, Mahdi Shafiee, and Oberski, Daniel L.
Published: 2023

35. Identifying multivariate disease trajectories and potential phenotypes of early knee osteoarthritis in the CHECK cohort

Author: Leerstoel Oberski, Methodology and statistics for the behavioural and social sciences, Altamirano, Sara, Jansen, Mylène P, Oberski, Daniel L, Eijkemans, Marinus J C, Mastbergen, Simon C, Lafeber, Floris P J G, van Spil, Willem E, and Welsing, Paco M J
Published: 2023

36. A robust mRNA signature obtained via recursive ensemble feature selection predicts the responsiveness of omalizumab in moderate-to-severe asthma

Author: Pharmacology, Afd Pharmacology, Leerstoel Oberski, Methodology and statistics for the behavioural and social sciences, Kidwai, Sarah, Barbiero, Pietro, Meijerman, Irma, Tonda, Alberto, Perez-Pardo, Paula, Lio, Pietro, van der Maitland-Zee, Anke H, Oberski, Daniel L, Kraneveld, Aletta D, and Lopez-Rincon, Alejandro
Published: 2023

37. Performance of active learning models for screening prioritization in systematic reviews: a simulation study into the Average Time to Discover relevant records

Author: Methodology and statistics for the behavioural and social sciences, Research & Data Management Services, Leerstoel Oberski, Public management en gedrag, Leerstoel Schoot, Ferdinands, Gerbrich, Schram, Raoul, de Bruin, Jonathan, Bagheri, Ayoub, Oberski, Daniel L., Tummers, Lars, Teijema, Jelle Jasper, and van de Schoot, Rens
Published: 2023

38. Developing a personalized remote patient monitoring algorithm: a proof-of-concept in heart failure

Author: Leerstoel Klugkist, Leerstoel Oberski, Methodology and statistics for the behavioural and social sciences, Moazeni, Mehran, Numan, Lieke, Brons, Maaike, Houtgraaf, Jaco, Rutten, Frans H, Oberski, Daniel L, Laake, Linda W van, Asselbergs, Folkert W, and Aarts, Emmeke
Published: 2023

39. Why Measurement Invariance is Important in Comparative Research. A Response to Welzel et al. (2021)

Author: Leerstoel Oberski, Methodology and statistics for the behavioural and social sciences, Meuleman, Bart, Żółtak, Tomasz, Pokropek, Artur, Davidov, Eldad, Muthén, Bengt, Oberski, Daniel L., Billiet, Jaak, and Schmidt, Peter
Published: 2023

40. Developing a personalized remote patient monitoring algorithm: a proof-of-concept in heart failure

Author: Onderzoek Precision medicine, Verpleegkundig Specialisten Cardiologie, HAG Hart- Vaatziekten, Circulatory Health, JC onderzoeksprogramma Cardiovasculaire Epidemiologie, Datascience, Regenerative Medicine and Stem Cells, Team Medisch, Moazeni, Mehran, Numan, Lieke, Brons, Maaike, Houtgraaf, Jaco, Rutten, Frans H, Oberski, Daniel L, van Laake, Linda W, Asselbergs, Folkert W, and Aarts, Emmeke
Published: 2023

41. Identifying multivariate disease trajectories and potential phenotypes of early knee osteoarthritis in the CHECK cohort

Author: Onderzoek Reumatologie, Infection & Immunity, Regenerative Medicine and Stem Cells, Datascience, Circulatory Health, Biostatistiek Onderzoek, MS Reumatologie/Immunologie/Infectie, Lab Reumatologie/Klinische Immunologie, JC onderzoeksprogramma Methodologie, Altamirano, Sara, Jansen, Mylène P., Oberski, Daniel L., Eijkemans, Marinus J.C., Mastbergen, Simon C., Lafeber, Floris P.J.G., van Spil, Willem E., and Welsing, Paco M.J.
Published: 2023

42. Why Measurement Invariance is Important in Comparative Research. A Response to Welzel et al. (2021)

Author: Datascience, Circulatory Health, Meuleman, Bart, Żółtak, Tomasz, Pokropek, Artur, Davidov, Eldad, Muthén, Bengt, Oberski, Daniel L., Billiet, Jaak, and Schmidt, Peter
Published: 2023

43. Why Measurement Invariance is Important in Comparative Research. A Response to Welzel et al. (2021)

Author: Meuleman, Bart; https://orcid.org/0000-0002-0384-5995, Żółtak, Tomasz; https://orcid.org/0000-0003-1354-4472, Pokropek, Artur; https://orcid.org/0000-0002-5899-2917, Davidov, Eldad; https://orcid.org/0000-0002-3396-969X, Muthén, Bengt, Oberski, Daniel L, Billiet, Jaak, and Schmidt, Peter; https://orcid.org/0000-0001-6954-8590
Abstract: Welzel et al. (2021) claim that non-invariance of instruments is inconclusive and inconsequential in the field of cross-cultural value measurement. In this response, we contend that several key arguments on which Welzel et al. (2021) base their critique of invariance testing are conceptually and statistically incorrect. First, Welzel et al. (2021) claim that value measurement follows a formative rather than reflective logic. Yet they do not provide sufficient theoretical arguments for this conceptualization, nor do they discuss the disadvantages of this approach for validation of instruments. Second, their claim that strong inter-item correlations cannot be retrieved when means are close to the endpoint of scales ignores the existence of factor-analytic approaches for ordered-categorical indicators. Third, Welzel et al. (2021) propose that rather than relying on invariance tests, comparability can be assessed by studying the connection with theoretically related constructs. However, their proposal ignores that external validation through nomological linkages hinges on the assumption of comparability. By means of two examples, we illustrate that violating the assumptions of measurement invariance can distort conclusions substantially. Following the advice of Welzel et al. (2021) implies discarding a tool that has proven to be very useful for comparativists.
Published: 2023

44. Designing and Evaluating General-Purpose User Representations Based on Behavioral Logs from a Measurement Process Perspective: A Case Study with Snapchat

Author: Fang, Qixiang, Zhou, Zhihan, Barbieri, Francesco, Liu, Yozen, Neves, Leonardo, Nguyen, Dong, Oberski, Daniel L., Bos, Maarten W., and Dotsch, Ron
Abstract: In human-computer interaction, understanding user behaviors and tailoring systems accordingly is pivotal. To this end, general-purpose user representation learning based on behavior logs is emerging as a powerful tool in user modeling, offering adaptability to various downstream tasks such as item recommendations and ad conversion prediction, without the need to fine-tune the upstream user model. While this methodology has shown promise in contexts like search engines and e-commerce platforms, its fit for instant messaging apps, a cornerstone of modern digital communication, remains largely uncharted. These apps, with their distinct interaction patterns, data structures, and user expectations, necessitate specialized attention. We explore this user modeling approach with Snapchat data as a case study. Furthermore, we introduce a novel design and evaluation framework rooted in the principles of the Measurement Process Framework from social science research methodology. Using this new framework, we design a Transformer-based user model that can produce high-quality general-purpose user representations for instant messaging platforms like Snapchat.
Published: 2023

45. Problems in detecting misfit of latent class models in diagnostic research without a gold standard were shown

Author: van Smeden, Maarten, Oberski, Daniel L., Reitsma, Johannes B., Vermunt, Jeroen K., Moons, Karel G.M., and de Groot, Joris A.H.
Published: 2016
Full Text: View/download PDF

46. GOODNESS-OF-FIT OF MULTILEVEL LATENT CLASS MODELS FOR CATEGORICAL DATA

Author: Nagelkerke, Erwin, Oberski, Daniel L., and Vermunt, Jeroen K.
Published: 2016

47. Sensitivity Analysis

Author: Oberski, Daniel L., primary
Published: 2018
Full Text: View/download PDF

48. Evaluating Measurement Invariance in Categorical Data Latent Variable Models with the EPC-Interest

Author: Oberski, Daniel L., Vermunt, Jeroen K., and Moors, Guy B. D.
Published: 2015

49. The effect of individual characteristics on reports of socially desirable attitudes toward immigration

Author: Oberski, Daniel L., Weber, Wiebke, Révilla, Mélanie, Salzborn, Samuel, editor, Davidov, Eldad, editor, and Reinecke, Jost, editor
Published: 2012
Full Text: View/download PDF

50. Comparability of Survey Measurements

Author: Oberski, Daniel L. and Gideon, Lior, editor
Published: 2012
Full Text: View/download PDF
