Author: "Bertsimas, Dimitris" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Bertsimas, Dimitris"' showing total 1,679 results

Start Over Author "Bertsimas, Dimitris"

1,679 results on '"Bertsimas, Dimitris"'

1. Deep Trees for (Un)structured Data: Tractability, Performance, and Interpretability

Author: Bertsimas, Dimitris, Everest, Lisa, Gu, Jiayi, Peroni, Matthew, and Stoumpou, Vasiliki
Subjects: Computer Science - Machine Learning
Abstract: Decision Trees have remained a popular machine learning method for tabular datasets, mainly due to their interpretability. However, they lack the expressiveness needed to handle highly nonlinear or unstructured datasets. Motivated by recent advances in tree-based machine learning (ML) techniques and first-order optimization methods, we introduce Generalized Soft Trees (GSTs), which extend soft decision trees (STs) and are capable of processing images directly. We demonstrate their advantages with respect to tractability, performance, and interpretability. We develop a tractable approach to growing GSTs, given by the DeepTree algorithm, which, in addition to new regularization terms, produces high-quality models with far fewer nodes and greater interpretability than traditional soft trees. We test the performance of our GSTs on benchmark tabular and image datasets, including MIMIC-IV, MNIST, Fashion MNIST, CIFAR-10 and Celeb-A. We show that our approach outperforms other popular tree methods (CART, Random Forests, XGBoost) in almost all of the datasets, with Convolutional Trees having a significant edge in the hardest CIFAR-10 and Fashion MNIST datasets. Finally, we explore the interpretability of our GSTs and find that even the most complex GSTs are considerably more interpretable than deep neural networks. Overall, our approach of Generalized Soft Trees provides a tractable method that is high-performing on (un)structured datasets and preserves interpretability more than traditional deep learning methods., Comment: Submitted to Machine Learning. Authors are listed in alphabetical order
Published: 2024

2. Binary Classification: Is Boosting stronger than Bagging?

Author: Bertsimas, Dimitris and Stoumpou, Vasiliki
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Random Forests have been one of the most popular bagging methods in the past few decades, especially due to their success at handling tabular datasets. They have been extensively studied and compared to boosting models, like XGBoost, which are generally considered more performant. Random Forests adopt several simplistic assumptions, such that all samples and all trees that form the forest are equally important for building the final model. We introduce Enhanced Random Forests, an extension of vanilla Random Forests with extra functionalities and adaptive sample and model weighting. We develop an iterative algorithm for adapting the training sample weights, by favoring the hardest examples, and an approach for finding personalized tree weighting schemes for each new sample. Our method significantly improves upon regular Random Forests across 15 different binary classification datasets and considerably outperforms other tree methods, including XGBoost, when run with default hyperparameters, which indicates the robustness of our approach across datasets, without the need for extensive hyperparameter tuning. Our tree-weighting methodology results in enhanced or comparable performance to the uniformly weighted ensemble, and is, more importantly, leveraged to define importance scores for trees based on their contributions to classifying each new sample. This enables us to only focus on a small number of trees as the main models that define the outcome of a new sample and, thus, to partially recover interpretability, which is critically missing from both bagging and boosting methods. In binary classification problems, the proposed extensions and the corresponding results suggest the equivalence of bagging and boosting methods in performance, and the edge of bagging in interpretability by leveraging a few learners of the ensemble, which is not an option in the less explainable boosting methods.
Published: 2024

3. Predictive Low Rank Matrix Learning under Partial Observations: Mixed-Projection ADMM

Author: Bertsimas, Dimitris and Johnson, Nicholas A. G.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We study the problem of learning a partially observed matrix under the low rank assumption in the presence of fully observed side information that depends linearly on the true underlying matrix. This problem consists of an important generalization of the Matrix Completion problem, a central problem in Statistics, Operations Research and Machine Learning, that arises in applications such as recommendation systems, signal processing, system identification and image denoising. We formalize this problem as an optimization problem with an objective that balances the strength of the fit of the reconstruction to the observed entries with the ability of the reconstruction to be predictive of the side information. We derive a mixed-projection reformulation of the resulting optimization problem and present a strong semidefinite cone relaxation. We design an efficient, scalable alternating direction method of multipliers algorithm that produces high quality feasible solutions to the problem of interest. Our numerical results demonstrate that in the small rank regime ($k \leq 15$), our algorithm outputs solutions that achieve on average $79\%$ lower objective value and $90.1\%$ lower $\ell_2$ reconstruction error than the solutions returned by the best performing benchmark method on synthetic data. The runtime of our algorithm is competitive with and often superior to that of the benchmark methods. Our algorithm is able to solve problems with $n = 10000$ rows and $m = 10000$ columns in less than a minute. On large scale real world data, our algorithm produces solutions that achieve $67\%$ lower out of sample error than benchmark methods in $97\%$ less execution time.
Published: 2024

4. Policy Trees for Prediction: Interpretable and Adaptive Model Selection for Machine Learning

Author: Bertsimas, Dimitris and Peroni, Matthew
Subjects: Computer Science - Machine Learning
Abstract: As a multitude of capable machine learning (ML) models become widely available in forms such as open-source software and public APIs, central questions remain regarding their use in real-world applications, especially in high-stakes decision-making. Is there always one best model that should be used? When are the models likely to be error-prone? Should a black-box or interpretable model be used? In this work, we develop a prescriptive methodology to address these key questions, introducing a tree-based approach, Optimal Predictive-Policy Trees (OP2T), that yields interpretable policies for adaptively selecting a predictive model or ensemble, along with a parameterized option to reject making a prediction. We base our methods on learning globally optimized prescriptive trees. Our approach enables interpretable and adaptive model selection and rejection while only assuming access to model outputs. By learning policies over different feature spaces, including the model outputs, our approach works with both structured and unstructured datasets. We evaluate our approach on real-world datasets, including regression and classification tasks with both structured and unstructured data. We demonstrate that our approach provides both strong performance against baseline methods while yielding insights that help answer critical questions about which models to use, and when., Comment: Submitted to JMLR on 5/30/2024
Published: 2024

5. Catastrophe Insurance: An Adaptive Robust Optimization Approach

Author: Bertsimas, Dimitris and Zeng, Cynthia
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning
Abstract: The escalating frequency and severity of natural disasters, exacerbated by climate change, underscore the critical role of insurance in facilitating recovery and promoting investments in risk reduction. This work introduces a novel Adaptive Robust Optimization (ARO) framework tailored for the calculation of catastrophe insurance premiums, with a case study applied to the United States National Flood Insurance Program (NFIP). To the best of our knowledge, it is the first time an ARO approach has been applied to for disaster insurance pricing. Our methodology is designed to protect against both historical and emerging risks, the latter predicted by machine learning models, thus directly incorporating amplified risks induced by climate change. Using the US flood insurance data as a case study, optimization models demonstrate effectiveness in covering losses and produce surpluses, with a smooth balance transition through parameter fine-tuning. Among tested optimization models, results show ARO models with conservative parameter values achieving low number of insolvent states with the least insurance premium charged. Overall, optimization frameworks offer versatility and generalizability, making it adaptable to a variety of natural disaster scenarios, such as wildfires, droughts, etc. This work not only advances the field of insurance premium modeling but also serves as a vital tool for policymakers and stakeholders in building resilience to the growing risks of natural catastrophes.
Published: 2024

6. M3H: Multimodal Multitask Machine Learning for Healthcare

Author: Bertsimas, Dimitris and Ma, Yu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Developing an integrated many-to-many framework leveraging multimodal data for multiple tasks is crucial to unifying healthcare applications ranging from diagnoses to operations. In resource-constrained hospital environments, a scalable and unified machine learning framework that improves previous forecast performances could improve hospital operations and save costs. We introduce M3H, an explainable Multimodal Multitask Machine Learning for Healthcare framework that consolidates learning from tabular, time-series, language, and vision data for supervised binary/multiclass classification, regression, and unsupervised clustering. It features a novel attention mechanism balancing self-exploitation (learning source-task), and cross-exploration (learning cross-tasks), and offers explainability through a proposed TIM score, shedding light on the dynamics of task learning interdependencies. M3H encompasses an unprecedented range of medical tasks and machine learning problem classes and consistently outperforms traditional single-task models by on average 11.6% across 40 disease diagnoses from 16 medical departments, three hospital operation forecasts, and one patient phenotyping task. The modular design of the framework ensures its generalizability in data processing, task definition, and rapid model prototyping, making it production ready for both clinical and operational healthcare settings, especially those in constrained environments.
Published: 2024

7. Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences

Author: Bertsimas, Dimitris, Digalakis Jr, Vassilis, Ma, Yu, and Paschalidis, Phevos
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Optimization and Control
Abstract: We consider the task of retraining machine learning (ML) models when new batches of data become available. Existing methods focus largely on greedy approaches to find the best-performing model for each batch, without considering the stability of the model's structure across retraining iterations. In this study, we propose a methodology for finding sequences of ML models that are stable across retraining iterations. We develop a mixed-integer optimization formulation that is guaranteed to recover Pareto optimal models (in terms of the predictive power-stability trade-off) and an efficient polynomial-time algorithm that performs well in practice. We focus on retaining consistent analytical insights - which is important to model interpretability, ease of implementation, and fostering trust with users - by using custom-defined distance metrics that can be directly incorporated into the optimization problem. Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power, as evidenced through a real-world case study in a major hospital system in Connecticut.
Published: 2024

8. Global optimization: a machine learning approach

Author: Bertsimas, Dimitris and Margaritis, Georgios
Published: 2024
Full Text: View/download PDF

9. Compressed sensing: a discrete optimization approach

Author: Bertsimas, Dimitris and Johnson, Nicholas A. G.
Published: 2024
Full Text: View/download PDF

10. Adaptive Optimization for Prediction with Missing Data

Author: Bertsimas, Dimitris, Delarue, Arthur, and Pauphilet, Jean
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: When training predictive models on data with missing entries, the most widely used and versatile approach is a pipeline technique where we first impute missing entries and then compute predictions. In this paper, we view prediction with missing data as a two-stage adaptive optimization problem and propose a new class of models, adaptive linear regression models, where the regression coefficients adapt to the set of observed features. We show that some adaptive linear regression models are equivalent to learning an imputation rule and a downstream linear regression model simultaneously instead of sequentially. We leverage this joint-impute-then-regress interpretation to generalize our framework to non-linear models. In settings where data is strongly not missing at random, our methods achieve a 2-10% improvement in out-of-sample accuracy., Comment: arXiv admin note: text overlap with arXiv:2104.03158
Published: 2024

11. Universal Neurons in GPT2 Language Models

Author: Gurnee, Wes, Horsley, Theo, Guo, Zifan Carl, Kheirkhah, Tara Rezaei, Sun, Qinyi, Hathaway, Will, Nanda, Neel, and Bertsimas, Dimitris
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: A basic question within the emerging field of mechanistic interpretability is the degree to which neural networks learn the same underlying mechanisms. In other words, are neural mechanisms universal across different models? In this work, we study the universality of individual neurons across GPT2 models trained from different initial random seeds, motivated by the hypothesis that universal neurons are likely to be interpretable. In particular, we compute pairwise correlations of neuron activations over 100 million tokens for every neuron pair across five different seeds and find that 1-5\% of neurons are universal, that is, pairs of neurons which consistently activate on the same inputs. We then study these universal neurons in detail, finding that they usually have clear interpretations and taxonomize them into a small number of neuron families. We conclude by studying patterns in neuron weights to establish several universal functional roles of neurons in simple circuits: deactivating attention heads, changing the entropy of the next token distribution, and predicting the next token to (not) be within a particular set.
Published: 2024

12. Robust Regression over Averaged Uncertainty

Author: Bertsimas, Dimitris and Ma, Yu
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: We propose a new formulation of robust regression by integrating all realizations of the uncertainty set and taking an averaged approach to obtain the optimal solution for the ordinary least squares regression problem. We show that this formulation recovers ridge regression exactly and establishes the missing link between robust optimization and the mean squared error approaches for existing regression problems. We further demonstrate that the condition of this equivalence relies on the geometric properties of the defined uncertainty set. We provide exact, closed-form, in some cases, analytical solutions to the equivalent regularization strength under uncertainty sets induced by $\ell_p$ norm, Schatten $p$-norm, and general polytopes. We then show in synthetic datasets with different levels of uncertainties, a consistent improvement of the averaged formulation over the existing worst-case formulation in out-of-sample performance. In real-world regression problems obtained from UCI datasets, similar improvements are seen in the out-of-sample datasets.
Published: 2023

13. Global Optimization: A Machine Learning Approach

Author: Bertsimas, Dimitris and Margaritis, Georgios
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning
Abstract: Many approaches for addressing Global Optimization problems typically rely on relaxations of nonlinear constraints over specific mathematical primitives. This is restricting in applications with constraints that are black-box, implicit or consist of more general primitives. Trying to address such limitations, Bertsimas and Ozturk (2023) proposed OCTHaGOn as a way of solving black-box global optimization problems by approximating the nonlinear constraints using hyperplane-based Decision-Trees and then using those trees to construct a unified mixed integer optimization (MIO) approximation of the original problem. We provide extensions to this approach, by (i) approximating the original problem using other MIO-representable ML models besides Decision Trees, such as Gradient Boosted Trees, Multi Layer Perceptrons and Suport Vector Machines, (ii) proposing adaptive sampling procedures for more accurate machine learning-based constraint approximations, (iii) utilizing robust optimization to account for the uncertainty of the sample-dependent training of the ML models, and (iv) leveraging a family of relaxations to address the infeasibilities of the final MIO approximation. We then test the enhanced framework in 81 Global Optimization instances. We show improvements in solution feasibility and optimality in the majority of instances. We also compare against BARON, showing improved optimality gaps or solution times in 11 instances., Comment: Submitted to the Journal of Global Optimization. 35 pages
Published: 2023

14. The R.O.A.D. to precision medicine

Author: Bertsimas, Dimitris, Koulouras, Angelos G., and Margonis, Georgios Antonios
Subjects: Statistics - Applications, Computer Science - Artificial Intelligence, Statistics - Methodology
Abstract: We propose a prognostic stratum matching framework that addresses the deficiencies of Randomized trial data subgroup analysis and transforms ObservAtional Data to be used as if they were randomized, thus paving the road for precision medicine. Our approach counters the effects of unobserved confounding in observational data by correcting the estimated probabilities of the outcome under a treatment through a novel two-step process. These probabilities are then used to train Optimal Policy Trees (OPTs), which are decision trees that optimally assign treatments to subgroups of patients based on their characteristics. This facilitates the creation of clinically intuitive treatment recommendations. We applied our framework to observational data of patients with gastrointestinal stromal tumors (GIST) and validated the OPTs in an external cohort using the sensitivity and specificity metrics. We show that these recommendations outperformed those of experts in GIST. We further applied the same framework to randomized clinical trial (RCT) data of patients with extremity sarcomas. Remarkably, despite the initial trial results suggesting that all patients should receive treatment, our framework, after addressing imbalances in patient distribution due to the trial's small sample size, identified through the OPTs a subset of patients with unique characteristics who may not require treatment. Again, we successfully validated our recommendations in an external cohort.
Published: 2023

15. Adaptive Pricing in Unit Commitment Under Load and Capacity Uncertainty

Author: Bertsimas, Dimitris and Koulouras, Angelos G.
Subjects: Mathematics - Optimization and Control
Abstract: The increase of renewables in the grid and the volatility of the load create uncertainties in the day-ahead prices of electricity markets. Adaptive robust optimization (ARO) and stochastic optimization have been used to make commitment and dispatch decisions that adapt to the load and capacity uncertainty. These approaches have been successfully applied in practice but current pricing approaches used by US Independent System Operators (marginal pricing) and proposed in the literature (convex hull pricing) have two major disadvantages: a) they are deterministic in nature, that is they do not adapt to the load and capacity uncertainty, and b) require uplift payments to the generators that are typically determined by ad hoc procedures and create inefficiencies that motivate self-scheduling. In this work, we extend pay-as-bid and uniform pricing mechanisms to propose the first adaptive pricing method in electricity markets that adapts to the load and capacity uncertainty, eliminates post-market uplifts and deters self-scheduling, addressing both disadvantages.
Published: 2023

16. A Machine Learning Approach to Two-Stage Adaptive Robust Optimization

Author: Bertsimas, Dimitris and Kim, Cheol Woo
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: We propose an approach based on machine learning to solve two-stage linear adaptive robust optimization (ARO) problems with binary here-and-now variables and polyhedral uncertainty sets. We encode the optimal here-and-now decisions, the worst-case scenarios associated with the optimal here-and-now decisions, and the optimal wait-and-see decisions into what we denote as the strategy. We solve multiple similar ARO instances in advance using the column and constraint generation algorithm and extract the optimal strategies to generate a training set. We train a machine learning model that predicts high-quality strategies for the here-and-now decisions, the worst-case scenarios associated with the optimal here-and-now decisions, and the wait-and-see decisions. We also introduce an algorithm to reduce the number of different target classes the machine learning algorithm needs to be trained on. We apply the proposed approach to the facility location, the multi-item inventory control and the unit commitment problems. Our approach solves ARO problems drastically faster than the state-of-the-art algorithms with high accuracy.
Published: 2023

17. Optimal Control of Multiclass Fluid Queueing Networks: A Machine Learning Approach

Author: Bertsimas, Dimitris and Kim, Cheol Woo
Subjects: Computer Science - Machine Learning
Abstract: We propose a machine learning approach to the optimal control of multiclass fluid queueing networks (MFQNETs) that provides explicit and insightful control policies. We prove that a threshold type optimal policy exists for MFQNET control problems, where the threshold curves are hyperplanes passing through the origin. We use Optimal Classification Trees with hyperplane splits (OCT-H) to learn an optimal control policy for MFQNETs. We use numerical solutions of MFQNET control problems as a training set and apply OCT-H to learn explicit control policies. We report experimental results with up to 33 servers and 99 classes that demonstrate that the learned policies achieve 100\% accuracy on the test set. While the offline training of OCT-H can take days in large networks, the online application takes milliseconds.
Published: 2023

18. Compressed Sensing: A Discrete Optimization Approach

Author: Bertsimas, Dimitris and Johnson, Nicholas A. G.
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We study the Compressed Sensing (CS) problem, which is the problem of finding the most sparse vector that satisfies a set of linear measurements up to some numerical tolerance. We introduce an $\ell_2$ regularized formulation of CS which we reformulate as a mixed integer second order cone program. We derive a second order cone relaxation of this problem and show that under mild conditions on the regularization parameter, the resulting relaxation is equivalent to the well studied basis pursuit denoising problem. We present a semidefinite relaxation that strengthens the second order cone relaxation and develop a custom branch-and-bound algorithm that leverages our second order cone relaxation to solve small-scale instances of CS to certifiable optimality. When compared against solutions produced by three state of the art benchmark methods on synthetic data, our numerical results show that our approach produces solutions that are on average $6.22\%$ more sparse. When compared only against the experiment-wise best performing benchmark method on synthetic data, our approach produces solutions that are on average $3.10\%$ more sparse. On real world ECG data, for a given $\ell_2$ reconstruction error our approach produces solutions that are on average $9.95\%$ more sparse than benchmark methods ($3.88\%$ more sparse if only compared against the best performing benchmark), while for a given sparsity level our approach produces solutions that have on average $10.77\%$ lower reconstruction error than benchmark methods ($1.42\%$ lower error if only compared against the best performing benchmark). When used as a component of a multi-label classification algorithm, our approach achieves greater classification accuracy than benchmark compressed sensing methods. This improved accuracy comes at the cost of an increase in computation time by several orders of magnitude.
Published: 2023

19. Pareto Adaptive Robust Optimality via a Fourier–Motzkin Elimination lens

Author: Bertsimas, Dimitris, ten Eikelder, Stefan C. M., den Hertog, Dick, and Trichakis, Nikolaos
Published: 2024
Full Text: View/download PDF

20. Improving Stability in Decision Tree Models

Author: Bertsimas, Dimitris and Digalakis Jr, Vassilis
Subjects: Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: Owing to their inherently interpretable structure, decision trees are commonly used in applications where interpretability is essential. Recent work has focused on improving various aspects of decision trees, including their predictive power and robustness; however, their instability, albeit well-documented, has been addressed to a lesser extent. In this paper, we take a step towards the stabilization of decision tree models through the lens of real-world health care applications due to the relevance of stability and interpretability in this space. We introduce a new distance metric for decision trees and use it to determine a tree's level of stability. We propose a novel methodology to train stable decision trees and investigate the existence of trade-offs that are inherent to decision tree models - including between stability, predictive power, and interpretability. We demonstrate the value of the proposed methodology through an extensive quantitative and qualitative analysis of six case studies from real-world health care applications, and we show that, on average, with a small 4.6% decrease in predictive power, we gain a significant 38% improvement in the model's stability.
Published: 2023

21. Patient Outcome Predictions Improve Operations at a Large Hospital Network

Author: Na, Liangyuan, Carballo, Kimberly Villalobos, Pauphilet, Jean, Haddad-Sisakht, Ali, Kombert, Daniel, Boisjoli-Langlois, Melissa, Castiglione, Andrew, Khalifa, Maram, Hebbal, Pooja, Stein, Barry, and Bertsimas, Dimitris
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Problem definition: Access to accurate predictions of patients' outcomes can enhance medical staff's decision-making, which ultimately benefits all stakeholders in the hospitals. A large hospital network in the US has been collaborating with academics and consultants to predict short-term and long-term outcomes for all inpatients across their seven hospitals. Methodology/results: We develop machine learning models that predict the probabilities of next 24-hr/48-hr discharge and intensive care unit transfers, end-of-stay mortality and discharge dispositions. All models achieve high out-of-sample AUC (75.7%-92.5%) and are well calibrated. In addition, combining 48-hr discharge predictions with doctors' predictions simultaneously enables more patient discharges (10%-28.7%) and fewer 7-day/30-day readmissions ($p$-value $<0.001$). We implement an automated pipeline that extracts data and updates predictions every morning, as well as user-friendly software and a color-coded alert system to communicate these patient-level predictions (alongside explanations) to clinical teams. Managerial implications: Since we have been gradually deploying the tool, and training medical staff, over 200 doctors, nurses, and case managers across seven hospitals use it in their daily patient review process. We observe a significant reduction in the average length of stay (0.67 days per patient) following its adoption and anticipate substantial financial benefits (between \$55 and \$72 million annually) for the healthcare system., Comment: 41 pages, 13 figures
Published: 2023

22. Optimal Low-Rank Matrix Completion: Semidefinite Relaxations and Eigenvector Disjunctions

Author: Bertsimas, Dimitris, Cory-Wright, Ryan, Lo, Sean, and Pauphilet, Jean
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Low-rank matrix completion consists of computing a matrix of minimal complexity that recovers a given set of observations as accurately as possible. Unfortunately, existing methods for matrix completion are heuristics that, while highly scalable and often identifying high-quality solutions, do not possess any optimality guarantees. We reexamine matrix completion with an optimality-oriented eye. We reformulate these low-rank problems as convex problems over the non-convex set of projection matrices and implement a disjunctive branch-and-bound scheme that solves them to certifiable optimality. Further, we derive a novel and often tight class of convex relaxations by decomposing a low-rank matrix as a sum of rank-one matrices and incentivizing that two-by-two minors in each rank-one matrix have determinant zero. In numerical experiments, our new convex relaxations decrease the optimality gap by two orders of magnitude compared to existing attempts, and our disjunctive branch-and-bound scheme solves nxn rank-r matrix completion problems to certifiable optimality in hours for n<=150 and r<=5., Comment: Updated version with new numerics showcasing relaxation for rank k>1
Published: 2023

23. Finding Neurons in a Haystack: Case Studies with Sparse Probing

Author: Gurnee, Wes, Nanda, Neel, Pauly, Matthew, Harvey, Katherine, Troitskii, Dmitrii, and Bertsimas, Dimitris
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Despite rapid adoption and deployment of large language models (LLMs), the internal computations of these models remain opaque and poorly understood. In this work, we seek to understand how high-level human-interpretable features are represented within the internal neuron activations of LLMs. We train $k$-sparse linear classifiers (probes) on these internal activations to predict the presence of features in the input; by varying the value of $k$ we study the sparsity of learned representations and how this varies with model scale. With $k=1$, we localize individual neurons which are highly relevant for a particular feature, and perform a number of case studies to illustrate general properties of LLMs. In particular, we show that early layers make use of sparse combinations of neurons to represent many features in superposition, that middle layers have seemingly dedicated neurons to represent higher-level contextual features, and that increasing scale causes representational sparsity to increase on average, but there are multiple types of scaling dynamics. In all, we probe for over 100 unique features comprising 10 different categories in 7 different models spanning 70 million to 6.9 billion parameters.
Published: 2023

24. Ensemble Modeling for Time Series Forecasting: an Adaptive Robust Optimization Approach

Author: Bertsimas, Dimitris and Boussioux, Leonard
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Optimization and Control
Abstract: Accurate time series forecasting is critical for a wide range of problems with temporal data. Ensemble modeling is a well-established technique for leveraging multiple predictive models to increase accuracy and robustness, as the performance of a single predictor can be highly variable due to shifts in the underlying data distribution. This paper proposes a new methodology for building robust ensembles of time series forecasting models. Our approach utilizes Adaptive Robust Optimization (ARO) to construct a linear regression ensemble in which the models' weights can adapt over time. We demonstrate the effectiveness of our method through a series of synthetic experiments and real-world applications, including air pollution management, energy consumption forecasting, and tropical cyclone intensity forecasting. Our results show that our adaptive ensembles outperform the best ensemble member in hindsight by 16-26% in root mean square error and 14-28% in conditional value at risk and improve over competitive ensemble techniques.
Published: 2023

25. Reducing Air Pollution through Machine Learning

Author: Bertsimas, Dimitris, Boussioux, Leonard, and Zeng, Cynthia
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: This paper presents a data-driven approach to mitigate the effects of air pollution from industrial plants on nearby cities by linking operational decisions with weather conditions. Our method combines predictive and prescriptive machine learning models to forecast short-term wind speed and direction and recommend operational decisions to reduce or pause the industrial plant's production. We exhibit several trade-offs between reducing environmental impact and maintaining production activities. The predictive component of our framework employs various machine learning models, such as gradient-boosted tree-based models and ensemble methods, for time series forecasting. The prescriptive component utilizes interpretable optimal policy trees to propose multiple trade-offs, such as reducing dangerous emissions by 33-47% and unnecessary costs by 40-63%. Our deployed models significantly reduced forecasting errors, with a range of 38-52% for less than 12-hour lead time and 14-46% for 12 to 48-hour lead time compared to official weather forecasts. We have successfully implemented the predictive component at the OCP Safi site, which is Morocco's largest chemical industrial plant, and are currently in the process of deploying the prescriptive component. Our framework enables sustainable industrial development by eliminating the pollution-industrial activity trade-off through data-driven weather-based operational decisions, significantly enhancing factory optimization and sustainability. This modernizes factory planning and resource allocation while maintaining environmental compliance. The predictive component has boosted production efficiency, leading to cost savings and reduced environmental impact by minimizing air pollution., Comment: Submitted to Manufacturing and Service Operations Management
Published: 2023

26. A Stochastic Benders Decomposition Scheme for Large-Scale Stochastic Network Design

Author: Bertsimas, Dimitris, Cory-Wright, Ryan, Pauphilet, Jean, and Petridis, Periklis
Subjects: Mathematics - Optimization and Control
Abstract: Network design problems involve constructing edges in a transportation or supply chain network to minimize construction and daily operational costs. We study a stochastic version where operational costs are uncertain due to fluctuating demand and estimated as a sample average from historical data. This problem is computationally challenging, and instances with as few as 100 nodes often cannot be solved to optimality using current decomposition techniques. We propose a stochastic variant of Benders decomposition that mitigates the high computational cost of generating each cut by sampling a subset of the data at each iteration and nonetheless generates deterministically valid cuts, rather than the probabilistically valid cuts frequently proposed in the stochastic optimization literature, via a dual averaging technique. We implement both single-cut and multi-cut variants of this Benders decomposition, as well as a variant that uses clustering of the historical scenarios. To our knowledge, this is the first single-tree implementation of Benders decomposition that facilitates sampling. On instances with 100-200 nodes and relatively complete recourse, our algorithm achieves 5-7% optimality gaps, compared with 16-27% for deterministic Benders schemes, and scales to instances with 700 nodes and 50 commodities within hours. Beyond network design, our strategy could be adapted to generic two-stage stochastic mixed-integer optimization problems where second-stage costs are estimated via a sample average.
Published: 2023

27. Multistage Stochastic Optimization via Kernels

Author: Bertsimas, Dimitris and Carballo, Kimberly Villalobos
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning
Abstract: We develop a non-parametric, data-driven, tractable approach for solving multistage stochastic optimization problems in which decisions do not affect the uncertainty. The proposed framework represents the decision variables as elements of a reproducing kernel Hilbert space and performs functional stochastic gradient descent to minimize the empirical regularized loss. By incorporating sparsification techniques based on function subspace projections we are able to overcome the computational complexity that standard kernel methods introduce as the data size increases. We prove that the proposed approach is asymptotically optimal for multistage stochastic optimization with side information. Across various computational experiments on stochastic inventory management problems, {our method performs well in multidimensional settings} and remains tractable when the data size is large. Lastly, by computing lower bounds for the optimal loss of the inventory control problem, we show that the proposed method produces decision rules with near-optimal average performance.
Published: 2023

28. The Benefit of Uncertainty Coupling in Robust and Adaptive Robust Optimization

Author: Bertsimas, Dimitris, Na, Liangyuan, Stellato, Bartolomeo, and Wang, Irina
Subjects: Mathematics - Optimization and Control
Abstract: Despite the modeling power for problems under uncertainty, robust optimization (RO) and adaptive robust optimization (ARO) can exhibit too conservative solutions in terms of objective value degradation compared to the nominal case. One of the main reasons behind this conservatism is that, in many practical applications, uncertain constraints are directly designed as constraint-wise without taking into account couplings over multiple constraints. In this paper, we define a coupled uncertainty set as the intersection between a constraint-wise uncertainty set and a coupling set. We study the benefit of coupling in alleviating conservatism in RO and ARO. We provide theoretical tight and computable upper and lower bounds on the objective value improvement of RO and ARO problems under coupled uncertainty over constraint-wise uncertainty. In addition, we relate the power of adaptability over static solutions with the coupling of uncertainty set. Computational results demonstrate the benefit of coupling in applications., Comment: 58 pages, 13 figures
Published: 2023

29. Global Flood Prediction: a Multimodal Machine Learning Approach

Author: Zeng, Cynthia and Bertsimas, Dimitris
Subjects: Computer Science - Machine Learning, Computer Science - Computers and Society
Abstract: Flooding is one of the most destructive and costly natural disasters, and climate changes would further increase risks globally. This work presents a novel multimodal machine learning approach for multi-year global flood risk prediction, combining geographical information and historical natural disaster dataset. Our multimodal framework employs state-of-the-art processing techniques to extract embeddings from each data modality, including text-based geographical data and tabular-based time-series data. Experiments demonstrate that a multimodal approach, that is combining text and statistical data, outperforms a single-modality approach. Our most advanced architecture, employing embeddings extracted using transfer learning upon DistilBert model, achieves 75\%-77\% ROCAUC score in predicting the next 1-5 year flooding event in historically flooded locations. This work demonstrates the potentials of using machine learning for long-term planning in natural disaster management., Comment: 6 pages
Published: 2023

30. Distributionally Robust Causal Inference with Observational Data

Author: Bertsimas, Dimitris, Imai, Kosuke, and Li, Michael Lingzhi
Subjects: Statistics - Methodology, Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: We consider the estimation of average treatment effects in observational studies and propose a new framework of robust causal inference with unobserved confounders. Our approach is based on distributionally robust optimization and proceeds in two steps. We first specify the maximal degree to which the distribution of unobserved potential outcomes may deviate from that of observed outcomes. We then derive sharp bounds on the average treatment effects under this assumption. Our framework encompasses the popular marginal sensitivity model as a special case, and we demonstrate how the proposed methodology can address a primary challenge of the marginal sensitivity model that it produces uninformative results when unobserved confounders substantially affect treatment and outcome. Specifically, we develop an alternative sensitivity model, called the distributional sensitivity model, under the assumption that heterogeneity of treatment effect due to unobserved variables is relatively small. Unlike the marginal sensitivity model, the distributional sensitivity model allows for potential lack of overlap and often produces informative bounds even when unobserved variables substantially affect both treatment and outcome. Finally, we show how to extend the distributional sensitivity model to difference-in-differences designs and settings with instrumental variables. Through simulation and empirical studies, we demonstrate the applicability of the proposed methodology.
Published: 2022

31. Sparse PCA: a Geometric Approach

Author: Bertsimas, Dimitris and Kitane, Driss Lahlou
Subjects: Mathematics - Optimization and Control, Mathematics - Statistics Theory
Abstract: We consider the problem of maximizing the variance explained from a data matrix using orthogonal sparse principal components that have a support of fixed cardinality. While most existing methods focus on building principal components (PCs) iteratively through deflation, we propose GeoSPCA, a novel algorithm to build all PCs at once while satisfying the orthogonality constraints which brings substantial benefits over deflation. This novel approach is based on the left eigenvalues of the covariance matrix which helps circumvent the non-convexity of the problem by approximating the optimal solution using a binary linear optimization problem that can find the optimal solution. The resulting approximation can be used to tackle different versions of the sparse PCA problem including the case in which the principal components share the same support or have disjoint supports and the Structured Sparse PCA problem. We also propose optimality bounds and illustrate the benefits of GeoSPCA in selected real world problems both in terms of explained variance, sparsity and tractability. Improvements vs. the greedy algorithm, which is often at par with state-of-the-art techniques, reaches up to 24% in terms of variance while solving real world problems with 10,000s of variables and support cardinality of 100s in minutes. We also apply GeoSPCA in a face recognition problem yielding more than 10% improvement vs. other PCA based technique such as structured sparse PCA., Comment: 35 pages, 10 figures
Published: 2022

32. Decarbonizing OCP

Author: Bertsimas, Dimitris, Cory-Wright, Ryan, and Digalakis Jr, Vassilis
Subjects: Mathematics - Optimization and Control
Abstract: We present our collaboration with the OCP Group, one of the world's largest producers of phosphate and phosphate-based products, to reduce OCP's carbon emissions significantly. We study the problem of decarbonizing OCP's electricity supply by installing a mixture of solar panels and batteries to minimize its time-discounted investment cost plus the cost of satisfying its remaining demand via the national grid. OCP is currently designing its renewable investment strategy, using insights gleaned from our optimization model, and has pledged to invest 130 billion MAD (approx. 13 billion USD) in a green initiative by 2027, a subset of which involves decarbonization. We immunize our model against deviations between forecast and realized solar generation output via a combination of robust and distributionally robust optimization. To account for variability in daily solar generation, we propose a data-driven robust optimization approach that prevents excessive conservatism by averaging across uncertainty sets. To protect against variability in seasonal weather patterns induced by climate change, we invoke distributionally robust optimization techniques. Under a 10 billion MAD investment by OCP, the proposed methodology reduces the carbon emissions which arise from OCP's energy needs by more than 70% while generating a net present value (NPV) of 5 billion MAD over twenty years. Moreover, a 20 billion MAD investment induces a 95% reduction in carbon emissions and generates an NPV of around 2 billion MAD. To fulfill the Paris climate agreement, rapidly decarbonizing the global economy in a financially sustainable fashion is imperative. Accordingly, this work develops a robust optimization methodology that enables OCP to decarbonize at a profit by purchasing solar panels and batteries. Moreover, the methodology could be applied to decarbonize other industrial consumers., Comment: Submitted to MSOM on 08/2022
Published: 2022

33. Holistic deep learning

Author: Bertsimas, Dimitris, Villalobos Carballo, Kimberly, Boussioux, Léonard, Li, Michael Lingzhi, Paskov, Alex, and Paskov, Ivan
Published: 2024
Full Text: View/download PDF

34. TabText: A Flexible and Contextual Approach to Tabular Data Representation

Author: Carballo, Kimberly Villalobos, Na, Liangyuan, Ma, Yu, Boussioux, Léonard, Zeng, Cynthia, Soenksen, Luis R., and Bertsimas, Dimitris
Subjects: Computer Science - Machine Learning
Abstract: Tabular data is essential for applying machine learning tasks across various industries. However, traditional data processing methods do not fully utilize all the information available in the tables, ignoring important contextual information such as column header descriptions. In addition, pre-processing data into a tabular format can remain a labor-intensive bottleneck in model development. This work introduces TabText, a processing and feature extraction framework that extracts contextual information from tabular data structures. TabText addresses processing difficulties by converting the content into language and utilizing pre-trained large language models (LLMs). We evaluate our framework on nine healthcare prediction tasks ranging from patient discharge, ICU admission, and mortality. We show that 1) applying our TabText framework enables the generation of high-performing and simple machine learning baseline models with minimal data pre-processing, and 2) augmenting pre-processed tabular data with TabText representations improves the average and worst-case AUC performance of standard machine learning models by as much as 6%.
Published: 2022

35. Learning Sparse Nonlinear Dynamics via Mixed-Integer Optimization

Author: Bertsimas, Dimitris and Gurnee, Wes
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Systems and Control, Mathematics - Optimization and Control
Abstract: Discovering governing equations of complex dynamical systems directly from data is a central problem in scientific machine learning. In recent years, the sparse identification of nonlinear dynamics (SINDy) framework, powered by heuristic sparse regression methods, has become a dominant tool for learning parsimonious models. We propose an exact formulation of the SINDy problem using mixed-integer optimization (MIO) to solve the sparsity constrained regression problem to provable optimality in seconds. On a large number of canonical ordinary and partial differential equations, we illustrate the dramatic improvement of our approach in accurate model discovery while being more sample efficient, robust to noise, and flexible in accommodating physical constraints.
Published: 2022

36. A machine learning approach to two-stage adaptive robust optimization

Author: Bertsimas, Dimitris and Kim, Cheol Woo
Published: 2024
Full Text: View/download PDF

37. Integrated multimodal artificial intelligence framework for healthcare applications

Author: Soenksen, Luis R., Ma, Yu, Zeng, Cynthia, Boussioux, Leonard D. J., Carballo, Kimberly Villalobos, Na, Liangyuan, Wiberg, Holly M., Li, Michael L., Fuentes, Ignacio, and Bertsimas, Dimitris
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: Artificial intelligence (AI) systems hold great promise to improve healthcare over the next decades. Specifically, AI systems leveraging multiple data sources and input modalities are poised to become a viable method to deliver more accurate results and deployable pipelines across a wide range of applications. In this work, we propose and evaluate a unified Holistic AI in Medicine (HAIM) framework to facilitate the generation and testing of AI systems that leverage multimodal inputs. Our approach uses generalizable data pre-processing and machine learning modeling stages that can be readily adapted for research and deployment in healthcare environments. We evaluate our HAIM framework by training and characterizing 14,324 independent models based on HAIM-MIMIC-MM, a multimodal clinical database (N=34,537 samples) containing 7,279 unique hospitalizations and 6,485 patients, spanning all possible input combinations of 4 data modalities (i.e., tabular, time-series, text, and images), 11 unique data sources and 12 predictive tasks. We show that this framework can consistently and robustly produce models that outperform similar single-source approaches across various healthcare demonstrations (by 6-33%), including 10 distinct chest pathology diagnoses, along with length-of-stay and 48-hour mortality predictions. We also quantify the contribution of each modality and data source using Shapley values, which demonstrates the heterogeneity in data modality importance and the necessity of multimodal inputs across different healthcare-relevant tasks. The generalizable properties and flexibility of our Holistic AI in Medicine (HAIM) framework could offer a promising pathway for future multimodal predictive systems in clinical and operational healthcare settings.
Published: 2022
Full Text: View/download PDF

38. Global Optimization via Optimal Decision Trees

Author: Bertsimas, Dimitris and Öztürk, Berk
Subjects: Mathematics - Optimization and Control
Abstract: The global optimization literature places large emphasis on reducing intractable optimization problems into more tractable structured optimization forms. In order to achieve this goal, many existing methods are restricted to optimization over explicit constraints and objectives that use a subset of possible mathematical primitives. These are limiting in real-world contexts where more general explicit and black box constraints appear. Leveraging the dramatic speed improvements in mixed-integer optimization (MIO) and recent research in machine learning, we propose a new method to learn MIO-compatible approximations of global optimization problems using optimal decision trees with hyperplanes (OCT-Hs). This constraint learning approach only requires a bounded variable domain, and can address both explicit and inexplicit constraints. We solve the MIO approximation efficiently to find a near-optimal, near-feasible solution to the global optimization problem. We further improve the solution using a series of projected gradient descent iterations. We test the method on a number of numerical benchmarks from the literature as well as real-world design problems, demonstrating its promise in finding global optima efficiently., Comment: 52 pages, 9 figures, 10 tables. Submitted to Operations Research
Published: 2022

39. A new perspective on low-rank optimization

Author: Bertsimas, Dimitris, Cory-Wright, Ryan, and Pauphilet, Jean
Published: 2023
Full Text: View/download PDF

40. Tensor completion with noisy side information

Author: Bertsimas, Dimitris and Pawlowski, Colin
Published: 2023
Full Text: View/download PDF

41. Robust Upper Bounds for Adversarial Training

Author: Bertsimas, Dimitris, Boix, Xavier, Carballo, Kimberly Villalobos, and Hertog, Dick den
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Many state-of-the-art adversarial training methods for deep learning leverage upper bounds of the adversarial loss to provide security guarantees against adversarial attacks. Yet, these methods rely on convex relaxations to propagate lower and upper bounds for intermediate layers, which affect the tightness of the bound at the output layer. We introduce a new approach to adversarial training by minimizing an upper bound of the adversarial loss that is based on a holistic expansion of the network instead of separate bounds for each layer. This bound is facilitated by state-of-the-art tools from Robust Optimization; it has closed-form and can be effectively trained using backpropagation. We derive two new methods with the proposed approach. The first method (Approximated Robust Upper Bound or aRUB) uses the first order approximation of the network as well as basic tools from Linear Robust Optimization to obtain an empirical upper bound of the adversarial loss that can be easily implemented. The second method (Robust Upper Bound or RUB), computes a provable upper bound of the adversarial loss. Across a variety of tabular and vision data sets we demonstrate the effectiveness of our approach -- RUB is substantially more robust than state-of-the-art methods for larger perturbations, while aRUB matches the performance of state-of-the-art methods for small perturbations.
Published: 2021

42. Mixed-Integer Optimization with Constraint Learning

Author: Maragno, Donato, Wiberg, Holly, Bertsimas, Dimitris, Birbil, S. Ilker, Hertog, Dick den, and Fajemisin, Adejuyigbe
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We establish a broad methodological foundation for mixed-integer optimization with learned constraints. We propose an end-to-end pipeline for data-driven decision making in which constraints and objectives are directly learned from data using machine learning, and the trained models are embedded in an optimization formulation. We exploit the mixed-integer optimization-representability of many machine learning methods, including linear models, decision trees, ensembles, and multi-layer perceptrons, which allows us to capture various underlying relationships between decisions, contextual variables, and outcomes. We also introduce two approaches for handling the inherent uncertainty of learning from data. First, we characterize a decision trust region using the convex hull of the observations, to ensure credible recommendations and avoid extrapolation. We efficiently incorporate this representation using column generation and propose a more flexible formulation to deal with low-density regions and high-dimensional datasets. Then, we propose an ensemble learning approach that enforces constraint satisfaction over multiple bootstrapped estimators or multiple algorithms. In combination with domain-driven components, the embedded models and trust region define a mixed-integer optimization problem for prescription generation. We implement this framework as a Python package (OptiCL) for practitioners. We demonstrate the method in both World Food Programme planning and chemotherapy optimization. The case studies illustrate the framework's ability to generate high-quality prescriptions as well as the value added by the trust region, the use of ensembles to control model robustness, the consideration of multiple machine learning methods, and the inclusion of multiple learned constraints.
Published: 2021

43. Holistic Deep Learning

Author: Bertsimas, Dimitris, Carballo, Kimberly Villalobos, Boussioux, Léonard, Li, Michael Lingzhi, Paskov, Alex, and Paskov, Ivan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: This paper presents a novel holistic deep learning framework that simultaneously addresses the challenges of vulnerability to input perturbations, overparametrization, and performance instability from different train-validation splits. The proposed framework holistically improves accuracy, robustness, sparsity, and stability over standard deep learning models, as demonstrated by extensive experiments on both tabular and image data sets. The results are further validated by ablation experiments and SHAP value analysis, which reveal the interactions and trade-offs between the different evaluation metrics. To support practitioners applying our framework, we provide a prescriptive approach that offers recommendations for selecting an appropriate training loss function based on their specific objectives. All the code to reproduce the results can be found at https://github.com/kimvc7/HDL., Comment: Under review at Machine Learning
Published: 2021

44. Sparse Plus Low Rank Matrix Decomposition: A Discrete Optimization Approach

Author: Bertsimas, Dimitris, Cory-Wright, Ryan, and Johnson, Nicholas A. G.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: We study the Sparse Plus Low-Rank decomposition problem (SLR), which is the problem of decomposing a corrupted data matrix into a sparse matrix of perturbations plus a low-rank matrix containing the ground truth. SLR is a fundamental problem in Operations Research and Machine Learning which arises in various applications, including data compression, latent semantic indexing, collaborative filtering, and medical imaging. We introduce a novel formulation for SLR that directly models its underlying discreteness. For this formulation, we develop an alternating minimization heuristic that computes high-quality solutions and a novel semidefinite relaxation that provides meaningful bounds for the solutions returned by our heuristic. We also develop a custom branch-and-bound algorithm that leverages our heuristic and convex relaxations to solve small instances of SLR to certifiable (near) optimality. Given an input $n$-by-$n$ matrix, our heuristic scales to solve instances where $n=10000$ in minutes, our relaxation scales to instances where $n=200$ in hours, and our branch-and-bound algorithm scales to instances where $n=25$ in minutes. Our numerical results demonstrate that our approach outperforms existing state-of-the-art approaches in terms of rank, sparsity, and mean-square error while maintaining a comparable runtime.
Published: 2021

45. Interpretable artificial intelligence to optimise use of imatinib after resection in patients with localised gastrointestinal stromal tumours: an observational cohort study

Author: Bertsimas, Dimitris, Margonis, Georgios Antonios, Sujichantararat, Suleeporn, Koulouras, Angelos, Ma, Yu, Antonescu, Cristina R, Brennan, Murray F, Martín-Broto, Javier, Tang, Seehanah, Rutkowski, Piotr, Kreis, Martin E, Beyer, Katharina, Wang, Jane, Bylina, Elzbieta, Sobczuk, Pawel, Gutierrez, Antonio, Jadeja, Bhumika, Tap, William D, Chi, Ping, and Singer, Samuel
Published: 2024
Full Text: View/download PDF

46. Congenital Heart Surgery Machine Learning-Derived In-Depth Benchmarking Tool

Author: Sarris, George E., Zhuo, Daisy, Mingardi, Luca, Dunn, Jack, Levine, Jordan, Tobota, Zdzislaw, Maruszewski, Bohdan, Fragata, Jose, and Bertsimas, Dimitris
Published: 2024
Full Text: View/download PDF

47. The Price of Diversity

Author: Bandi, Hari and Bertsimas, Dimitris
Subjects: Computer Science - Computers and Society, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Systemic bias with respect to gender, race and ethnicity, often unconscious, is prevalent in datasets involving choices among individuals. Consequently, society has found it challenging to alleviate bias and achieve diversity in a way that maintains meritocracy in such settings. We propose (a) a novel optimization approach based on optimally flipping outcome labels and training classification models simultaneously to discover changes to be made in the selection process so as to achieve diversity without significantly affecting meritocracy, and (b) a novel implementation tool employing optimal classification trees to provide insights on which attributes of individuals lead to flipping of their labels, and to help make changes in the current selection processes in a manner understandable by human decision makers. We present case studies on three real-world datasets consisting of parole, admissions to the bar and lending decisions, and demonstrate that the price of diversity is low and sometimes negative, that is we can modify our selection processes in a way that enhances diversity without affecting meritocracy significantly, and sometimes improving it.
Published: 2021

48. Algorithmic Insurance

Author: Bertsimas, Dimitris and Orfanoudaki, Agni
Subjects: Computer Science - Machine Learning, Quantitative Finance - Risk Management, Statistics - Machine Learning
Abstract: As machine learning algorithms start to get integrated into the decision-making process of companies and organizations, insurance products are being developed to protect their owners from liability risk. Algorithmic liability differs from human liability since it is based on a single model compared to multiple heterogeneous decision-makers and its performance is known a priori for a given set of data. Traditional actuarial tools for human liability do not take these properties into consideration, primarily focusing on the distribution of historical claims. We propose, for the first time, a quantitative framework to estimate the risk exposure of insurance contracts for machine-driven liability, introducing the concept of algorithmic insurance. Specifically, we present an optimization formulation to estimate the risk exposure of a binary classification model given a pre-defined range of premiums. We adjust the formulation to account for uncertainty in the resulting losses using robust optimization. Our approach outlines how properties of the model, such as accuracy, interpretability, and generalizability, can influence the insurance contract evaluation. To showcase a practical implementation of the proposed framework, we present a case study of medical malpractice in the context of breast cancer detection. Our analysis focuses on measuring the effect of the model parameters on the expected financial loss and identifying the aspects of algorithmic performance that predominantly affect the risk of the contract.
Published: 2021

49. Global optimization via optimal decision trees

Author: Bertsimas, Dimitris and Öztürk, Berk
Published: 2023
Full Text: View/download PDF

50. Robust convex optimization: A new perspective that unifies and extends

Author: Bertsimas, Dimitris, Hertog, Dick den, Pauphilet, Jean, and Zhen, Jianzhe
Published: 2023
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Category

Publication Type

Journal

Region

Database

Publisher

1,679 results on '"Bertsimas, Dimitris"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources