Author: "Cresswell, Jesse C." / Search Limiters: Academic (Peer-Reviewed) Journals - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Cresswell, Jesse C."' showing total 42 results

Start Over Author "Cresswell, Jesse C." Search Limiters Academic (Peer-Reviewed) Journals

42 results on '"Cresswell, Jesse C."'

1. A Geometric Framework for Understanding Memorization in Generative Models

Author: Ross, Brendan Leigh, Kamkari, Hamidreza, Wu, Tongzi, Hosseinzadeh, Rasa, Liu, Zhaoyan, Stein, George, Cresswell, Jesse C., and Loaiza-Ganem, Gabriel
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: As deep generative models have progressed, recent work has shown them to be capable of memorizing and reproducing training datapoints when deployed. These findings call into question the usability of generative models, especially in light of the legal and privacy risks brought about by memorization. To better understand this phenomenon, we propose the manifold memorization hypothesis (MMH), a geometric framework which leverages the manifold hypothesis into a clear language in which to reason about memorization. We propose to analyze memorization in terms of the relationship between the dimensionalities of $(i)$ the ground truth data manifold and $(ii)$ the manifold learned by the model. This framework provides a formal standard for "how memorized" a datapoint is and systematically categorizes memorized data into two types: memorization driven by overfitting and memorization driven by the underlying data distribution. By analyzing prior work in the context of the MMH, we explain and unify assorted observations in the literature. We empirically validate the MMH using synthetic data and image datasets up to the scale of Stable Diffusion, developing new tools for detecting and preventing generation of memorized samples in the process., Comment: 10 pages, 7 figures
Published: 2024

2. CaloChallenge 2022: A Community Challenge for Fast Calorimeter Simulation

Author: Krause, Claudius, Giannelli, Michele Faucci, Kasieczka, Gregor, Nachman, Benjamin, Salamani, Dalila, Shih, David, Zaborowska, Anna, Amram, Oz, Borras, Kerstin, Buckley, Matthew R., Buhmann, Erik, Buss, Thorsten, Cardoso, Renato Paulo Da Costa, Caterini, Anthony L., Chernyavskaya, Nadezda, Corchia, Federico A. G., Cresswell, Jesse C., Diefenbacher, Sascha, Dreyer, Etienne, Ekambaram, Vijay, Eren, Engin, Ernst, Florian, Favaro, Luigi, Franchini, Matteo, Gaede, Frank, Gross, Eilam, Hsu, Shih-Chieh, Jaruskova, Kristina, Käch, Benno, Kalagnanam, Jayant, Kansal, Raghav, Kim, Taewoo, Kobylianskii, Dmitrii, Korol, Anatolii, Korcari, William, Krücker, Dirk, Krüger, Katja, Letizia, Marco, Li, Shu, Liu, Qibin, Liu, Xiulong, Loaiza-Ganem, Gabriel, Madula, Thandikire, McKeown, Peter, Melzer-Pellmann, Isabell-A., Mikuni, Vinicius, Nguyen, Nam, Ore, Ayodele, Schweitzer, Sofia Palacios, Pang, Ian, Pedro, Kevin, Plehn, Tilman, Pokorski, Witold, Qu, Huilin, Raikwar, Piyush, Raine, John A., Reyes-Gonzalez, Humberto, Rinaldi, Lorenzo, Ross, Brendan Leigh, Scham, Moritz A. W., Schnake, Simon, Shimmin, Chase, Shlizerman, Eli, Soybelman, Nathalie, Srivatsa, Mudhakar, Tsolaki, Kalliopi, Vallecorsa, Sofia, Yeo, Kyongmin, and Zhang, Rui
Subjects: Computer Science - Machine Learning, High Energy Physics - Experiment, High Energy Physics - Phenomenology, Physics - Instrumentation and Detectors
Abstract: We present the results of the "Fast Calorimeter Simulation Challenge 2022" - the CaloChallenge. We study state-of-the-art generative models on four calorimeter shower datasets of increasing dimensionality, ranging from a few hundred voxels to a few tens of thousand voxels. The 31 individual submissions span a wide range of current popular generative architectures, including Variational AutoEncoders (VAEs), Generative Adversarial Networks (GANs), Normalizing Flows, Diffusion models, and models based on Conditional Flow Matching. We compare all submissions in terms of quality of generated calorimeter showers, as well as shower generation time and model size. To assess the quality we use a broad range of different metrics including differences in 1-dimensional histograms of observables, KPD/FPD scores, AUCs of binary classifiers, and the log-posterior of a multiclass classifier. The results of the CaloChallenge provide the most complete and comprehensive survey of cutting-edge approaches to calorimeter fast simulation to date. In addition, our work provides a uniquely detailed perspective on the important problem of how to evaluate generative models. As such, the results presented here should be applicable for other domains that use generative AI and require fast and faithful generation of samples in a large phase space., Comment: 204 pages, 100+ figures, 30+ tables
Published: 2024

3. TabDPT: Scaling Tabular Foundation Models

Author: Ma, Junwei, Thomas, Valentin, Hosseinzadeh, Rasa, Kamkari, Hamidreza, Labach, Alex, Cresswell, Jesse C., Golestan, Keyvan, Yu, Guangwei, Volkovs, Maksims, and Caterini, Anthony L.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: The challenges faced by neural networks on tabular data are well-documented and have hampered the progress of tabular foundation models. Techniques leveraging in-context learning (ICL) have shown promise here, allowing for dynamic adaptation to unseen data. ICL can provide predictions for entirely new datasets without further training or hyperparameter tuning, therefore providing very fast inference when encountering a novel task. However, scaling ICL for tabular data remains an issue: approaches based on large language models cannot efficiently process numeric tables, and tabular-specific techniques have not been able to effectively harness the power of real data to improve performance and generalization. We are able to overcome these challenges by training tabular-specific ICL-based architectures on real data with self-supervised learning and retrieval, combining the best of both worlds. Our resulting model -- the Tabular Discriminative Pre-trained Transformer (TabDPT) -- achieves state-of-the-art performance on the CC18 (classification) and CTR23 (regression) benchmarks with no task-specific fine-tuning, demonstrating the adapatability and speed of ICL once the model is pre-trained. TabDPT also demonstrates strong scaling as both model size and amount of available data increase, pointing towards future improvements simply through the curation of larger tabular pre-training datasets and training larger models., Comment: Minimal TabDPT interface to provide predictions on new datasets available at the following link: https://github.com/layer6ai-labs/TabDPT
Published: 2024

4. MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation

Author: Gorti, Satya Krishna, Gofman, Ilan, Liu, Zhaoyan, Wu, Jiapeng, Vouitsis, Noël, Yu, Guangwei, Cresswell, Jesse C., and Hosseinzadeh, Rasa
Subjects: Computer Science - Computation and Language
Abstract: Text-to-SQL generation enables non-experts to interact with databases via natural language. Recent advances rely on large closed-source models like GPT-4 that present challenges in accessibility, privacy, and latency. To address these issues, we focus on developing small, efficient, and open-source text-to-SQL models. We demonstrate the benefits of sampling multiple candidate SQL generations and propose our method, MSc-SQL, to critique them using associated metadata. Our sample critiquing model evaluates multiple outputs simultaneously, achieving state-of-the-art performance compared to other open-source models while remaining competitive with larger models at a much lower cost. Full code can be found at github.com/layer6ai-labs/msc-sql., Comment: 3rd Table Representation Learning Workshop at NeurIPS 2024
Published: 2024

5. Conformal Prediction Sets Can Cause Disparate Impact

Author: Cresswell, Jesse C., Kumar, Bhargava, Sui, Yi, and Belbahri, Mouloud
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Although conformal prediction is a promising method for quantifying the uncertainty of machine learning models, the prediction sets it outputs are not inherently actionable. Many applications require a single output to act on, not several. To overcome this, prediction sets can be provided to a human who then makes an informed decision. In any such system it is crucial to ensure the fairness of outcomes across protected groups, and researchers have proposed that Equalized Coverage be used as the standard for fairness. By conducting experiments with human participants, we demonstrate that providing prediction sets can increase the unfairness of their decisions. Disquietingly, we find that providing sets that satisfy Equalized Coverage actually increases unfairness compared to marginal coverage. Instead of equalizing coverage, we propose to equalize set sizes across groups which empirically leads to more fair outcomes., Comment: Code is available at https://github.com/layer6ai-labs/conformal-prediction-fairness
Published: 2024

6. Scaling Up Diffusion and Flow-based XGBoost Models

Author: Cresswell, Jesse C. and Kim, Taewoo
Subjects: Computer Science - Machine Learning
Abstract: Novel machine learning methods for tabular data generation are often developed on small datasets which do not match the scale required for scientific applications. We investigate a recent proposal to use XGBoost as the function approximator in diffusion and flow-matching models on tabular data, which proved to be extremely memory intensive, even on tiny datasets. In this work, we conduct a critical analysis of the existing implementation from an engineering perspective, and show that these limitations are not fundamental to the method; with better implementation it can be scaled to datasets 370x larger than previously used. Our efficient implementation also unlocks scaling models to much larger sizes which we show directly leads to improved performance on benchmark tasks. We also propose algorithmic improvements that can further benefit resource usage and model performance, including multi-output trees which are well-suited to generative modeling. Finally, we present results on large-scale scientific datasets derived from experimental particle physics as part of the Fast Calorimeter Simulation Challenge. Code is available at https://github.com/layer6ai-labs/calo-forest., Comment: Presented at ICML 2024 Workshop on AI for Science
Published: 2024

7. Benchmarking Robust Self-Supervised Learning Across Diverse Downstream Tasks

Author: Kowalczuk, Antoni, Dubiński, Jan, Ghomi, Atiyeh Ashari, Sui, Yi, Stein, George, Wu, Jiapeng, Cresswell, Jesse C., Boenisch, Franziska, and Dziedzic, Adam
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Large-scale vision models have become integral in many applications due to their unprecedented performance and versatility across downstream tasks. However, the robustness of these foundation models has primarily been explored for a single task, namely image classification. The vulnerability of other common vision tasks, such as semantic segmentation and depth estimation, remains largely unknown. We present a comprehensive empirical evaluation of the adversarial robustness of self-supervised vision encoders across multiple downstream tasks. Our attacks operate in the encoder embedding space and at the downstream task output level. In both cases, current state-of-the-art adversarial fine-tuning techniques tested only for classification significantly degrade clean and robust performance on other tasks. Since the purpose of a foundation model is to cater to multiple applications at once, our findings reveal the need to enhance encoder robustness more broadly. Our code is available at ${github.com/layer6ai-labs/ssl-robustness}$., Comment: Accepted at the ICML 2024 Workshop on Foundation Models in the Wild
Published: 2024

8. A Geometric View of Data Complexity: Efficient Local Intrinsic Dimension Estimation with Diffusion Models

Author: Kamkari, Hamidreza, Ross, Brendan Leigh, Hosseinzadeh, Rasa, Cresswell, Jesse C., and Loaiza-Ganem, Gabriel
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: High-dimensional data commonly lies on low-dimensional submanifolds, and estimating the local intrinsic dimension (LID) of a datum -- i.e. the dimension of the submanifold it belongs to -- is a longstanding problem. LID can be understood as the number of local factors of variation: the more factors of variation a datum has, the more complex it tends to be. Estimating this quantity has proven useful in contexts ranging from generalization in neural networks to detection of out-of-distribution data, adversarial examples, and AI-generated text. The recent successes of deep generative models present an opportunity to leverage them for LID estimation, but current methods based on generative models produce inaccurate estimates, require more than a single pre-trained model, are computationally intensive, or do not exploit the best available deep generative models: diffusion models (DMs). In this work, we show that the Fokker-Planck equation associated with a DM can provide an LID estimator which addresses the aforementioned deficiencies. Our estimator, called FLIPD, is easy to implement and compatible with all popular DMs. Applying FLIPD to synthetic LID estimation benchmarks, we find that DMs implemented as fully-connected networks are highly effective LID estimators that outperform existing baselines. We also apply FLIPD to natural images where the true LID is unknown. Despite being sensitive to the choice of network architecture, FLIPD estimates remain a useful measure of relative complexity; compared to competing estimators, FLIPD exhibits a consistently higher correlation with image PNG compression rate and better aligns with qualitative assessments of complexity. Notably, FLIPD is orders of magnitude faster than other LID estimators, and the first to be tractable at the scale of Stable Diffusion., Comment: NeurIPS 2024 (spotlight)
Published: 2024

9. Deep Generative Models through the Lens of the Manifold Hypothesis: A Survey and New Connections

Author: Loaiza-Ganem, Gabriel, Ross, Brendan Leigh, Hosseinzadeh, Rasa, Caterini, Anthony L., and Cresswell, Jesse C.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: In recent years there has been increased interest in understanding the interplay between deep generative models (DGMs) and the manifold hypothesis. Research in this area focuses on understanding the reasons why commonly-used DGMs succeed or fail at learning distributions supported on unknown low-dimensional manifolds, as well as developing new models explicitly designed to account for manifold-supported data. This manifold lens provides both clarity as to why some DGMs (e.g. diffusion models and some generative adversarial networks) empirically surpass others (e.g. likelihood-based models such as variational autoencoders, normalizing flows, or energy-based models) at sample generation, and guidance for devising more performant DGMs. We carry out the first survey of DGMs viewed through this lens, making two novel contributions along the way. First, we formally establish that numerical instability of likelihoods in high ambient dimensions is unavoidable when modelling data with low intrinsic dimension. We then show that DGMs on learned representations of autoencoders can be interpreted as approximately minimizing Wasserstein distance: this result, which applies to latent diffusion models, helps justify their outstanding empirical results. The manifold lens provides a rich perspective from which to understand DGMs, and we aim to make this perspective more accessible and widespread., Comment: TMLR 2024 (survey certification, expert certification)
Published: 2024

10. A Geometric Explanation of the Likelihood OOD Detection Paradox

Author: Kamkari, Hamidreza, Ross, Brendan Leigh, Cresswell, Jesse C., Caterini, Anthony L., Krishnan, Rahul G., and Loaiza-Ganem, Gabriel
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: Likelihood-based deep generative models (DGMs) commonly exhibit a puzzling behaviour: when trained on a relatively complex dataset, they assign higher likelihood values to out-of-distribution (OOD) data from simpler sources. Adding to the mystery, OOD samples are never generated by these DGMs despite having higher likelihoods. This two-pronged paradox has yet to be conclusively explained, making likelihood-based OOD detection unreliable. Our primary observation is that high-likelihood regions will not be generated if they contain minimal probability mass. We demonstrate how this seeming contradiction of large densities yet low probability mass can occur around data confined to low-dimensional manifolds. We also show that this scenario can be identified through local intrinsic dimension (LID) estimation, and propose a method for OOD detection which pairs the likelihoods and LID estimates obtained from a pre-trained DGM. Our method can be applied to normalizing flows and score-based diffusion models, and obtains results which match or surpass state-of-the-art OOD detection benchmarks using the same DGM backbones. Our code is available at https://github.com/layer6ai-labs/dgm_ood_detection., Comment: ICML 2024
Published: 2024

11. Conformal Prediction Sets Improve Human Decision Making

Author: Cresswell, Jesse C., Sui, Yi, Kumar, Bhargava, and Vouitsis, Noël
Subjects: Computer Science - Machine Learning, Computer Science - Human-Computer Interaction, Statistics - Machine Learning
Abstract: In response to everyday queries, humans explicitly signal uncertainty and offer alternative answers when they are unsure. Machine learning models that output calibrated prediction sets through conformal prediction mimic this human behaviour; larger sets signal greater uncertainty while providing alternatives. In this work, we study the usefulness of conformal prediction sets as an aid for human decision making by conducting a pre-registered randomized controlled trial with conformal prediction sets provided to human subjects. With statistical significance, we find that when humans are given conformal prediction sets their accuracy on tasks improves compared to fixed-size prediction sets with the same coverage guarantee. The results show that quantifying model uncertainty with conformal prediction is helpful for human-in-the-loop decision making and human-AI teams., Comment: Published at ICML 2024. Code available at https://github.com/layer6ai-labs/hitl-conformal-prediction
Published: 2024

12. Data-Efficient Multimodal Fusion on a Single GPU

Author: Vouitsis, Noël, Liu, Zhaoyan, Gorti, Satya Krishna, Villecroze, Valentin, Cresswell, Jesse C., Yu, Guangwei, Loaiza-Ganem, Gabriel, and Volkovs, Maksims
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: The goal of multimodal alignment is to learn a single latent space that is shared between multimodal inputs. The most powerful models in this space have been trained using massive datasets of paired inputs and large-scale computational resources, making them prohibitively expensive to train in many practical scenarios. We surmise that existing unimodal encoders pre-trained on large amounts of unimodal data should provide an effective bootstrap to create multimodal models from unimodal ones at much lower costs. We therefore propose FuseMix, a multimodal augmentation scheme that operates on the latent spaces of arbitrary pre-trained unimodal encoders. Using FuseMix for multimodal alignment, we achieve competitive performance -- and in certain cases outperform state-of-the art methods -- in both image-text and audio-text retrieval, with orders of magnitude less compute and data: for example, we outperform CLIP on the Flickr30K text-to-image retrieval task with $\sim \! 600\times$ fewer GPU days and $\sim \! 80\times$ fewer image-text pairs. Additionally, we show how our method can be applied to convert pre-trained text-to-image generative models into audio-to-image ones. Code is available at: https://github.com/layer6ai-labs/fusemix., Comment: CVPR 2024 (Highlight)
Published: 2023

13. Self-supervised Representation Learning From Random Data Projectors

Author: Sui, Yi, Wu, Tongzi, Cresswell, Jesse C., Wu, Ga, Stein, George, Huang, Xiao Shi, Zhang, Xiaochen, and Volkovs, Maksims
Subjects: Computer Science - Machine Learning
Abstract: Self-supervised representation learning~(SSRL) has advanced considerably by exploiting the transformation invariance assumption under artificially designed data augmentations. While augmentation-based SSRL algorithms push the boundaries of performance in computer vision and natural language processing, they are often not directly applicable to other data modalities, and can conflict with application-specific data augmentation constraints. This paper presents an SSRL approach that can be applied to any data modality and network architecture because it does not rely on augmentations or masking. Specifically, we show that high-quality data representations can be learned by reconstructing random data projections. We evaluate the proposed approach on a wide range of representation learning tasks that span diverse modalities and real-world applications. We show that it outperforms multiple state-of-the-art SSRL baselines. Due to its wide applicability and strong empirical results, we argue that learning from randomness is a fruitful research direction worthy of attention and further study., Comment: Published as a conference paper of ICLR 2024. https://openreview.net/pdf?id=EpYnZpDpsQ
Published: 2023

14. Augment then Smooth: Reconciling Differential Privacy with Certified Robustness

Author: Wu, Jiapeng, Ghomi, Atiyeh Ashari, Glukhov, David, Cresswell, Jesse C., Boenisch, Franziska, and Papernot, Nicolas
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Machine learning models are susceptible to a variety of attacks that can erode trust, including attacks against the privacy of training data, and adversarial examples that jeopardize model accuracy. Differential privacy and certified robustness are effective frameworks for combating these two threats respectively, as they each provide future-proof guarantees. However, we show that standard differentially private model training is insufficient for providing strong certified robustness guarantees. Indeed, combining differential privacy and certified robustness in a single system is non-trivial, leading previous works to introduce complex training schemes that lack flexibility. In this work, we present DP-CERT, a simple and effective method that achieves both privacy and robustness guarantees simultaneously by integrating randomized smoothing into standard differentially private model training. Compared to the leading prior work, DP-CERT gives up to a 2.5% increase in certified accuracy for the same differential privacy guarantee on CIFAR10. Through in-depth persample metric analysis, we find that larger certifiable radii correlate with smaller local Lipschitz constants, and show that DP-CERT effectively reduces Lipschitz constants compared to other differentially private training methods. The code is available at github.com/layer6ailabs/dp-cert., Comment: 29 pages, 19 figures. Accepted at TMLR in 2024. Link: https://openreview.net/pdf?id=YN0IcnXqsr
Published: 2023

15. Exposing flaws of generative model evaluation metrics and their unfair treatment of diffusion models

Author: Stein, George, Cresswell, Jesse C., Hosseinzadeh, Rasa, Sui, Yi, Ross, Brendan Leigh, Villecroze, Valentin, Liu, Zhaoyan, Caterini, Anthony L., Taylor, J. Eric T., and Loaiza-Ganem, Gabriel
Subjects: Computer Science - Machine Learning, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning
Abstract: We systematically study a wide variety of generative models spanning semantically-diverse image datasets to understand and improve the feature extractors and metrics used to evaluate them. Using best practices in psychophysics, we measure human perception of image realism for generated samples by conducting the largest experiment evaluating generative models to date, and find that no existing metric strongly correlates with human evaluations. Comparing to 17 modern metrics for evaluating the overall performance, fidelity, diversity, rarity, and memorization of generative models, we find that the state-of-the-art perceptual realism of diffusion models as judged by humans is not reflected in commonly reported metrics such as FID. This discrepancy is not explained by diversity in generated samples, though one cause is over-reliance on Inception-V3. We address these flaws through a study of alternative self-supervised feature extractors, find that the semantic information encoded by individual networks strongly depends on their training procedure, and show that DINOv2-ViT-L/14 allows for much richer evaluation of generative models. Next, we investigate data memorization, and find that generative models do memorize training examples on simple, smaller datasets like CIFAR10, but not necessarily on more complex datasets like ImageNet. However, our experiments show that current metrics do not properly detect memorization: none in the literature is able to separate memorization from other phenomena such as underfitting or mode shrinkage. To facilitate further development of generative models and their evaluation we release all generated image datasets, human evaluation data, and a modular library to compute 17 common metrics for 9 different encoders at https://github.com/layer6ai-labs/dgm-eval., Comment: NeurIPS 2023. 53 pages, 29 figures, 12 tables. Code at https://github.com/layer6ai-labs/dgm-eval, reviews at https://openreview.net/forum?id=08zf7kTOoh
Published: 2023

16. Denoising Deep Generative Models

Author: Loaiza-Ganem, Gabriel, Ross, Brendan Leigh, Wu, Luhuan, Cunningham, John P., Cresswell, Jesse C., and Caterini, Anthony L.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence
Abstract: Likelihood-based deep generative models have recently been shown to exhibit pathological behaviour under the manifold hypothesis as a consequence of using high-dimensional densities to model data with low-dimensional structure. In this paper we propose two methodologies aimed at addressing this problem. Both are based on adding Gaussian noise to the data to remove the dimensionality mismatch during training, and both provide a denoising mechanism whose goal is to sample from the model as though no noise had been added to the data. Our first approach is based on Tweedie's formula, and the second on models which take the variance of added noise as a conditional input. We show that surprisingly, while well motivated, these approaches only sporadically improve performance over not adding noise, and that other methods of addressing the dimensionality mismatch are more empirically adequate., Comment: NeurIPS 2022 ICBINB workshop (spotlight)
Published: 2022

17. CaloMan: Fast generation of calorimeter showers with density estimation on learned manifolds

Author: Cresswell, Jesse C., Ross, Brendan Leigh, Loaiza-Ganem, Gabriel, Reyes-Gonzalez, Humberto, Letizia, Marco, and Caterini, Anthony L.
Subjects: High Energy Physics - Phenomenology, Computer Science - Machine Learning, High Energy Physics - Experiment, Physics - Data Analysis, Statistics and Probability, Physics - Instrumentation and Detectors
Abstract: Precision measurements and new physics searches at the Large Hadron Collider require efficient simulations of particle propagation and interactions within the detectors. The most computationally expensive simulations involve calorimeter showers. Advances in deep generative modelling - particularly in the realm of high-dimensional data - have opened the possibility of generating realistic calorimeter showers orders of magnitude more quickly than physics-based simulation. However, the high-dimensional representation of showers belies the relative simplicity and structure of the underlying physical laws. This phenomenon is yet another example of the manifold hypothesis from machine learning, which states that high-dimensional data is supported on low-dimensional manifolds. We thus propose modelling calorimeter showers first by learning their manifold structure, and then estimating the density of data across this manifold. Learning manifold structure reduces the dimensionality of the data, which enables fast training and generation when compared with competing methods., Comment: Accepted to the Machine Learning and the Physical Sciences Workshop at NeurIPS 2022
Published: 2022

18. Find Your Friends: Personalized Federated Learning with the Right Collaborators

Author: Sui, Yi, Wen, Junfeng, Lau, Yenson, Ross, Brendan Leigh, and Cresswell, Jesse C.
Subjects: Computer Science - Machine Learning
Abstract: In the traditional federated learning setting, a central server coordinates a network of clients to train one global model. However, the global model may serve many clients poorly due to data heterogeneity. Moreover, there may not exist a trusted central party that can coordinate the clients to ensure that each of them can benefit from others. To address these concerns, we present a novel decentralized framework, FedeRiCo, where each client can learn as much or as little from other clients as is optimal for its local data distribution. Based on expectation-maximization, FedeRiCo estimates the utilities of other participants' models on each client's data so that everyone can select the right collaborators for learning. As a result, our algorithm outperforms other federated, personalized, and/or decentralized approaches on several benchmark datasets, being the only approach that consistently performs better than training with local data only.
Published: 2022

19. Verifying the Union of Manifolds Hypothesis for Image Data

Author: Brown, Bradley C. A., Caterini, Anthony L., Ross, Brendan Leigh, Cresswell, Jesse C., and Loaiza-Ganem, Gabriel
Subjects: Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Deep learning has had tremendous success at learning low-dimensional representations of high-dimensional data. This success would be impossible if there was no hidden low-dimensional structure in data of interest; this existence is posited by the manifold hypothesis, which states that the data lies on an unknown manifold of low intrinsic dimension. In this paper, we argue that this hypothesis does not properly capture the low-dimensional structure typically present in image data. Assuming that data lies on a single manifold implies intrinsic dimension is identical across the entire data space, and does not allow for subregions of this space to have a different number of factors of variation. To address this deficiency, we consider the union of manifolds hypothesis, which states that data lies on a disjoint union of manifolds of varying intrinsic dimensions. We empirically verify this hypothesis on commonly-used image datasets, finding that indeed, observed data lies on a disconnected set and that intrinsic dimension is not constant. We also provide insights into the implications of the union of manifolds hypothesis in deep learning, both supervised and unsupervised, showing that designing models with an inductive bias for this structure improves performance across classification and generative modelling tasks. Our code is available at https://github.com/layer6ai-labs/UoMH., Comment: ICLR 2023
Published: 2022

20. Neural Implicit Manifold Learning for Topology-Aware Density Estimation

Author: Ross, Brendan Leigh, Loaiza-Ganem, Gabriel, Caterini, Anthony L., and Cresswell, Jesse C.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Natural data observed in $\mathbb{R}^n$ is often constrained to an $m$-dimensional manifold $\mathcal{M}$, where $m < n$. This work focuses on the task of building theoretically principled generative models for such data. Current generative models learn $\mathcal{M}$ by mapping an $m$-dimensional latent variable through a neural network $f_\theta: \mathbb{R}^m \to \mathbb{R}^n$. These procedures, which we call pushforward models, incur a straightforward limitation: manifolds cannot in general be represented with a single parameterization, meaning that attempts to do so will incur either computational instability or the inability to learn probability densities within the manifold. To remedy this problem, we propose to model $\mathcal{M}$ as a neural implicit manifold: the set of zeros of a neural network. We then learn the probability density within $\mathcal{M}$ with a constrained energy-based model, which employs a constrained variant of Langevin dynamics to train and sample from the learned manifold. In experiments on synthetic and natural data, we show that our model can learn manifold-supported distributions with complex topologies more accurately than pushforward models., Comment: Accepted to TMLR in 2023. Code: https://github.com/layer6ai-labs/implicit-manifolds
Published: 2022

21. Disparate Impact in Differential Privacy from Gradient Misalignment

Author: Esipova, Maria S., Ghomi, Atiyeh Ashari, Luo, Yaqiao, and Cresswell, Jesse C.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security
Abstract: As machine learning becomes more widespread throughout society, aspects including data privacy and fairness must be carefully considered, and are crucial for deployment in highly regulated industries. Unfortunately, the application of privacy enhancing technologies can worsen unfair tendencies in models. In particular, one of the most widely used techniques for private model training, differentially private stochastic gradient descent (DPSGD), frequently intensifies disparate impact on groups within data. In this work we study the fine-grained causes of unfairness in DPSGD and identify gradient misalignment due to inequitable gradient clipping as the most significant source. This observation leads us to a new method for reducing unfairness by preventing gradient misalignment in DPSGD., Comment: ICLR 2023 notable top 25%, https://openreview.net/forum?id=qLOaeRvteqbx. Our code is available at https://github.com/layer6ai-labs/fair-dp
Published: 2022

22. Diagnosing and Fixing Manifold Overfitting in Deep Generative Models

Author: Loaiza-Ganem, Gabriel, Ross, Brendan Leigh, Cresswell, Jesse C., and Caterini, Anthony L.
Subjects: Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Statistics - Methodology
Abstract: Likelihood-based, or explicit, deep generative models use neural networks to construct flexible high-dimensional densities. This formulation directly contradicts the manifold hypothesis, which states that observed data lies on a low-dimensional manifold embedded in high-dimensional ambient space. In this paper we investigate the pathologies of maximum-likelihood training in the presence of this dimensionality mismatch. We formally prove that degenerate optima are achieved wherein the manifold itself is learned but not the distribution on it, a phenomenon we call manifold overfitting. We propose a class of two-step procedures consisting of a dimensionality reduction step followed by maximum-likelihood density estimation, and prove that they recover the data-generating distribution in the nonparametric regime, thus avoiding manifold overfitting. We also show that these procedures enable density estimation on the manifolds learned by implicit models, such as generative adversarial networks, hence addressing a major shortcoming of these models. Several recently proposed methods are instances of our two-step procedures; we thus unify, extend, and theoretically justify a large class of models., Comment: Accepted for publication in TMLR
Published: 2022

23. Decentralized Federated Learning through Proxy Model Sharing

Author: Kalra, Shivam, Wen, Junfeng, Cresswell, Jesse C., Volkovs, Maksims, and Tizhoosh, Hamid R.
Subjects: Computer Science - Machine Learning
Abstract: Institutions in highly regulated domains such as finance and healthcare often have restrictive rules around data sharing. Federated learning is a distributed learning framework that enables multi-institutional collaborations on decentralized data with improved protection for each collaborator's data privacy. In this paper, we propose a communication-efficient scheme for decentralized federated learning called ProxyFL, or proxy-based federated learning. Each participant in ProxyFL maintains two models, a private model, and a publicly shared proxy model designed to protect the participant's privacy. Proxy models allow efficient information exchange among participants without the need of a centralized server. The proposed method eliminates a significant limitation of canonical federated learning by allowing model heterogeneity; each participant can have a private model with any architecture. Furthermore, our protocol for communication by proxy leads to stronger privacy guarantees using differential privacy analysis. Experiments on popular image datasets, and a cancer diagnostic problem using high-quality gigapixel histology whole slide images, show that ProxyFL can outperform existing alternatives with much less communication overhead and stronger privacy.
Published: 2021
Full Text: View/download PDF

24. Decentralized federated learning through proxy model sharing

Author: Kalra, Shivam, Wen, Junfeng, Cresswell, Jesse C., Volkovs, Maksims, and Tizhoosh, H. R.
Published: 2023
Full Text: View/download PDF

25. Tractable Density Estimation on Learned Manifolds with Conformal Embedding Flows

Author: Ross, Brendan Leigh and Cresswell, Jesse C.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: Normalizing flows are generative models that provide tractable density estimation via an invertible transformation from a simple base distribution to a complex target distribution. However, this technique cannot directly model data supported on an unknown low-dimensional manifold, a common occurrence in real-world domains such as image data. Recent attempts to remedy this limitation have introduced geometric complications that defeat a central benefit of normalizing flows: exact density estimation. We recover this benefit with Conformal Embedding Flows, a framework for designing flows that learn manifolds with tractable densities. We argue that composing a standard flow with a trainable conformal embedding is the most natural way to model manifold-supported data. To this end, we present a series of conformal building blocks and apply them in experiments with synthetic and real-world data to demonstrate that flows can model manifold-supported distributions without sacrificing tractable likelihoods., Comment: NeurIPS 2021 Camera-Ready. Code: https://github.com/layer6ai-labs/CEF
Published: 2021

26. C-Learning: Horizon-Aware Cumulative Accessibility Estimation

Author: Naderian, Panteha, Loaiza-Ganem, Gabriel, Braviner, Harry J., Caterini, Anthony L., Cresswell, Jesse C., Li, Tong, and Garg, Animesh
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Multi-goal reaching is an important problem in reinforcement learning needed to achieve algorithmic generalization. Despite recent advances in this field, current algorithms suffer from three major challenges: high sample complexity, learning only a single way of reaching the goals, and difficulties in solving complex motion planning tasks. In order to address these limitations, we introduce the concept of cumulative accessibility functions, which measure the reachability of a goal from a given state within a specified horizon. We show that these functions obey a recurrence relation, which enables learning from offline interactions. We also prove that optimal cumulative accessibility functions are monotonic in the planning horizon. Additionally, our method can trade off speed and reliability in goal-reaching by suggesting multiple paths to a single goal depending on the provided horizon. We evaluate our approach on a set of multi-goal discrete and continuous control tasks. We show that our method outperforms state-of-the-art goal-reaching algorithms in success rate, sample complexity, and path optimality. Our code is available at https://github.com/layer6ai-labs/CAE, and additional visualizations can be found at https://sites.google.com/view/learning-cae/., Comment: Accepted at ICLR 2021
Published: 2020

27. Quantum Information Approaches to Quantum Gravity

Author: Cresswell, Jesse C.
Subjects: High Energy Physics - Theory, Quantum Physics
Abstract: In this thesis we apply techniques from quantum information theory to study quantum gravity within the framework of the anti-de Sitter / conformal field theory correspondence (AdS/CFT). Through AdS/CFT, progress has been made in understanding the structure of entanglement in quantum field theories, and in how gravitational physics can emerge from these structures. However, this understanding is far from complete and will require the development of new tools to quantify correlations in CFT. This thesis presents refinements of a duality between operator product expansion (OPE) blocks in the CFT, giving the contribution of a conformal family to the OPE, and geodesic integrated fields in AdS which are diffeomorphism invariant quantities. This duality was originally discovered in the maximally symmetric setting of pure AdS dual to the CFT ground state. In less symmetric states the duality must be modified. Working with excited states within AdS$_3$/CFT$_2$, this thesis shows how the OPE block decomposes into more fine-grained CFT observables that are dual to AdS fields integrated over non-minimal geodesics. Additionally, this thesis contains results on the dynamics of entanglement measures for general quantum systems. Results are presented for the family of quantum R\'enyi entropies and entanglement negativity. R\'enyi entropies are studied for general dynamics by imposing special initial conditions. Around pure, separable initial states, all R\'enyi entropies grow with the same timescale at leading, and next-to-leading order. Mathematical tools are developed for the differentiation of non-analytic matrix functions with respect to constrained arguments and are used to construct analytic expressions for derivatives of negativity. We establish bounds on the rate of change of state distinguishability and the rate of entanglement growth for closed systems. Note: Abstract shortened., Comment: PhD Thesis, contains previously published papers and additional work, 135 pages
Published: 2019

28. Operational symmetries of entangled states

Author: Tzitrin, Ilan, Goldberg, Aaron Z., and Cresswell, Jesse C.
Subjects: Quantum Physics, High Energy Physics - Theory
Abstract: Quantum entanglement obscures the notion of local operations; there exist quantum states for which all local actions on one subsystem can be equivalently realized by actions on another. We characterize the states for which this fundamental property of entanglement does and does not hold, including multipartite and mixed states. Our results lead to a method for quantifying entanglement based on operational symmetries and has connections to quantum steering, envariance, the Reeh-Schlieder theorem, and classical entanglement., Comment: 8 pages including 2 appendices and 2 figures; comments welcome
Published: 2019
Full Text: View/download PDF

29. Holographic relations for OPE blocks in excited states

Author: Cresswell, Jesse C., Jardine, Ian T., and Peet, Amanda W.
Subjects: High Energy Physics - Theory
Abstract: We study the holographic duality between boundary OPE blocks and geodesic integrated bulk fields in quotients of AdS$_3$ dual to excited CFT states. The quotient geometries exhibit non-minimal geodesics between pairs of spacelike separated boundary points which modify the OPE block duality. We decompose OPE blocks into quotient invariant operators and propose a duality with bulk fields integrated over individual geodesics, minimal or non-minimal. We provide evidence for this relationship by studying the monodromy of asymptotic maps that implement the quotients., Comment: 22 pages. As published in JHEP
Published: 2018
Full Text: View/download PDF

30. Perturbative expansion of entanglement negativity using patterned matrix calculus

Author: Cresswell, Jesse C., Tzitrin, Ilan, and Goldberg, Aaron Z.
Subjects: Quantum Physics, Condensed Matter - Statistical Mechanics, High Energy Physics - Theory
Abstract: Negativity is an entanglement monotone frequently used to quantify entanglement in bipartite states. Because negativity is a non-analytic function of a density matrix, existing methods used in the physics literature are insufficient to compute its derivatives. To this end we develop techniques in the calculus of complex, patterned matrices and use them to conduct a perturbative analysis of negativity in terms of arbitrary variations of the density operator. The result is an easy-to-implement expansion that can be carried out to all orders. On the way we provide convenient representations of the partial transposition map appearing in the definition of negativity. Our methods are well-suited to study the growth and decay of entanglement in a wide range of physical systems, including the generic linear growth of entanglement in many-body systems, and have broad relevance to many functions of quantum states and observables., Comment: 10 pages, 3 figures; as published in PRA
Published: 2018
Full Text: View/download PDF

31. Federated learning and differential privacy for medical image analysis

Author: Adnan, Mohammed, Kalra, Shivam, Cresswell, Jesse C., Taylor, Graham W., and Tizhoosh, Hamid R.
Published: 2022
Full Text: View/download PDF

32. Universal entanglement timescale for R\'enyi entropies

Author: Cresswell, Jesse C.
Subjects: Quantum Physics, High Energy Physics - Theory
Abstract: Recently it was shown that the growth of entanglement in an initially separable state, as measured by the purity of subsystems, can be characterized by a timescale that takes a universal form for any Hamiltonian. We show that the same timescale governs the growth of entanglement for all R\'enyi entropies. Since the family of R\'enyi entropies completely characterizes the entanglement of a pure bipartite state, our timescale is a universal feature of bipartite entanglement. The timescale depends only on the interaction Hamiltonian and the initial state., Comment: 6 pages, 6 figures. As published in PRA
Published: 2017
Full Text: View/download PDF

33. Kinematic space for conical defects

Author: Cresswell, Jesse C. and Peet, Amanda W.
Subjects: High Energy Physics - Theory
Abstract: Kinematic space can be used as an intermediate step in the AdS/CFT dictionary and lends itself naturally to the description of diffeomorphism invariant quantities. From the bulk it has been defined as the space of boundary anchored geodesics, and from the boundary as the space of pairs of CFT points. When the bulk is not globally AdS$_3$ the appearance of non-minimal geodesics leads to ambiguities in these definitions. In this work conical defect spacetimes are considered as an example where non-minimal geodesics are common. From the bulk it is found that the conical defect kinematic space can be obtained from the AdS$_3$ kinematic space by the same quotient under which one obtains the defect from AdS$_3$. The resulting kinematic space is one of many equivalent fundamental regions. From the boundary the conical defect kinematic space can be determined by breaking up OPE blocks into contributions from individual bulk geodesics. A duality is established between partial OPE blocks and bulk fields integrated over individual geodesics, minimal or non-minimal., Comment: 29 pages, 9 figures. As published in JHEP
Published: 2017
Full Text: View/download PDF

34. Lorenz gauge quantization in conformally flat spacetimes

Author: Cresswell, Jesse C. and Vollick, Dan N.
Subjects: General Relativity and Quantum Cosmology
Abstract: Recently it was shown that Dirac's method of quantizing constrained dynamical systems can be used to impose the Lorenz gauge condition in a four-dimensional cosmological spacetime. In this paper we use Dirac's method to impose the Lorenz gauge condition in a general four-dimensional conformally flat spacetime and find that there is no particle production. We show that in cosmological spacetimes with dimension $D\neq 4$ there will be particle production when the scale factor changes, and we calculate the particle production due to a sudden change., Comment: 8 pages
Published: 2015
Full Text: View/download PDF

35. Holographic relations for OPE blocks in excited states

Author: Cresswell, Jesse C., Jardine, Ian T., and Peet, Amanda W.
Published: 2019
Full Text: View/download PDF

36. Federated Learning and Differential Privacy for Medical Image Analysis

Author: Adnan, Mohammed, primary, Kalra, Shivam, additional, Cresswell, Jesse C., additional, Taylor, Graham W., additional, and Tizhoosh, Hamid, additional
Published: 2021
Full Text: View/download PDF

37. Operational symmetries of entangled states

Author: Tzitrin, Ilan, primary, Goldberg, Aaron Z, additional, and Cresswell, Jesse C, additional
Published: 2020
Full Text: View/download PDF

38. Perturbative expansion of entanglement negativity using patterned matrix calculus

Author: Cresswell, Jesse C., primary, Tzitrin, Ilan, additional, and Goldberg, Aaron Z., additional
Published: 2019
Full Text: View/download PDF

39. Entanglement as an operational symmetry

Author: Goldberg, Aaron Z., primary, Cresswell, Jesse C., additional, and Tzitrin, Ilan, additional
Published: 2019
Full Text: View/download PDF

40. Universal entanglement timescale for Rényi entropies

Author: Cresswell, Jesse C., primary
Published: 2018
Full Text: View/download PDF

41. Kinematic space for conical defects

Author: Cresswell, Jesse C., primary and Peet, Amanda W., additional
Published: 2017
Full Text: View/download PDF

42. Lorenz gauge quantization in conformally flat spacetimes

Author: Cresswell, Jesse C., primary and Vollick, Dan N., additional
Published: 2015
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

42 results on '"Cresswell, Jesse C."'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources