14 results on '"Romann M"'
Search Results
2. The Score-Difference Flow for Implicit Generative Modeling
- Author
- Weber, Romann M.
- Subjects
FOS: Computer and information sciences; Computer Science - Machine Learning; Statistics - Machine Learning; Machine Learning (stat.ML); Machine Learning (cs.LG)
- Abstract
Implicit generative modeling (IGM) aims to produce samples of synthetic data matching the characteristics of a target data distribution. Recent work (e.g. score-matching networks, diffusion models) has approached the IGM problem from the perspective of pushing synthetic source data toward the target distribution via dynamical perturbations or flows in the ambient space. In this direction, we present the score difference (SD) between arbitrary target and source distributions as a flow that optimally reduces the Kullback-Leibler divergence between them while also solving the Schroedinger bridge problem. We apply the SD flow to convenient proxy distributions, which are aligned if and only if the original distributions are aligned. We demonstrate the formal equivalence of this formulation to denoising diffusion models under certain conditions. We also show that the training of generative adversarial networks includes a hidden data-optimization sub-problem, which induces the SD flow under certain choices of loss function when the discriminator is optimal. As a result, the SD flow provides a theoretical link between model classes that individually address the three challenges of the "generative modeling trilemma" -- high sample quality, mode coverage, and fast sampling -- thereby setting the stage for a unified approach.
Comment: 25 pages, 5 figures, 4 tables. To appear in Transactions on Machine Learning Research (TMLR)
- Published
- 2023
3. Controllable Inversion of Black-Box Face-Recognition Models via Diffusion
- Author
- Kansy, Manuel; Raël, Anton; Mignone, Graziana; Naruniec, Jacek; Schroers, Christopher; Gross, Markus; and Weber, Romann M.
- Subjects
I.2; FOS: Computer and information sciences; Computer Science - Machine Learning; Computer Science - Graphics; I.3.3; I.4; Computer Vision and Pattern Recognition (cs.CV); Computer Science - Computer Vision and Pattern Recognition; Graphics (cs.GR); Machine Learning (cs.LG)
- Abstract
Face recognition models embed a face image into a low-dimensional identity vector containing abstract encodings of identity-specific facial features that allow individuals to be distinguished from one another. We tackle the challenging task of inverting the latent space of pre-trained face recognition models without full model access (i.e., the black-box setting). A variety of methods have been proposed in the literature for this task, but they have serious shortcomings such as a lack of realistic outputs, long inference times, and strong requirements for the data set and accessibility of the face recognition model. Through an analysis of the black-box inversion problem, we show that the conditional diffusion model loss naturally emerges and that we can effectively sample from the inverse distribution even without an identity-specific loss. Our method, named identity denoising diffusion probabilistic model (ID3PM), leverages the stochastic nature of the denoising diffusion process to produce high-quality, identity-preserving face images with various backgrounds, lighting, poses, and expressions. We demonstrate state-of-the-art performance in terms of identity preservation and diversity both qualitatively and quantitatively. Our method is the first black-box face recognition model inversion method that offers intuitive control over the generation process and does not suffer from any of the common shortcomings of competing methods.
Comment: 34 pages. Preprint. Under review
- Published
- 2023
4. High‐Resolution Neural Face Swapping for Visual Effects
- Author
- Jacek Naruniec, Romann M. Weber, Christopher Schroers, and Leonhard Helminger
- Subjects
Artificial neural network; Image manipulation; Computer science; business.industry; Face (geometry); Unsupervised learning; High resolution; Pattern recognition; Artificial intelligence; business; Computer Graphics and Computer-Aided Design; Computing Methodologies
- Published
- 2020
5. Evolution of Primate Color Vision
- Author
- Romann M. Weber and Mark Changizi
- Published
- 2020
6. Spectrogram Feature Losses for Music Source Separation
- Author
- Romann M. Weber, Brian McWilliams, and Abhimanyu Sahai
- Subjects
FOS: Computer and information sciences; Computer Science - Machine Learning; Range (music); Sound (cs.SD); Computer science; Speech recognition; Context (language use); Machine Learning (stat.ML); 02 engineering and technology; Computer Science - Sound; Domain (software engineering); Machine Learning (cs.LG); Statistics - Machine Learning; Audio and Speech Processing (eess.AS); H.5.5; 0202 electrical engineering, electronic engineering, information engineering; Source separation; Feature (machine learning); FOS: Electrical engineering, electronic engineering, information engineering; I.2.6; business.industry; Deep learning; 020206 networking & telecommunications; 62, 68; Term (time); Spectrogram; 020201 artificial intelligence & image processing; Artificial intelligence; business; Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
In this paper we study deep learning-based music source separation, and explore using an alternative loss to the standard spectrogram pixel-level L2 loss for model training. Our main contribution is in demonstrating that adding a high-level feature loss term, extracted from the spectrograms using a VGG net, can improve separation quality vis-a-vis a pure pixel-level loss. We show this improvement in the context of the MMDenseNet, a state-of-the-art deep learning model for this task, for the extraction of drums and vocal sounds from songs in the musdb18 database, covering a broad range of western music genres. We believe that this finding can be generalized and applied to broader machine learning-based systems in the audio domain.
Comment: Accepted for presentation at the 27th European Signal Processing Conference (EUSIPCO 2019)
- Published
- 2019
7. Disentangled Dynamic Representations from Unordered Data
- Author
- Helminger, Leonhard; Djelouah, Abdelaziz; Gross, Markus; and Weber, Romann M.
- Subjects
FOS: Computer and information sciences; Computer Science - Machine Learning; Statistics - Machine Learning; Machine Learning (stat.ML); Machine Learning (cs.LG)
- Abstract
We present a deep generative model that learns disentangled static and dynamic representations of data from unordered input. Our approach exploits regularities in sequential data that exist regardless of the order in which the data is viewed. The result of our factorized graphical model is a well-organized and coherent latent space for data dynamics. We demonstrate our method on several synthetic dynamic datasets and real video data featuring various facial expressions and head poses.
Comment: Symposium on Advances in Approximate Bayesian Inference, 2018
- Published
- 2018
8. Telemetry Anomaly Detection System Using Machine Learning to Streamline Mission Operations
- Author
- Michela Munoz Fernandez, Yisong Yue, and Romann M. Weber
- Subjects
Decision support system; Mission operations; Spacecraft; SIMPLE (military communications protocol); Computer science; business.industry; Housekeeping (computing); ComputerApplications_COMPUTERSINOTHERSYSTEMS; Mars Exploration Program; Machine learning; computer.software_genre; Telemetry; Anomaly detection; Artificial intelligence; business; computer
- Abstract
Spacecraft housekeeping telemetry is monitored at flight control centers by the operations engineers using tools that can perform limit checking or simple trend analysis. Recent developments in machine learning techniques for anomaly detection enable the implementation of more sophisticated systems that aim to augment current state-of-the-art mission tools to provide valuable decision support for the spacecraft operators, assisting in anomaly detection and potentially saving console time for the engineers. We will show some results of the implementation of an anomaly detection tool for the NASA Mars Science Laboratory mission.
- Published
- 2017
9. Decision-tree analysis of control strategies
- Author
- Brett R. Fajen and Romann M. Weber
- Subjects
Biomedical Research; business.industry; Control (management); Decision tree; Experimental data; Experimental and Cognitive Psychology; Models, Theoretical; Machine learning; computer.software_genre; Task (project management); Range (mathematics); Identification (information); Arts and Humanities (miscellaneous); Data Interpretation, Statistical; Component (UML); Statistics; Developmental and Educational Psychology; Data Mining; Humans; Artificial intelligence; Psychology; Focus (optics); business; computer; Behavioral Research
- Abstract
A major focus of research on visually guided action is the identification of control strategies that map optical information to actions. The traditional approach has been to test the behavioral predictions of a few hypothesized strategies against subject behavior in environments in which various manipulations of available information have been made. While important and compelling results have been achieved with these methods, they are potentially limited by small sets of hypotheses and the methods used to test them. In this study, we introduce a novel application of data-mining techniques in an analysis of experimental data that is able to both describe and model human behavior. This method permits the rapid testing of a wide range of possible control strategies using arbitrarily complex combinations of optical variables. Through the use of decision-tree techniques, subject data can be transformed into an easily interpretable, algorithmic form. This output can then be immediately incorporated into a working model of subject behavior. We tested the effectiveness of this method in identifying the optical information used by human subjects in a collision-avoidance task. Our results comport with published research on collision-avoidance control strategies while also providing additional insight not possible with traditional methods. Further, the modeling component of our method produces behavior that closely resembles that of the subjects upon whose data the models were based. Taken together, the findings demonstrate that data-mining techniques provide powerful new tools for analyzing human data and building models that can be applied to a wide range of perception-action tasks, even outside the visual-control setting we describe.
- Published
- 2014
10. Comparative Color Categories
- Author
- Romann M. Weber and Mark Changizi
- Published
- 2016
11. Contents Vol. 77, 2011
- Author
- Nancy G. Forger, Melissa M. Holmes, Yibayiri O. Sanogo, Mark A. Changizi, Alison M. Bell, Mark Band, Jeff J. Anyan, Matz Larsson, Alexandra Obregon, Ruth Morona, Druck Reinhardt Druck Basel, Marianne L. Seney, Amanda Holley, Agustín González, Satz Mengensatzproduktion, Romann M. Weber, Ritesh Kotecha, Shala J. Hankison, Lynn Bengston, Joseph Palazzo, Bruce D. Goldman, and Jesús M. López
- Subjects
Behavioral Neuroscience; Developmental Neuroscience
- Published
- 2011
12. Are wet-induced wrinkled fingers primate rain treads?
- Author
- Joseph Palazzo, Romann M. Weber, Ritesh Kotecha, and Mark A. Changizi
- Subjects
Primates; Hand Strength; Rain; Water; Anatomy; Biology; Adaptation, Physiological; Biological Evolution; body regions; Fingers; Behavioral Neuroscience; Developmental Neuroscience; medicine; Animals; Humans; Animal behavior; medicine.symptom; Composite material; Wrinkle
- Abstract
Wet fingers and toes eventually wrinkle, and this is commonly attributed by lay opinion to local osmotic reactions. However, nearly a century ago surgeons observed that no wrinkling occurs if a nerve to the finger has been cut. Here we provide evidence that, rather than being an accidental side effect of wetness, wet-induced wrinkles have been selected to enhance grip in wet conditions. We show that their morphology has the signature properties of drainage networks, enabling efficient removal of water from the gripped surface.
- Published
- 2010
13. Visual control strategies for the interception of moving targets on foot
- Author
- Brett R. Fajen and Romann M. Weber
- Subjects
Ophthalmology; medicine.medical_specialty; Physical medicine and rehabilitation; Computer science; medicine; Interception; Visual control; Sensory Systems; Foot (unit)
- Published
- 2011
14. Audio Feature Extraction with Convolutional Neural Autoencoders with Application to Voice Conversion
- Author
- Elhami, Golnooshsadat and Weber, Romann M.
- Subjects
Deep Neural Networks; Audio Processing; Voice Conversion; Short-Time Discrete Cosine Transformation; Feature Extraction; Convolutional Autoencoder
- Abstract
Feature extraction is a key step in many machine learning and signal processing applications. For speech signals in particular, it is important to derive features that contain both the vocal characteristics of the speaker and the content of the speech. In this paper, we introduce a convolutional auto-encoder (CAE) to extract features from speech represented via a proposed short-time discrete cosine transform (STDCT). We then introduce a deep neural mapping at the encoding bottleneck to enable converting a source speaker's speech to a target speaker's speech while preserving the source-speech content. We further compare this approach to clustering-based and linear mappings.