1,101 results for "Bregman divergence"
Search Results
2. The Extended Bregman Divergence and Parametric Estimation in Continuous Models.
- Author
-
Basak, Sancharee and Basu, Ayanendranath
- Abstract
Under standard regularity conditions, the maximum likelihood estimator (MLE) is the most efficient estimator at the model. However, modern practice recognizes that it is rare for the hypothesized model to hold exactly, and small departures from it are never entirely unexpected. But classical estimators like the MLE are extremely sensitive to the presence of noise in the data. Within the class of robust estimators, which constitutes parametric inference techniques designed to overcome the problems due to model misspecification and noise, minimum distance estimators have become quite popular in recent times. In particular, density-based distances under the umbrella of the Bregman divergence have been demonstrated to have several advantages. Here we will consider an extension of the ordinary Bregman divergence, and investigate the scope of parametric estimation under continuous models using this extended divergence proposal. Many of our illustrations will be based on the GSB divergence, a particular member of the extended Bregman divergence, which appears to hold high promise within the robustness area. To establish the usefulness of the proposed minimum distance estimation procedure, we will provide detailed theoretical investigations followed by substantial numerical verifications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
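The GSB divergence itself is not spelled out in the abstract above, so the following sketch only illustrates the general idea of minimum density-based-divergence estimation in a continuous model, using the density power divergence (a standard member of the Bregman family) as a stand-in. The normal model, the choice alpha = 0.5, and all names are illustrative assumptions, not the authors' method.

import numpy as np
from scipy import optimize, stats

def dpd_objective(params, x, alpha=0.5):
    """Empirical density power divergence objective for a N(mu, sigma) model.

    Up to a theta-independent constant, the minimum DPD objective is
        integral f_theta^(1+alpha) - (1+alpha)/alpha * mean(f_theta(x_i)^alpha);
    for the normal density the integral has the closed form
        (2*pi*sigma^2)^(-alpha/2) / sqrt(1+alpha).
    """
    mu, log_sigma = params
    sigma = np.exp(log_sigma)
    f = stats.norm.pdf(x, mu, sigma)
    integral_term = (2 * np.pi * sigma**2) ** (-alpha / 2) / np.sqrt(1 + alpha)
    return integral_term - (1 + alpha) / alpha * np.mean(f**alpha)

rng = np.random.default_rng(0)
# 95% of the data from N(0, 1), 5% gross outliers at 10.
x = np.concatenate([rng.normal(0, 1, 190), np.full(10, 10.0)])

mle = (x.mean(), x.std())                      # highly sensitive to the outliers
res = optimize.minimize(dpd_objective, x0=[np.median(x), 0.0], args=(x,))
mu_hat, sigma_hat = res.x[0], np.exp(res.x[1])
print("MLE:", mle, " minimum-DPD:", (mu_hat, sigma_hat))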
3. Understanding Higher-Order Interactions in Information Space.
- Author
-
Edelsbrunner, Herbert, Ölsböck, Katharina, and Wagner, Hubert
- Subjects
-
UNCERTAINTY (Information theory), INFORMATION theory, NON-Euclidean geometry, METRIC spaces, INFORMATION measurement
- Abstract
Methods used in topological data analysis naturally capture higher-order interactions in point cloud data embedded in a metric space. This methodology was recently extended to data living in an information space, by which we mean a space measured with an information theoretical distance. One such setting is a finite collection of discrete probability distributions embedded in the probability simplex measured with the relative entropy (Kullback–Leibler divergence). More generally, one can work with a Bregman divergence parameterized by a different notion of entropy. While theoretical algorithms exist for this setup, there is a paucity of implementations for exploring and comparing geometric-topological properties of various information spaces. The interest of this work is therefore twofold. First, we propose the first robust algorithms and software for geometric and topological data analysis in information space. Perhaps surprisingly, despite working with Bregman divergences, our design reuses robust libraries for the Euclidean case. Second, using the new software, we take the first steps towards understanding the geometric-topological structure of these spaces. In particular, we compare them with the more familiar spaces equipped with the Euclidean and Fisher metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
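As a quick illustration of the construction referred to in entry 3 above (a Bregman divergence generated by an entropy-like convex function), the sketch below checks numerically that the generator "negative Shannon entropy" yields the relative entropy (Kullback-Leibler divergence) on the probability simplex. The function names are hypothetical.

import numpy as np

def bregman(F, gradF, p, q):
    """Bregman divergence D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>."""
    return F(p) - F(q) - np.dot(gradF(q), p - q)

# Negative Shannon entropy as the generating convex function.
neg_entropy = lambda p: np.sum(p * np.log(p))
grad_neg_entropy = lambda p: np.log(p) + 1.0

p = np.array([0.2, 0.5, 0.3])
q = np.array([0.4, 0.4, 0.2])

kl = np.sum(p * np.log(p / q))                       # relative entropy
print(bregman(neg_entropy, grad_neg_entropy, p, q))  # matches kl on the simplex
print(kl)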
4. Proximal Galerkin: A Structure-Preserving Finite Element Method for Pointwise Bound Constraints
- Author
-
Keith, Brendan and Surowiec, Thomas M.
- Published
- 2024
- Full Text
- View/download PDF
5. Reverse em-problem based on Bregman divergence and its application to classical and quantum information theory
- Author
-
Hayashi, Masahito
- Published
- 2024
- Full Text
- View/download PDF
6. Meta-learning to optimise: loss functions and update rules
- Author
-
Gao, Boyan, Hospedales, Timothy, and Bilen, Hakan
- Subjects
Meta-learning, Loss Functions, Update Rules, learning to learn, invariant meta-knowledge, learned meta-knowledge, machine learning, meta-learn loss functions, parameterising a loss function, Taylor polynomial loss, Automated Robust Loss, ARL, Domain Generalisation, Implicit Function Theorem, Empirical Risk Minimisation, ERM, MetaMD, Mirror Descent-based optimisers, Bregman divergence
- Abstract
Meta-learning, aka "learning to learn", aims to extract invariant meta-knowledge from a group of tasks in order to improve the generalisation of the base models on novel tasks. The learned meta-knowledge takes various forms, such as neural architectures, network initialisations, loss functions and optimisers. In this thesis, we study learning to optimise through meta-learning, with two main components: loss function learning and optimiser learning. At a high level, the two components interact closely: optimisers provide update rules that modify the model parameters using the gradient information generated by the loss function. We work on the meta-model's re-usability across tasks. In the ideal case, the learned meta-model should provide a "plug-and-play" drop-in which can be used without further modification or computational expense with any new dataset or even new model architecture. We apply these ideas to address three challenges in machine learning, namely improving the convergence rate of optimisers, learning with noisy labels, and learning models that are robust to domain shift. We first study how to meta-learn loss functions. Unlike most prior work parameterising a loss function in a black-box fashion with neural networks, we meta-learn a Taylor polynomial loss and apply it to improve the robustness of the base model to label noise in the training data. The good performance of deep neural networks relies on gold-standard labelled data. However, in practice, wrongly labelled data is common due to human error and imperfect automatic annotation processes. We draw inspiration from hand-designed losses that modify the training dynamic to reduce the impact of noisy labels. Going beyond existing hand-designed robust losses, we develop a bi-level optimisation meta-learner, Automated Robust Loss (ARL), that discovers novel robust losses which outperform the best prior hand-designed robust losses. A second contribution, ITL, extends the loss function learning idea to the problem of Domain Generalisation (DG). DG is the challenging scenario of deploying a model trained on one data distribution to a novel data distribution. Compared to ARL, where the target loss function is optimised by a genetic algorithm, ITL benefits from gradient-based optimisation of the loss parameters. By leveraging the mathematical guarantee from the Implicit Function Theorem, the hypergradient required to update the loss can be computed efficiently without differentiating through the whole base-model training trajectory. This reduces the computational cost dramatically in the meta-learning stage and accelerates the loss function learning process by providing a more accurate hypergradient. Applying our learned loss to the DG problem, we are able to learn base models that exhibit increased robustness to domain shift compared to the state-of-the-art. Importantly, the modular plug-and-play nature of our learned loss means that it is simple to use, requiring changes of just a few lines of code to standard Empirical Risk Minimisation (ERM) learners. We finally study accelerating the optimisation process itself by designing a meta-learning algorithm that searches for efficient optimisers, termed MetaMD. We tackle this problem by meta-learning Mirror Descent-based optimisers through learning the strongly convex function parameterising a Bregman divergence. While standard meta-learners require a validation set to define a meta-objective for learning, MetaMD instead optimises the convergence rate bound.
The resulting learned optimiser uniquely has mathematically guaranteed convergence and generalisation properties.
- Published
- 2023
- Full Text
- View/download PDF
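The thesis in entry 6 above meta-learns the convex generator of a Bregman divergence inside mirror descent. The sketch below shows only the underlying mirror-descent template with a fixed, hand-chosen generator (negative entropy on the simplex, giving the exponentiated-gradient update); it is not the learned MetaMD optimiser, and the step size and test problem are arbitrary.

import numpy as np

def mirror_descent_simplex(grad, x0, step=0.1, iters=200):
    """Mirror descent on the probability simplex.

    The mirror map is the negative entropy h(x) = sum_i x_i log x_i, whose
    Bregman divergence is the KL divergence; the resulting update is the
    classical exponentiated-gradient / multiplicative-weights step.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x * np.exp(-step * grad(x))   # gradient step taken in the dual (mirror) space
        x /= x.sum()                      # Bregman projection back onto the simplex
    return x

# Example: minimise <x, c> + 0.5*||x||^2 over the simplex.
c = np.array([0.3, -0.1, 0.5, 0.2])
grad = lambda x: c + x
print(mirror_descent_simplex(grad, np.ones(4) / 4))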
7. Robust estimation for general integer-valued autoregressive models based on the exponential-polynomial divergence.
- Author
-
Kim, Byungsoo and Lee, Sangyeol
- Subjects
-
EXPONENTIAL stability, TIME series analysis, POWER density, DATA analysis
- Abstract
In this study, we develop a robust estimator for integer-valued one-parameter exponential family autoregressive models, named general integer-valued autoregressive models. This model accommodates a broad class of integer-valued time series models. In particular, we propose a robust estimation method that minimizes the exponential-polynomial divergence (EPD), which belongs to the Bregman divergence family. EPD subsumes the density power divergence (DPD), which has been extensively studied by many authors over the past few decades. Under regularity conditions, the minimum EPD estimator (MEPDE) is shown to be consistent and asymptotically normal. Comparing the performance of MEPDE with the minimum DPD estimator, we substantiate the validity of MEPDE through a simulation study and real data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
8. A Universal Accelerated Primal–Dual Method for Convex Optimization Problems.
- Author
-
Luo, Hao
- Subjects
-
LYAPUNOV functions, CONVEX functions
- Abstract
This work presents a universal accelerated primal–dual method for affinely constrained convex optimization problems. It can handle both Lipschitz and Hölder gradients but does not need to know the smoothness level of the objective function. In the line search part, it uses dynamically decreasing parameters and produces approximate Lipschitz constants of moderate magnitude. In addition, based on a suitable discrete Lyapunov function and tight decay estimates of some differential/difference inequalities, a universal optimal mixed-type convergence rate is established. Some numerical tests are provided to confirm the efficiency of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
9. On the Bregman-proximal iterative algorithm for the monotone inclusion problem in Banach spaces.
- Author
-
Tang, Yan, Zhang, Shiqing, and Cho, Yeol Je
- Subjects
-
BANACH spaces, ALGORITHMS, GEOMETRY
- Abstract
In this paper, we focus on the solution of a class of monotone inclusion problems in reflexive Banach spaces. To reflect the geometry of the space and the operator, a more general proximal point iterative algorithm with Bregman divergence is proposed and some strong convergence results for the proposed scheme under standard assumptions are obtained. Meanwhile, the convex optimization problems and the critical point problems are studied in the applications, and the recovery of the sparse signal is simulated in the numerical experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
10. Divergences Induced by the Cumulant and Partition Functions of Exponential Families and Their Deformations Induced by Comparative Convexity.
- Author
-
Nielsen, Frank
- Subjects
-
EXPONENTIAL functions, EXPONENTIAL families (Statistics), INFORMATION theory, SMOOTHNESS of functions, ENERGY function, MACHINE learning
- Abstract
Exponential families are statistical models which are the workhorses in statistics, information theory, and machine learning, among others. An exponential family can either be normalized subtractively by its cumulant or free energy function, or equivalently normalized divisively by its partition function. Both the cumulant and partition functions are strictly convex and smooth functions inducing corresponding pairs of Bregman and Jensen divergences. It is well known that skewed Bhattacharyya distances between the probability densities of an exponential family amount to skewed Jensen divergences induced by the cumulant function between their corresponding natural parameters, and that in limit cases the sided Kullback–Leibler divergences amount to reverse-sided Bregman divergences. In this work, we first show that the α-divergences between non-normalized densities of an exponential family amount to scaled α-skewed Jensen divergences induced by the partition function. We then show how comparative convexity with respect to a pair of quasi-arithmetical means allows both convex functions and their arguments to be deformed, thereby defining dually flat spaces with corresponding divergences when ordinary convexity is preserved. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
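A small numerical check of one relation mentioned in the abstract of entry 10: for an exponential family, the Kullback–Leibler divergence between two members equals the Bregman divergence of the cumulant function with the natural parameters swapped. The Poisson family is used here purely as an illustrative example.

import numpy as np

# Natural parameterisation of the Poisson family: cumulant (log-partition)
# function psi(theta) = exp(theta), mean lambda = exp(theta).
psi = np.exp          # cumulant function
dpsi = np.exp         # its derivative

def bregman_cumulant(theta_a, theta_b):
    """B_psi(theta_a, theta_b) = psi(a) - psi(b) - psi'(b) * (a - b)."""
    return psi(theta_a) - psi(theta_b) - dpsi(theta_b) * (theta_a - theta_b)

def kl_poisson(lam1, lam2):
    return lam1 * np.log(lam1 / lam2) - lam1 + lam2

lam1, lam2 = 2.0, 5.0
theta1, theta2 = np.log(lam1), np.log(lam2)

# KL(Poisson(lam1) || Poisson(lam2)) equals the cumulant-function Bregman
# divergence with the natural parameters in reverse ("reverse-sided") order.
print(kl_poisson(lam1, lam2), bregman_cumulant(theta2, theta1))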
11. Delay-tolerant distributed Bregman proximal algorithms.
- Author
-
Chraibi, S., Iutzeler, F., Malick, J., and Rogozin, A.
- Abstract
Many problems in machine learning write as the minimization of a sum of individual loss functions over the training examples. These functions are usually differentiable but, in some cases, their gradients are not Lipschitz continuous, which compromises the use of (proximal) gradient algorithms. Fortunately, changing the geometry and using Bregman divergences can alleviate this issue in several applications, such as for Poisson linear inverse problems. However, the Bregman operation makes the aggregation of several points and gradients more involved, hindering the distribution of computations for such problems. In this paper, we propose an asynchronous variant of the Bregman proximal-gradient method, able to adapt to any centralized computing system. In particular, we prove that the algorithm copes with arbitrarily long delays and we illustrate its behaviour on distributed Poisson inverse problems. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
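The contribution of entry 11 is the asynchronous, delay-tolerant distributed variant; the sketch below only shows the basic single-worker Bregman proximal-gradient step it builds on, for a toy Poisson linear inverse problem with the Burg-entropy geometry. The step-size choice, problem sizes, and names are illustrative assumptions.

import numpy as np

def bregman_prox_grad_poisson(A, y, x0, iters=2000):
    # Poisson negative log-likelihood f(x) = sum_i [(Ax)_i - y_i * log (Ax)_i].
    # Distance-generating function: Burg entropy h(x) = -sum_j log x_j, whose
    # Bregman divergence replaces the Euclidean proximal term.  f is smooth
    # relative to h with constant sum(y), so no Lipschitz gradient is needed,
    # and each Bregman proximal-gradient step has the closed form below.
    x = np.asarray(x0, dtype=float)
    step = 1.0 / y.sum()
    for _ in range(iters):
        grad = A.T @ (1.0 - y / (A @ x))
        x = x / (1.0 + step * x * grad)   # solves grad h(x+) = grad h(x) - step * grad f(x)
    return x

rng = np.random.default_rng(1)
A = rng.uniform(0.1, 1.0, size=(30, 10))
x_true = rng.uniform(0.5, 2.0, size=10)
y = rng.poisson(A @ x_true).astype(float)
print(bregman_prox_grad_poisson(A, y, np.ones(10)))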
12. AN INTUITIVE INTRODUCTION TO INFORMATION GEOMETRY.
- Author
-
ADAMČÍK, MARTIN
- Subjects
CONTINUOUS distributions, DISTRIBUTION (Probability theory), GEOMETRY, PRIOR learning
- Abstract
In this paper, we recover some traditional results in the geometry of probability distributions, and in particular the convergence of the alternating minimisation procedure, without actually referring to probability distributions. We will do this by discussing a new general concept of two types of points: admissible and agreeable, inspired by multi–agent uncertain reasoning and belief merging. On the one hand, this presents a unique opportunity to make traditional results accessible to a wider audience as no prior knowledge of the topic is required. On the other hand, it allows us to contemplate how a group of rational agents would seek an agreement given their beliefs without necessarily expressing it in terms of probability distributions, focusing instead on logical properties. Finally, we recover Euclidean and Hilbertian settings of discrete and continuous probability distributions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. Adaptive Variants of the Extrapolation from the Past Method and the Operator Extrapolation Method
- Author
-
Denysov, Serhii, Semenov, Vladimir, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Shkarlet, Serhiy, editor, Morozov, Anatoliy, editor, Palagin, Alexander, editor, Vinnikov, Dmitri, editor, Stoianov, Nikolai, editor, Zhelezniak, Mark, editor, and Kazymyr, Volodymyr, editor
- Published
- 2023
- Full Text
- View/download PDF
14. General Algorithm for Learning from Grouped Uncoupled Data and Pairwise Comparison Data
- Author
-
Kohjima, Masahiro, Nambu, Yuta, Kurauchi, Yuki, Yamamoto, Ryuji, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Tanveer, Mohammad, editor, Agarwal, Sonali, editor, Ozawa, Seiichi, editor, Ekbal, Asif, editor, and Jatowt, Adam, editor
- Published
- 2023
- Full Text
- View/download PDF
15. Variational representations of annealing paths: Bregman information under monotonic embedding
- Author
-
Brekelmans, Rob and Nielsen, Frank
- Published
- 2024
- Full Text
- View/download PDF
16. The generalized Pythagorean theorem on the compactifications of certain dually flat spaces via toric geometry
- Author
-
Fujita, Hajime
- Published
- 2024
- Full Text
- View/download PDF
17. More is less: inducing sparsity via overparameterization.
- Author
-
Chou, Hung-Hsu, Maly, Johannes, and Rauhut, Holger
- Subjects
-
IMPLICIT bias, COMPRESSED sensing, DEEP learning, LENGTH measurement
- Abstract
In deep learning, it is common to overparameterize neural networks, that is, to use more parameters than training samples. Quite surprisingly, training the neural network via (stochastic) gradient descent leads to models that generalize very well, while classical statistics would suggest overfitting. In order to gain understanding of this implicit bias phenomenon, we study the special case of sparse recovery (compressed sensing), which is of interest in its own right. More precisely, in order to reconstruct a vector from underdetermined linear measurements, we introduce a corresponding overparameterized square loss functional, where the vector to be reconstructed is deeply factorized into several vectors. We show that, if there exists an exact solution, vanilla gradient flow for the overparameterized loss functional converges to a good approximation of the solution of minimal $\ell_1$-norm. The latter is well known to promote sparse solutions. As a by-product, our results significantly improve the sample complexity for compressed sensing via gradient flow/descent on overparameterized models derived in previous works. The theory accurately predicts the recovery rate in numerical experiments. Our proof relies on analyzing a certain Bregman divergence of the flow. This bypasses the obstacles caused by non-convexity and should be of independent interest. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
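The paper in entry 17 analyses gradient flow on a deep Hadamard-product factorisation; as a rough illustration of the implicit l1 bias it describes, the sketch below runs plain gradient descent on the simplest signed two-factor reparameterisation x = u*u - v*v with a small initialisation. Problem sizes, the learning rate, and the iteration count are arbitrary choices, not the paper's setting.

import numpy as np

rng = np.random.default_rng(0)
n, m, s = 100, 40, 5                       # ambient dimension, measurements, sparsity
x_star = np.zeros(n)
x_star[rng.choice(n, s, replace=False)] = rng.normal(0, 1, s)
A = rng.normal(0, 1 / np.sqrt(m), size=(m, n))
y = A @ x_star

# Overparameterised factorisation x = u*u - v*v, started near zero.
u = np.full(n, 1e-3)
v = np.full(n, 1e-3)
lr = 0.01
for _ in range(50000):
    x = u * u - v * v
    r = A.T @ (A @ x - y)                  # gradient of 0.5*||A x - y||^2 w.r.t. x
    u -= lr * 2 * u * r                    # chain rule through the factorisation
    v -= lr * (-2) * v * r
print("relative error:",
      np.linalg.norm(u * u - v * v - x_star) / np.linalg.norm(x_star))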
18. Block-Active ADMM to Minimize NMF with Bregman Divergences.
- Author
-
Li, Xinyao and Tyagi, Akhilesh
- Subjects
-
MATRIX decomposition, COMPUTER vision, NONNEGATIVE matrices, VISUAL fields, MACHINE learning, CLUSTER analysis (Statistics), IMAGE processing
- Abstract
Over the last ten years, there has been significant interest in employing nonnegative matrix factorization (NMF) to reduce dimensionality and enable more efficient clustering analysis in machine learning. This technique has been applied in various image processing applications within the fields of computer vision and sensor-based systems. Many algorithms exist to solve the NMF problem. Among these algorithms, the alternating direction method of multipliers (ADMM) and its variants are among the most popular methods used in practice. In this paper, we propose a block-active ADMM method to minimize the NMF problem with general Bregman divergences. The subproblems in the ADMM are solved iteratively by a block-coordinate-descent-type (BCD-type) method. In particular, each block is chosen directly based on the stationary condition. As a result, we are able to use far fewer auxiliary variables, and the proposed algorithm converges faster than previously proposed algorithms. From a theoretical point of view, the proposed algorithm is proved to converge to a stationary point sublinearly. We also conduct a series of numerical experiments to demonstrate the superiority of the proposed algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
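For orientation, the sketch below spells out the kind of objective the paper in entry 18 minimises, NMF under the generalised KL divergence (a Bregman divergence), but solves it with the classical Lee-Seung multiplicative updates rather than the block-active ADMM proposed above; sizes and names are illustrative.

import numpy as np

def nmf_generalized_kl(V, rank, iters=500, eps=1e-10, seed=0):
    """NMF minimising the generalised KL divergence
        D(V || WH) = sum_ij [ V_ij * log(V_ij / (WH)_ij) - V_ij + (WH)_ij ],
    a Bregman divergence (generator x log x), via Lee-Seung multiplicative
    updates.  This is a classical baseline, not the paper's ADMM solver.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.uniform(0.1, 1.0, (m, rank))
    H = rng.uniform(0.1, 1.0, (rank, n))
    for _ in range(iters):
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.T @ np.ones_like(V) + eps)
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (np.ones_like(V) @ H.T + eps)
    return W, H

V = np.abs(np.random.default_rng(1).normal(size=(20, 30))) + 0.1
W, H = nmf_generalized_kl(V, rank=5)
print("generalised KL after fit:",
      np.sum(V * np.log(V / (W @ H)) - V + W @ H))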
19. Generalized self-supervised contrastive learning with Bregman divergence for image recognition.
- Author
-
Li, Zhiyuan and Ralescu, Anca
- Subjects
-
SUPERVISED learning, IMAGE recognition (Computer vision), DISTRIBUTION (Probability theory)
- Abstract
• A generalized contrastive learning framework is proposed.
• Bregman divergence shows the inner connection to contrastive learning.
• Proposed method learns latent features in geometric and probabilistic space.
Contrastive learning techniques continue to receive a lot of attention in the self-supervised learning area. Specifically, the learned distance features can be further utilized to capture the distance between latent features in the embedding space and improve the performance of both supervised and unsupervised learning tasks. However, most contrastive learning efforts are focused on learning the geometric distance between the latent features, while the underlying probability distribution is usually ignored. To address this challenge, we propose a novel generalized contrastive loss for self-supervised learning using the Bregman divergence by investigating the hidden relationship between the contrastive loss and the Bregman divergence. Our method considers the hybrid divergence that leverages the Euclidean-based distance and probabilistic divergence, which improves the quality of self-supervised learned feature representation. Besides theory, extensive experimental results demonstrate the effectiveness of our method compared to other state-of-the-art self-supervised methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
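A loose sketch of the "hybrid divergence" idea described in entry 19: an InfoNCE-style contrastive loss whose similarity mixes a Euclidean term with a KL (Bregman) term between softmax-normalised embeddings. The weighting, temperature, and overall form are assumptions, not the paper's exact loss.

import numpy as np

def hybrid_contrastive_loss(z_anchor, z_pos, z_negs, lam=0.5, tau=0.1):
    """InfoNCE-style loss with a similarity that mixes a geometric term
    (negative squared Euclidean distance) and a probabilistic Bregman term
    (negative KL between softmax-normalised embeddings)."""
    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def sim(a, b):
        p, q = softmax(a), softmax(b)
        kl = np.sum(p * np.log(p / q))
        return -(1 - lam) * np.sum((a - b) ** 2) - lam * kl

    logits = np.array([sim(z_anchor, z_pos)] +
                      [sim(z_anchor, z) for z in z_negs]) / tau
    # negative log-probability of the positive pair (stable log-sum-exp)
    return -logits[0] + np.log(np.sum(np.exp(logits - logits.max()))) + logits.max()

rng = np.random.default_rng(0)
print(hybrid_contrastive_loss(rng.normal(size=8), rng.normal(size=8),
                              [rng.normal(size=8) for _ in range(5)]))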
20. Geometry of EM and related iterative algorithms
- Author
-
Hino, Hideitsu, Akaho, Shotaro, and Murata, Noboru
- Published
- 2024
- Full Text
- View/download PDF
21. Improved face recognition method using SVM-MRF with KTBD based KCM segmentation approach
- Author
-
Rangayya, Virupakshappa, and Patil, Nagabhushan
- Published
- 2024
- Full Text
- View/download PDF
22. Generalized Fisher Kernel with Bregman Divergence
- Author
-
Figuera, Pau, Cuzzocrea, Alfredo, Bringas, Pablo García, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, García Bringas, Pablo, editor, Pérez García, Hilde, editor, Martínez de Pisón, Francisco Javier, editor, Villar Flecha, José Ramón, editor, Troncoso Lora, Alicia, editor, de la Cal, Enrique A., editor, Herrero, Álvaro, editor, Martínez Álvarez, Francisco, editor, Psaila, Giuseppe, editor, Quintián, Héctor, editor, and Corchado, Emilio, editor
- Published
- 2022
- Full Text
- View/download PDF
23. On a Nonconvex Distance-Based Clustering Problem
- Author
-
Gruzdeva, Tatiana V., Ushakov, Anton V., Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Pardalos, Panos, editor, Khachay, Michael, editor, and Mazalov, Vladimir, editor
- Published
- 2022
- Full Text
- View/download PDF
24. Divergences Induced by the Cumulant and Partition Functions of Exponential Families and Their Deformations Induced by Comparative Convexity
- Author
-
Frank Nielsen
- Subjects
convex duality, exponential family, Bregman divergence, Jensen divergence, Bhattacharyya distance, Rényi divergence, Science, Astrophysics, QB460-466, Physics, QC1-999
- Abstract
Exponential families are statistical models which are the workhorses in statistics, information theory, and machine learning, among others. An exponential family can either be normalized subtractively by its cumulant or free energy function, or equivalently normalized divisively by its partition function. Both the cumulant and partition functions are strictly convex and smooth functions inducing corresponding pairs of Bregman and Jensen divergences. It is well known that skewed Bhattacharyya distances between the probability densities of an exponential family amount to skewed Jensen divergences induced by the cumulant function between their corresponding natural parameters, and that in limit cases the sided Kullback–Leibler divergences amount to reverse-sided Bregman divergences. In this work, we first show that the α-divergences between non-normalized densities of an exponential family amount to scaled α-skewed Jensen divergences induced by the partition function. We then show how comparative convexity with respect to a pair of quasi-arithmetical means allows both convex functions and their arguments to be deformed, thereby defining dually flat spaces with corresponding divergences when ordinary convexity is preserved.
- Published
- 2024
- Full Text
- View/download PDF
25. Bregman Three-Operator Splitting Methods.
- Author
-
Jiang, Xin and Vandenberghe, Lieven
- Subjects
-
ALGORITHMS
- Abstract
The paper presents primal–dual proximal splitting methods for convex optimization, in which generalized Bregman distances are used to define the primal and dual proximal update steps. The methods extend the primal and dual Condat–Vũ algorithms and the primal–dual three-operator (PD3O) algorithm. The Bregman extensions of the Condat–Vũ algorithms are derived from the Bregman proximal point method applied to a monotone inclusion problem. Based on this interpretation, a unified framework for the convergence analysis of the two methods is presented. We also introduce a line search procedure for stepsize selection in the Bregman dual Condat–Vũ algorithm applied to equality-constrained problems. Finally, we propose a Bregman extension of PD3O and analyze its convergence. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
26. Unsupervised domain adaptation via transferred local Fisher discriminant analysis
- Author
-
Zandifar, Mozhdeh, Rezaei, Samaneh, and Tahmoresnezhad, Jafar
- Published
- 2023
- Full Text
- View/download PDF
27. Membership-Mappings for Data Representation Learning: A Bregman Divergence Based Conditionally Deep Autoencoder
- Author
-
Kumar, Mohit, Moser, Bernhard, Fischer, Lukas, Freudenthaler, Bernhard, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Kotsis, Gabriele, editor, Tjoa, A Min, editor, Khalil, Ismail, editor, Moser, Bernhard, editor, Mashkoor, Atif, editor, Sametinger, Johannes, editor, Fensel, Anna, editor, Martinez-Gil, Jorge, editor, Fischer, Lukas, editor, Czech, Gerald, editor, Sobieczky, Florian, editor, and Khan, Sohail, editor
- Published
- 2021
- Full Text
- View/download PDF
28. Chain Rule Optimal Transport
- Author
-
Nielsen, Frank, Sun, Ke, Celebi, Emre, Series Editor, Chen, Jingdong, Series Editor, Gopi, E. S., Series Editor, Neustein, Amy, Series Editor, Poor, H. Vincent, Series Editor, and Nielsen, Frank, editor
- Published
- 2021
- Full Text
- View/download PDF
29. On Geodesic Triangles with Right Angles in a Dually Flat Space
- Author
-
Nielsen, Frank, Celebi, Emre, Series Editor, Chen, Jingdong, Series Editor, Gopi, E. S., Series Editor, Neustein, Amy, Series Editor, Poor, H. Vincent, Series Editor, and Nielsen, Frank, editor
- Published
- 2021
- Full Text
- View/download PDF
30. Testing the fit of relational models.
- Author
-
Klimova, Anna and Rudas, Tamás
- Subjects
-
LOG-linear models, CHI-square distribution, GOODNESS-of-fit tests, ASYMPTOTIC distribution
- Abstract
Relational models generalize log-linear models to arbitrary discrete sample spaces by specifying effects associated with any subsets of their cells. A relational model may include an overall effect, pertaining to every cell after a reparameterization, and in this case, the properties of the maximum likelihood estimates (MLEs) are analogous to those computed under traditional log-linear models, and the goodness-of-fit tests are also the same. If an overall effect is not present in any reparameterization, the properties of the MLEs are considerably different, and the Poisson and multinomial MLEs are not equivalent. In the Poisson case, if the overall effect is not present, the observed total is not always preserved by the MLE, and the likelihood ratio statistic has a form which can be expressed using the Bregman divergence but does not reduce to its Kullback–Leibler version. The asymptotic chi-squared distribution of the Pearson and likelihood ratio statistics holds, but the generality considered here requires extended proofs. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
31. Revisiting Chernoff Information with Likelihood Ratio Exponential Families.
- Author
-
Nielsen, Frank
- Subjects
-
STATISTICAL hypothesis testing, PROBABILITY measures, INFORMATION theory, GAUSSIAN distribution, LIKELIHOOD ratio tests, COVARIANCE matrices, EXPONENTIAL families (Statistics)
- Abstract
The Chernoff information between two probability measures is a statistical divergence measuring their deviation defined as their maximally skewed Bhattacharyya distance. Although the Chernoff information was originally introduced for bounding the Bayes error in statistical hypothesis testing, the divergence found many other applications due to its empirical robustness property found in applications ranging from information fusion to quantum information. From the viewpoint of information theory, the Chernoff information can also be interpreted as a minmax symmetrization of the Kullback–Leibler divergence. In this paper, we first revisit the Chernoff information between two densities of a measurable Lebesgue space by considering the exponential families induced by their geometric mixtures: The so-called likelihood ratio exponential families. Second, we show how to (i) solve exactly the Chernoff information between any two univariate Gaussian distributions or get a closed-form formula using symbolic computing, (ii) report a closed-form formula of the Chernoff information of centered Gaussians with scaled covariance matrices and (iii) use a fast numerical scheme to approximate the Chernoff information between any two multivariate Gaussian distributions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
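A small numerical illustration of the definition used in entry 31: the Chernoff information as the maximally skewed Bhattacharyya distance, computed here for two univariate Gaussians by direct numerical integration and a one-dimensional search over the skewing parameter. The paper's closed-form formulas and exponential-family machinery are not used; the parameter values are arbitrary.

import numpy as np
from scipy import integrate, optimize, stats

def skewed_bhattacharyya(p, q, alpha):
    """D_alpha(p, q) = -log integral p(x)^alpha * q(x)^(1-alpha) dx."""
    integrand = lambda x: p.pdf(x) ** alpha * q.pdf(x) ** (1 - alpha)
    val, _ = integrate.quad(integrand, -np.inf, np.inf)
    return -np.log(val)

def chernoff_information(p, q):
    """Chernoff information: maximum over alpha in (0,1) of the skewed
    Bhattacharyya distance, found by bounded scalar optimisation."""
    res = optimize.minimize_scalar(lambda a: -skewed_bhattacharyya(p, q, a),
                                   bounds=(1e-6, 1 - 1e-6), method="bounded")
    return -res.fun, res.x   # value and the optimal skewing parameter alpha*

p = stats.norm(0.0, 1.0)
q = stats.norm(2.0, 3.0)
print(chernoff_information(p, q))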
32. Distributed quantized mirror descent for strongly convex optimization over time-varying directed graph.
- Author
-
Xiong, Menghui, Zhang, Baoyong, Yuan, Deming, and Xu, Shengyuan
- Abstract
This paper investigates a distributed strongly convex constrained optimization problem in the non-Euclidean sense, where the bit rate of the considered communication between nodes is assumed to be limited, and the communication topology is represented by a time-varying directed graph. By considering the limitation of communication capacity, the quantization technique is applied in the process of exchanging information over the network. Then a distributed quantized mirror descent (DQMD) algorithm, which uses the Bregman divergence and time-varying quantizers, is developed for the strongly convex optimization under a convex constraint set. The convergence of the developed algorithm is also analyzed. It is shown that the nodes’ state errors are bounded by some terms related to the quantization resolutions, and the sublinear upper-bounds can be guaranteed by choosing appropriate quantization resolutions. Finally, a distributed ridge regression is provided as an example to verify the validity of the proposed method. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
33. Analysis of two versions of relaxed inertial algorithms with Bregman divergences for solving variational inequalities.
- Author
-
Jolaoso, Lateef Olakunle, Sunthrayuth, Pongsakorn, Cholamjiak, Prasit, and Cho, Yeol Je
- Subjects
LIPSCHITZ continuity, HILBERT space, SUBGRADIENT methods, VARIATIONAL inequalities (Mathematics), ALGORITHMS, CONTINUITY
- Abstract
In this paper, we introduce and analyze two new inertial-like algorithms with the Bregman divergences for solving the pseudomonotone variational inequality problem in a real Hilbert space. The first algorithm is inspired by the Halpern-type iteration and the subgradient extragradient method and the second algorithm is inspired by the Halpern-type iteration and Tseng's extragradient method. Under suitable conditions, we prove some strong convergence theorems of the proposed algorithms without assuming the Lipschitz continuity and the sequential weak continuity of the given mapping. Finally, we give some numerical experiments with various types of Bregman divergence to illustrate the main results. In fact, the results presented in this paper improve and generalize the related works in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
34. Block-Active ADMM to Minimize NMF with Bregman Divergences
- Author
-
Xinyao Li and Akhilesh Tyagi
- Subjects
NMF, ADMM, Bregman divergence, block active, imaging sensor, Chemical technology, TP1-1185
- Abstract
Over the last ten years, there has been significant interest in employing nonnegative matrix factorization (NMF) to reduce dimensionality and enable more efficient clustering analysis in machine learning. This technique has been applied in various image processing applications within the fields of computer vision and sensor-based systems. Many algorithms exist to solve the NMF problem. Among these algorithms, the alternating direction method of multipliers (ADMM) and its variants are among the most popular methods used in practice. In this paper, we propose a block-active ADMM method to minimize the NMF problem with general Bregman divergences. The subproblems in the ADMM are solved iteratively by a block-coordinate-descent-type (BCD-type) method. In particular, each block is chosen directly based on the stationary condition. As a result, we are able to use far fewer auxiliary variables, and the proposed algorithm converges faster than previously proposed algorithms. From a theoretical point of view, the proposed algorithm is proved to converge to a stationary point sublinearly. We also conduct a series of numerical experiments to demonstrate the superiority of the proposed algorithm.
- Published
- 2023
- Full Text
- View/download PDF
35. Lie symmetries of the nonlinear Fokker-Planck equation based on weighted Tsallis entropy.
- Author
-
PRIPOAE, CRISTINA-LILIANA, HIRICA, IULIA-ELENA, PRIPOAE, GABRIEL- TEODOR, and PREDA, VASILE
- Subjects
-
FOKKER-Planck equation, NONLINEAR equations, ENTROPY, LYAPUNOV functions, SYMMETRY
- Abstract
We determine the nonlinear Fokker-Planck equation in one dimension, based on a weighted Tsallis entropy and we derive its associated Lie symmetries. The corresponding Lyapunov functions and Bregman divergences are also found, together with a formula linking the drift function, the diffusion function and a diffusion constant. We solve the MaxEnt problem associated to the weighted Tsallis entropy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
36. Lie Symmetries of the Nonlinear Fokker-Planck Equation Based on Weighted Kaniadakis Entropy.
- Author
-
Hirica, Iulia-Elena, Pripoae, Cristina-Liliana, Pripoae, Gabriel-Teodor, and Preda, Vasile
- Subjects
-
FOKKER-Planck equation, NONLINEAR equations, ENTROPY, SYMMETRY, PARTIAL differential equations
- Abstract
The paper studies the Lie symmetries of the nonlinear Fokker-Planck equation in one dimension, which are associated to the weighted Kaniadakis entropy. In particular, the Lie symmetries of the nonlinear diffusive equation, associated to the weighted Kaniadakis entropy, are found. The MaxEnt problem associated to the weighted Kaniadakis entropy is given a complete solution, together with the thermodynamic relations which extend the known ones from the non-weighted case. Several different, but related, arguments point out a subtle dichotomous behavior of the Kaniadakis constant k, distinguishing between the cases k ∈ (−1, 1) and k = ±1. By comparison, the Lie symmetries of the NFPEs based on Tsallis q-entropies point out six "exceptional" cases, for q = 1/2, q = 3/2, q = 4/3, q = 7/3, q = 2, and q = 3. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
37. On Loss Functions and Regret Bounds for Multi-Category Classification.
- Author
-
Tan, Zhiqiang and Zhang, Xinwei
- Subjects
-
CONVEX functions, CLASSIFICATION, ENTROPY
- Abstract
We develop new approaches in multi-class settings for constructing loss functions and establishing corresponding regret bounds with respect to the zero-one or cost-weighted classification loss. We provide new general representations of losses by deriving inverse mappings from a concave generalized entropy to a loss through a convex dissimilarity function related to the multi-distribution $f$-divergence. This approach is then applied to study both hinge-like losses and proper scoring rules. In the first case, we derive new hinge-like convex losses, which are tighter extensions outside the probability simplex than related hinge-like losses and geometrically simpler with fewer non-differentiable edges. We also establish a classification regret bound in general for all losses with the same generalized entropy as the zero-one loss, thereby substantially extending and improving existing results. In the second case, we identify new sets of multi-class proper scoring rules through different types of dissimilarity functions and reveal interesting relationships between various composite losses currently in use. We also establish new classification regret bounds in general for multi-class proper scoring rules and, as applications, provide simple meaningful regret bounds for two specific sets of proper scoring rules. These results generalize, for the first time, previous two-class regret bounds to multi-class settings. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
38. A multidimensional objective prior distribution from a scoring rule.
- Author
-
Antoniano-Villalobos, Isadora, Villa, Cristiano, and Walker, Stephen G.
- Subjects
-
MARGINAL distributions, A priori
- Abstract
The construction of objective priors is, at best, challenging for multidimensional parameter spaces. A common practice is to assume independence and set up the joint prior as the product of marginal distributions obtained via "standard" objective methods, such as Jeffreys or reference priors. However, the assumption of independence a priori is not always reasonable, and whether it can be viewed as strictly objective is still open to discussion. In this paper, by extending a previously proposed objective approach based on scoring rules for the one dimensional case, we propose a novel objective prior for multidimensional parameter spaces which yields a dependence structure. The proposed prior has the appealing property of being proper and does not depend on the chosen model; only on the parameter space considered.
• The highlights of the paper are a new proper class of prior for multivariate parameters.
• It is shown to have good properties. A new version of the Lasso penalty is introduced. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. The extended Bregman divergence and parametric estimation.
- Author
-
Basak, Sancharee and Basu, Ayanendranath
- Subjects
-
DIVERGENCE theorem, PARAMETRIC modeling, KALMAN filtering, DATA modeling, INFERENCE (Logic)
- Abstract
Minimization of suitable statistical distances (between the data and model densities) is a useful technique in the field of robust inference. Apart from the class of ϕ-divergences, the Bregman divergence is extensively used for this purpose. However, since the data density must have a linear presence in the term involving both the data and model densities in this structure, several useful divergences cannot be captured by the usual Bregman form. We provide an extension of the Bregman divergence by considering an exponent of the density function as the argument rather than the density itself. Many useful divergences, that are not ordinarily Bregman divergences, can be accommodated within this extended description. Using this formulation, one can develop many new families of divergences which may be useful in robust inference. In particular, through an application of this extension, we propose the new class of the GSB divergence family. We explore the applicability of the minimum GSB divergence estimator in discrete parametric models. Simulation studies and real data examples are provided to demonstrate the performance of the estimator and to substantiate the theory developed. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
40. A Spectral Estimation Framework for Phase Retrieval via Bregman Divergence Minimization.
- Author
-
Yonel, Bariscan and Yazici, Birsen
- Subjects
SAMPLING (Process), QUADRATIC forms, ART techniques, AUTHORSHIP, MINIMAL design, INFORMATION theory
- Abstract
In this paper, we develop a novel framework to guide the design of initial estimates for phase retrieval given measurements realized from an arbitrary forward model. In particular, we propose a general formalism for spectral initialization as an approximate Bregman loss minimization procedure in the range of the lifted forward model, such that a search over rank-1, positive semidefinite matrices is tractable via the synthesis of a quadratic form using elementwise processing of the intensity-only measurements. Via the Bregman loss approach, we move beyond the Euclidean-sense, alignment-based similarity measure between phaseless measurements that is inherent in state-of-the-art techniques in the literature, in favor of information-theory-inspired divergence metrics over the positive reals. We derive spectral methods that perform approximate minimization of the Kullback-Leibler and Itakura-Saito divergences over phaseless measurements by using elementwise sample processing functions designed under a minimal distortion principle. Our formulation relates and extends existing results on model-dependent design of optimal sample processing functions to a model-independent sense of metric-based optimality. Numerical simulations confirm the effectiveness of our approach in imaging problems using synthetic and real data sets. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
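For context, the sketch below is the textbook spectral initialisation that the paper in entry 40 generalises: the leading eigenvector of a weighted sample covariance of the sensing vectors, with the intensity measurements used unprocessed rather than passed through Bregman-loss-designed sample processing functions. Dimensions and names are illustrative.

import numpy as np

def spectral_initializer(A, y):
    """Classical spectral initialisation for phase retrieval from
    intensity-only measurements y_i = |<a_i, x>|^2: leading eigenvector of
    (1/m) * sum_i y_i a_i a_i^T, scaled to the estimated signal norm."""
    m, n = A.shape
    Y = (A.T * y) @ A / m                    # (1/m) * sum_i y_i a_i a_i^T
    eigvals, eigvecs = np.linalg.eigh(Y)
    v = eigvecs[:, -1]                       # leading eigenvector
    return v * np.sqrt(max(y.mean(), 0.0))   # E[y] ~ ||x||^2 for Gaussian a_i

rng = np.random.default_rng(0)
n, m = 50, 1000
x = rng.normal(size=n)
A = rng.normal(size=(m, n))
y = (A @ x) ** 2
x0 = spectral_initializer(A, y)
corr = abs(x0 @ x) / (np.linalg.norm(x0) * np.linalg.norm(x))
print("cosine similarity to the true signal (up to sign):", corr)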
41. Logarithmic Divergences: Geometry and Interpretation of Curvature
- Author
-
Wong, Ting-Kam Leonard, Yang, Jiaowen, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Nielsen, Frank, editor, and Barbaresco, Frédéric, editor
- Published
- 2019
- Full Text
- View/download PDF
42. Geometry and Fixed-Rate Quantization in Riemannian Metric Spaces Induced by Separable Bregman Divergences
- Author
-
Gomes-Gonçalves, Erika, Gzyl, Henryk, Nielsen, Frank, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Nielsen, Frank, editor, and Barbaresco, Frédéric, editor
- Published
- 2019
- Full Text
- View/download PDF
43. The Bregman Chord Divergence
- Author
-
Nielsen, Frank, Nock, Richard, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Nielsen, Frank, editor, and Barbaresco, Frédéric, editor
- Published
- 2019
- Full Text
- View/download PDF
44. Gradient Methods for Problems with Inexact Model of the Objective
- Author
-
Stonyakin, Fedor S., Dvinskikh, Darina, Dvurechensky, Pavel, Kroshnin, Alexey, Kuznetsova, Olesya, Agafonov, Artem, Gasnikov, Alexander, Tyurin, Alexander, Uribe, César A., Pasechnyuk, Dmitry, Artamonov, Sergei, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Pandu Rangan, C., Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Khachay, Michael, editor, Kochetov, Yury, editor, and Pardalos, Panos, editor
- Published
- 2019
- Full Text
- View/download PDF
45. Active learning by query by committee with robust divergences
- Author
-
Hino, Hideitsu and Eguchi, Shinto
- Published
- 2023
- Full Text
- View/download PDF
46. A variational perspective on accelerated methods in optimization
- Author
-
Wibisono, Andre, Wilson, Ashia C, and Jordan, Michael I
- Subjects
Applied Mathematics, Mathematical Sciences, convex optimization, accelerated methods, Lagrangian framework, Bregman divergence, mirror descent
- Abstract
Accelerated gradient methods play a central role in optimization, achieving optimal rates in many settings. Although many generalizations and extensions of Nesterov's original acceleration method have been proposed, it is not yet clear what is the natural scope of the acceleration concept. In this paper, we study accelerated methods from a continuous-time perspective. We show that there is a Lagrangian functional that we call the Bregman Lagrangian, which generates a large class of accelerated methods in continuous time, including (but not limited to) accelerated gradient descent, its non-Euclidean extension, and accelerated higher-order gradient methods. We show that the continuous-time limit of all of these methods corresponds to traveling the same curve in spacetime at different speeds. From this perspective, Nesterov's technique and many of its generalizations can be viewed as a systematic way to go from the continuous-time curves generated by the Bregman Lagrangian to a family of discrete-time accelerated algorithms.
- Published
- 2016
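The Bregman-Lagrangian framework of entry 46 recovers, in the Euclidean special case h(x) = ||x||^2/2, Nesterov's accelerated gradient method. The sketch below merely contrasts that familiar discrete-time method with plain gradient descent on an ill-conditioned quadratic; it is not the general continuous-time flow, and the test problem is arbitrary.

import numpy as np

def gd(grad, x0, step, iters):
    x = x0.copy()
    for _ in range(iters):
        x -= step * grad(x)
    return x

def nesterov(grad, x0, step, iters):
    """Nesterov's accelerated gradient method with momentum (k-1)/(k+2)."""
    x, x_prev = x0.copy(), x0.copy()
    for k in range(1, iters + 1):
        y = x + (k - 1) / (k + 2) * (x - x_prev)   # extrapolation step
        x_prev = x
        x = y - step * grad(y)
    return x

# Ill-conditioned quadratic f(x) = 0.5 * x^T diag(d) x, minimiser at 0.
d = np.logspace(0, 3, 50)
grad = lambda x: d * x
x0 = np.ones(50)
step = 1.0 / d.max()
print("gradient descent:", np.linalg.norm(gd(grad, x0, step, 500)))
print("Nesterov:        ", np.linalg.norm(nesterov(grad, x0, step, 500)))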
47. Bayesian Risk With Bregman Loss: A Cramér–Rao Type Bound and Linear Estimation.
- Author
-
Dytso, Alex, Fauß, Michael, and Poor, H. Vincent
- Subjects
-
MEAN square algorithms, PROBABILITY density function
- Abstract
A general class of Bayesian lower bounds when the underlying loss function is a Bregman divergence is demonstrated. This class can be considered as an extension of the Weinstein–Weiss family of bounds for the mean squared error and relies on finding a variational characterization of Bayesian risk. This approach allows for the derivation of a version of the Cramér–Rao bound that is specific to a given Bregman divergence. This new generalization of the Cramér–Rao bound reduces to the classical one when the loss function is taken to be the Euclidean norm. In order to evaluate the effectiveness of the new lower bounds, the paper also develops upper bounds on Bayesian risk, which are based on optimal linear estimators. The effectiveness of the new bound is evaluated in the Poisson noise setting. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
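One fact underlying the Bregman-loss setting of entry 47 is that, for any Bregman divergence with the estimate in its second argument, the Bayes-optimal point estimate is the posterior mean, whatever the generator. The sketch below checks this numerically for the Itakura-Saito divergence using Monte Carlo samples standing in for a posterior; the gamma distribution and the search grid are arbitrary choices.

import numpy as np

def itakura_saito(x, a):
    """Bregman divergence generated by phi(t) = -log t (Itakura-Saito)."""
    return x / a - np.log(x / a) - 1.0

rng = np.random.default_rng(0)
x = rng.gamma(shape=2.0, scale=1.5, size=100_000)   # stand-in 'posterior' samples

# Grid-search the point estimate a minimising the expected Bregman loss
# E[D_phi(X, a)]; the minimiser should match the (posterior) mean.
grid = np.linspace(1.0, 5.0, 801)
risk = np.array([itakura_saito(x, a).mean() for a in grid])
print("argmin of the Bregman risk:", grid[risk.argmin()])
print("posterior mean:            ", x.mean())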
48. Bregman subgradient extragradient method with monotone self-adjustment stepsize for solving pseudo-monotone variational inequalities and fixed point problems.
- Author
-
Jolaoso, Lateef Olakunle and Aphane, Maggie
- Subjects
SUBGRADIENT methods, VARIATIONAL inequalities (Mathematics), HILBERT space, PRIOR learning
- Abstract
Using the concept of Bregman divergence, we propose a new subgradient extragradient method for approximating a common solution of pseudo-monotone, Lipschitz continuous variational inequalities and fixed point problems in real Hilbert spaces. The algorithm uses a new self-adjustment rule for selecting the stepsize in each iteration, and we prove a strong convergence result for the generated sequence without prior knowledge of the Lipschitz constant. Finally, we provide some numerical examples to illustrate the performance and accuracy of our algorithm in finite and infinite dimensional spaces. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
49. Myopic robust index tracking with Bregman divergence.
- Author
-
Penev, S., Shevchenko, P. V., and Wu, W.
- Subjects
-
NONLINEAR equations, ROBUST optimization, ASSET management, PORTFOLIO performance
- Abstract
Index tracking is a popular form of asset management. Typically, a quadratic function is used to define the tracking error of a portfolio, and the look-back approach is applied to solve the index tracking problem. We argue that a forward-looking approach is more suitable, whereby the tracking error is expressed as an expectation of a function of the difference between the returns of the index and of the portfolio. We also assume that there is model uncertainty in the distribution of the assets; hence a robust version of the optimization problem needs to be adopted. We use the Bregman divergence to describe the deviation between the nominal and actual (true) distribution of the components of the index. In this scenario, we derive the optimal robust index tracking portfolio in semi-analytical form as the solution of a system of nonlinear equations. Several numerical results are presented that allow us to compare the performance of this robust portfolio with the optimal non-robust portfolio. We show that, especially during market downturns, the robust portfolio can be very advantageous. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
50. Prediction in Riemannian metrics derived from divergence functions.
- Author
-
Gzyl, Henryk
- Subjects
-
RIEMANNIAN metric, RANDOM variables, FORECASTING, EUCLIDEAN distance
- Abstract
Divergence functions are interesting and widely used discrepancy measures. Even though they are not true distances, we can use them to measure how separated two points are. Curiously enough, when they are applied to random variables they lead to a notion of best predictor that coincides with the usual best predictor under Euclidean distance. From a divergence function we can derive a Riemannian metric which leads to a true distance between random variables, and the best predictors in this metric do not coincide with their Euclidean counterparts. It is the purpose of this paper to explicitly determine the best predictors in the derived metric, compare them to the estimators in divergence, and obtain the sample estimators of the best predictors. Along the way we obtain results that relate approximations in divergence to approximations in the metric derived from it. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF