20 results for "Ryu, J. Jon"
Search Results
2. Are Uncertainty Quantification Capabilities of Evidential Deep Learning a Mirage?
- Author
-
Shen, Maohao, Ryu, J. Jon, Ghosh, Soumya, Bu, Yuheng, Sattigeri, Prasanna, Das, Subhro, and Wornell, Gregory W.
- Subjects
Computer Science - Machine Learning, Statistics - Machine Learning - Abstract
This paper questions the effectiveness of a modern predictive uncertainty quantification approach, called evidential deep learning (EDL), in which a single neural network model is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite their perceived strong empirical performance on downstream tasks, a line of recent studies by Bengs et al. identifies limitations of the existing methods and concludes that their learned epistemic uncertainties are unreliable, e.g., in that they are non-vanishing even with infinite data. Building on and sharpening such analysis, we 1) provide a sharper understanding of the asymptotic behavior of a wide class of EDL methods by unifying various objective functions; 2) reveal that the EDL methods can be better interpreted as out-of-distribution detection algorithms based on energy-based models; and 3) conduct extensive ablation studies to better assess their empirical effectiveness with real-world datasets. Through all these analyses, we conclude that even when EDL methods are empirically effective on downstream tasks, this occurs despite their poor uncertainty quantification capabilities. Our investigation suggests that incorporating model uncertainty can help EDL methods faithfully quantify uncertainties and further improve performance on representative downstream tasks, albeit at the cost of additional computational complexity., Comment: 29 pages, 12 figures
- Published
- 2024
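The Dirichlet-based "vacuity" uncertainty that EDL methods rely on, and that this paper scrutinizes, can be sketched in a few lines (an illustrative sketch only; the function names are ours, and the K / sum(alpha) vacuity formula comes from the broader EDL literature rather than this particular paper):

```python
def dirichlet_vacuity(alpha):
    """Evidential uncertainty ('vacuity') of a Dirichlet over K classes:
    u = K / sum(alpha). High when the total evidence is low."""
    K = len(alpha)
    return K / sum(alpha)

def mean_prediction(alpha):
    """The predictive distribution is the Dirichlet mean."""
    s = sum(alpha)
    return [a / s for a in alpha]

# Low evidence: near-uniform Dirichlet -> high vacuity.
u_low = dirichlet_vacuity([1.0, 1.0, 1.0])       # 1.0
# High evidence concentrated on class 0 -> low vacuity.
u_high = dirichlet_vacuity([100.0, 1.0, 1.0])    # ~0.03
print(u_low, u_high)
```

The paper's critique is that the learned concentration parameters need not grow with the sample size, so this quantity can stay bounded away from zero even in the infinite-data limit.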
3. Gambling-Based Confidence Sequences for Bounded Random Vectors
- Author
-
Ryu, J. Jon and Wornell, Gregory W.
- Subjects
Statistics - Methodology, Computer Science - Information Theory, Mathematics - Statistics Theory - Abstract
A confidence sequence (CS) is a sequence of confidence sets that contains a target parameter of an underlying stochastic process at any time step with high probability. This paper proposes a new approach to constructing CSs for means of bounded multivariate stochastic processes using a general gambling framework, extending the recently established coin toss framework for bounded random processes. The proposed gambling framework provides a general recipe for constructing CSs for categorical and probability-vector-valued observations, as well as for general bounded multidimensional observations through a simple reduction. This paper specifically explores the use of the mixture portfolio, akin to Cover's universal portfolio, in the proposed framework and investigates the properties of the resulting CSs. Simulations demonstrate the tightness of these confidence sequences compared to existing methods. When applied to the sampling-without-replacement setting for finite categorical data, it is shown that the resulting CS based on a universal gambling strategy is provably tighter than that of the posterior-prior ratio martingale proposed by Waudby-Smith and Ramdas., Comment: 14 pages, 3 figures. ICML 2024
- Published
- 2024
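A minimal sketch of the gambling idea behind such confidence sequences, for scalar observations in [0, 1] (our simplification: two fixed-fraction bettors on a grid of candidate means, rather than the paper's universal and mixture portfolios):

```python
def confidence_sequence(xs, alpha=0.05, lam=0.5, grid=101):
    """Grid sketch of a gambling-based confidence sequence for the mean of
    [0, 1]-valued observations. Against each candidate mean m, two bettors
    wager a fixed fraction lam of their wealth on the next outcome being
    above or below m. If m is the true mean, each wealth process is a
    nonnegative martingale, so by Ville's inequality it exceeds 1/alpha
    with probability at most alpha; candidates that both bettors fail to
    rule out form a confidence sequence (level 1 - 2*alpha by union bound)."""
    threshold = 1.0 / alpha
    kept = []
    for i in range(grid):
        m = i / (grid - 1)
        w_up = w_dn = 1.0
        alive = True
        for x in xs:
            w_up *= 1.0 + lam * (x - m)   # profits when the mean exceeds m
            w_dn *= 1.0 - lam * (x - m)   # profits when the mean is below m
            if max(w_up, w_dn) >= threshold:
                alive = False             # m ruled out at some time step
                break
        if alive:
            kept.append(m)
    return min(kept), max(kept)

lo, hi = confidence_sequence([0.5] * 200)
print(lo, hi)   # a tight interval around the true mean 0.5
```

The paper replaces the fixed betting fraction with a mixture (universal) portfolio, which adapts to the data and yields provably tighter sequences.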
4. Operator SVD with Neural Networks via Nested Low-Rank Approximation
- Author
-
Ryu, J. Jon, Xu, Xiangxiang, Erol, H. S. Melihcan, Bu, Yuheng, Zheng, Lizhong, and Wornell, Gregory W.
- Subjects
Computer Science - Machine Learning, Mathematics - Numerical Analysis, Statistics - Machine Learning - Abstract
Computing the eigenvalue decomposition (EVD) of a given linear operator, or finding its leading eigenvalues and eigenfunctions, is a fundamental task in many machine learning and scientific computing problems. For high-dimensional eigenvalue problems, training neural networks to parameterize the eigenfunctions is considered a promising alternative to the classical numerical linear algebra techniques. This paper proposes a new optimization framework based on the low-rank approximation characterization of a truncated singular value decomposition, accompanied by new techniques called nesting for learning the top-$L$ singular values and singular functions in the correct order. The proposed method promotes the desired orthogonality in the learned functions implicitly and efficiently via an unconstrained optimization formulation, which is easy to solve with off-the-shelf gradient-based optimization algorithms. We demonstrate the effectiveness of the proposed optimization framework for use cases in computational physics and machine learning., Comment: 36 pages, 7 figures. ICML 2024. Almost identical to the conference version, except a few updates for fixing typos and mistakes
- Published
- 2024
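The low-rank approximation characterization underlying the method can be sketched without the nesting technique (a toy setup of our own: numpy, a fixed 4x4 matrix with known spectrum, and plain gradient descent are illustrative choices, not the paper's experiments):

```python
import numpy as np

rng = np.random.default_rng(0)
# Build a 4x4 matrix with known singular values 5, 4, 1, 0.5.
P, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))
A = P @ np.diag([5.0, 4.0, 1.0, 0.5]) @ Q.T
L = 2

# Unconstrained low-rank objective: min_{U,V} ||A - U V^T||_F^2.
# By Eckart-Young, any global minimizer spans the top-L singular
# subspaces, so gradient descent needs no orthogonality constraints.
U = 0.01 * rng.standard_normal((4, L))
V = 0.01 * rng.standard_normal((4, L))
lr = 0.01
for _ in range(10000):
    R = A - U @ V.T                               # residual
    U, V = U + lr * R @ V, V + lr * R.T @ U       # simultaneous GD step

loss = np.linalg.norm(A - U @ V.T) ** 2
print(loss)   # approaches the tail energy 1.0^2 + 0.5^2 = 1.25
```

The paper's nesting technique additionally recovers the individual top-$L$ singular values and functions in the correct order, which this plain sketch does not attempt.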
5. On Confidence Sequences for Bounded Random Processes via Universal Gambling Strategies
- Author
-
Ryu, J. Jon and Bhatt, Alankrita
- Subjects
Mathematics - Probability, Computer Science - Information Theory, Statistics - Methodology - Abstract
This paper considers the problem of constructing a confidence sequence, which is a sequence of confidence intervals that hold uniformly over time, for estimating the mean of bounded real-valued random processes. This paper revisits the gambling-based approach established in the recent literature from a natural two-horse race perspective, and demonstrates new properties of the resulting algorithm induced by Cover (1991)'s universal portfolio. The main result of this paper is a new algorithm based on a mixture of lower bounds, which closely approximates the performance of Cover's universal portfolio with constant per-round time complexity. A higher-order generalization of a lower bound on a logarithmic function in (Fan et al., 2015), which is developed as a key technique for the proposed algorithm, may be of independent interest., Comment: 20 pages, 3 figures. IEEE Transactions on Information Theory (to appear)
- Published
- 2022
- Full Text
6. An Information-Theoretic Proof of the Kac–Bernstein Theorem
- Author
-
Ryu, J. Jon and Kim, Young-Han
- Subjects
Computer Science - Information Theory, Mathematics - Probability - Abstract
A short, information-theoretic proof of the Kac–Bernstein theorem, which is stated as follows, is presented: For any independent random variables $X$ and $Y$, if $X+Y$ and $X-Y$ are independent, then $X$ and $Y$ are normally distributed., Comment: 4 pages
- Published
- 2022
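The theorem is easy to probe numerically (a hedged Monte Carlo sketch; the fourth-moment dependence witness below is our choice for illustration, not part of the paper's information-theoretic proof):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 400_000

def sq_cov(x, y):
    """Cov((X+Y)^2, (X-Y)^2): a simple witness of dependence between the
    sum and the difference. For iid mean-zero X, Y it equals
    2*(E[X^4] - 3*sigma^4), which vanishes exactly when the fourth moment
    is Gaussian-like -- consistent with the Kac-Bernstein theorem."""
    s, d = (x + y) ** 2, (x - y) ** 2
    return np.mean(s * d) - np.mean(s) * np.mean(d)

# Gaussian X, Y: X+Y and X-Y are genuinely independent.
x, y = rng.standard_normal(n), rng.standard_normal(n)
c_normal = sq_cov(x, y)          # ~ 0

# Uniform X, Y on [-1, 1]: X+Y and X-Y are uncorrelated but dependent.
u, v = rng.uniform(-1, 1, n), rng.uniform(-1, 1, n)
c_unif = sq_cov(u, v)            # ~ -4/15, clearly nonzero
print(c_normal, c_unif)
```

Note that Cov(X+Y, X-Y) itself is zero for any iid pair, which is why a higher-moment statistic is needed to see the dependence in the non-Gaussian case.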
7. Minimax Optimal Algorithms with Fixed-$k$-Nearest Neighbors
- Author
-
Ryu, J. Jon and Kim, Young-Han
- Subjects
Mathematics - Statistics Theory, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Information Theory, Computer Science - Machine Learning, Statistics - Machine Learning - Abstract
This paper presents how to perform minimax optimal classification, regression, and density estimation based on fixed-$k$ nearest neighbor (NN) searches. We consider a distributed learning scenario, in which a massive dataset is split into smaller groups, where the $k$-NNs are found for a query point with respect to each subset of data. We propose optimal rules to aggregate the fixed-$k$-NN information for classification, regression, and density estimation that achieve minimax optimal rates for the respective problems. We show that the distributed algorithm with a fixed $k$ over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor under some regularity conditions. Roughly speaking, distributed $k$-NN rules with $M$ groups have performance comparable to standard $\Theta(kM)$-NN rules even for fixed $k$., Comment: 65 pages, 5 figures. The manuscript has been revised from scratch compared to the previous version. Notable differences include (1) updated statements and corrected proofs for classification and regression, (2) explicit statements and proofs for distance-selective rules, and (3) new analogous estimators for density estimation
- Published
- 2022
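The distributed rule described above can be sketched as follows (a toy 1-D version with a plain majority vote; the paper's optimal aggregation rules are more refined, e.g., distance-selective, so this is illustrative only):

```python
def knn_labels(group, query, k=1):
    """Labels of the k nearest neighbors of `query` within one group
    of (point, label) pairs."""
    by_dist = sorted(group, key=lambda p: abs(p[0] - query))
    return [label for _, label in by_dist[:k]]

def distributed_knn_classify(groups, query, k=1):
    """Distributed rule sketched in the abstract: run a fixed-k NN search
    inside each of the M groups, then aggregate the pooled k*M labels
    (here, by simple majority vote)."""
    votes = []
    for g in groups:
        votes.extend(knn_labels(g, query, k))
    return max(set(votes), key=votes.count)

# Toy 1-D data: label 0 below 0.5, label 1 at or above, split into 3 groups.
data = [(x / 10, int(x / 10 >= 0.5)) for x in range(10)]
groups = [data[0::3], data[1::3], data[2::3]]
print(distributed_knn_classify(groups, query=0.34, k=1))
print(distributed_knn_classify(groups, query=0.66, k=1))
```

The abstract's point is that pooling fixed-$k$ searches across $M$ groups behaves like a single $\Theta(kM)$-NN rule, so $k$ never needs to grow with the sample size.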
8. On Universal Portfolios with Continuous Side Information
- Author
-
Bhatt, Alankrita, Ryu, J. Jon, and Kim, Young-Han
- Subjects
Computer Science - Information Theory - Abstract
A new portfolio selection strategy that adapts to a continuous side-information sequence is presented, with a universal wealth guarantee against a class of state-constant rebalanced portfolios with respect to a state function that maps each side-information symbol to a finite set of states. In particular, given that a state function belongs to a collection of functions of finite Natarajan dimension, the proposed strategy is shown to achieve, asymptotically to first order in the exponent, the same wealth as the best state-constant rebalanced portfolio with respect to the best state function, chosen in hindsight from the observed market. This result can be viewed as an extension of the seminal work of Cover and Ordentlich (1996) that assumes a single state function.
- Published
- 2022
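For context, the wealth of a single-state constant rebalanced portfolio, the benchmark class that the state-constant portfolios generalize per state, can be computed in a few lines (the toy two-asset market and the grid search are our illustrative choices):

```python
def crp_wealth(b, market):
    """Wealth of a constantly rebalanced portfolio b over a sequence of
    price-relative vectors: each period, wealth multiplies by <b, x_t>."""
    w = 1.0
    for x in market:
        w *= sum(bi * xi for bi, xi in zip(b, x))
    return w

# Cover's classic two-asset market: cash vs. a stock that alternately
# halves and doubles. Neither asset grows on its own, but rebalancing does.
market = [(1.0, 0.5), (1.0, 2.0)] * 20

# Best CRP in hindsight, found by a grid search over the cash weight.
best = max(
    (crp_wealth((b / 100, 1 - b / 100), market), b / 100) for b in range(101)
)
print(best)   # the half-cash, half-stock CRP strictly beats both pure assets
```

A universal portfolio achieves, to first order in the exponent, the wealth of this best-in-hindsight benchmark; the paper extends the benchmark to the best state-constant CRP under the best state function.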
9. Parameter-free Online Linear Optimization with Side Information via Universal Coin Betting
- Author
-
Ryu, J. Jon, Bhatt, Alankrita, and Kim, Young-Han
- Subjects
Computer Science - Information Theory, Computer Science - Machine Learning, Mathematics - Optimization and Control - Abstract
A class of parameter-free online linear optimization algorithms is proposed that harnesses the structure of an adversarial sequence by adapting to some side information. These algorithms combine the reduction technique of Orabona and Pál (2016) for adapting coin betting algorithms to online linear optimization with universal compression techniques in information theory for incorporating sequential side information into coin betting. Concrete examples are studied in which the side information has a tree structure and consists of quantized values of the previous symbols of the adversarial sequence, including fixed-order and variable-order Markov cases. By modifying the context-tree weighting technique of Willems, Shtarkov, and Tjalkens (1995), the proposed algorithm is further refined to achieve the best performance over all adaptive algorithms with tree-structured side information of a given maximum order in a computationally efficient manner., Comment: 23 pages, 5 figures, to appear at AISTATS 2022
- Published
- 2022
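The coin-betting primitive underlying these reductions can be sketched via the Krichevsky-Trofimov bettor (a simplified scalar version; the paper's contribution is the side-information and context-tree layer on top, which is not shown here):

```python
def kt_bettor(outcomes):
    """Krichevsky-Trofimov-style coin bettor: at round t, bet a signed
    fraction of current wealth equal to (sum of past outcomes) / t.
    Parameter-free OLO reductions (Orabona & Pal, 2016) convert such a
    wealth process into a learner with no tunable learning rate."""
    wealth, total = 1.0, 0.0
    for t, c in enumerate(outcomes):   # outcomes c in {-1, +1}
        beta = total / (t + 1)         # betting fraction, in [-1, 1]
        wealth *= 1.0 + beta * c
        total += c
    return wealth

# A biased "coin" makes the bettor's wealth grow exponentially,
# while a perfectly alternating one only costs it a polynomial factor.
print(kt_bettor([+1] * 50), kt_bettor([+1, -1] * 25))
```

Wealth growth against predictable sequences is exactly what side information buys: the paper's context-tree machinery runs such bettors per context, so structure in the adversarial sequence is converted into wealth, and hence into regret guarantees.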
10. Feedback Recurrent AutoEncoder
- Author
-
Yang, Yang, Sautière, Guillaume, Ryu, J. Jon, and Cohen, Taco S
- Subjects
Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing, Statistics - Machine Learning - Abstract
In this work, we propose a new recurrent autoencoder architecture, termed Feedback Recurrent AutoEncoder (FRAE), for online compression of sequential data with temporal dependency. The recurrent structure of FRAE is designed to efficiently extract the redundancy along the time dimension and allows a compact discrete representation of the data to be learned. We demonstrate its effectiveness in speech spectrogram compression. Specifically, we show that the FRAE, paired with a powerful neural vocoder, can produce high-quality speech waveforms at a low, fixed bitrate. We further show that by adding a learned prior for the latent space and using an entropy coder, we can achieve an even lower variable bitrate.
- Published
- 2019
11. Learning with Succinct Common Representation Based on Wyner's Common Information
- Author
-
Ryu, J. Jon, Choi, Yoojin, Kim, Young-Han, El-Khamy, Mostafa, and Lee, Jungwon
- Subjects
Computer Science - Machine Learning, Statistics - Machine Learning - Abstract
A new bimodal generative model is proposed for generating conditional and joint samples, accompanied by a training method that learns a succinct bottleneck representation. The proposed model, dubbed the variational Wyner model, is designed based on two classical problems in network information theory -- distributed simulation and channel synthesis -- in which Wyner's common information arises as the fundamental limit on the succinctness of the common representation. The model is trained by minimizing the symmetric Kullback–Leibler divergence between the variational and model distributions, with regularization terms for common information, reconstruction consistency, and latent space matching, which is carried out via an adversarial density ratio estimation technique. The utility of the proposed approach is demonstrated through experiments for joint and conditional generation with synthetic and real-world datasets, as well as a challenging zero-shot image retrieval task., Comment: 20 pages, 7 figures
- Published
- 2019
12. Nearest neighbor density functional estimation from inverse Laplace transform
- Author
-
Ryu, J. Jon, Ganguly, Shouvik, Kim, Young-Han, Noh, Yung-Kyun, and Lee, Daniel D.
- Subjects
Mathematics - Statistics Theory, Computer Science - Information Theory, Statistics - Methodology, Statistics - Machine Learning - Abstract
A new approach to $L_2$-consistent estimation of a general density functional using $k$-nearest neighbor distances is proposed, where the functional under consideration is in the form of the expectation of some function $f$ of the densities at each point. The estimator is designed to be asymptotically unbiased, using the convergence of the normalized volume of a $k$-nearest neighbor ball to a Gamma distribution in the large-sample limit, and naturally involves the inverse Laplace transform of a scaled version of the function $f$. Some instantiations of the proposed estimator recover existing $k$-nearest neighbor based estimators of Shannon and Rényi entropies and Kullback–Leibler and Rényi divergences, and discover new consistent estimators for many other functionals such as logarithmic entropies and divergences. The $L_2$-consistency of the proposed estimator is established for a broad class of densities for general functionals, and the convergence rate in mean squared error is established as a function of the sample size for smooth, bounded densities., Comment: 43 pages, 4 figures. IEEE Transactions on Information Theory (to appear)
- Published
- 2018
- Full Text
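The Gamma-limit fact at the heart of the estimator's design is easy to check by simulation (our toy 1-D setup with a fixed query point; numpy and the specific constants are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials, k = 2000, 3000, 1

# The estimator design rests on the fact that n * f(x) * Vol(k-NN ball at x)
# converges to a Gamma(k, 1) distribution. In 1-D with f uniform on [0, 1]
# and query x = 0.5, the ball volume is twice the k-th NN distance and f = 1.
stats = np.empty(trials)
for t in range(trials):
    pts = rng.uniform(0.0, 1.0, n)
    r_k = np.sort(np.abs(pts - 0.5))[k - 1]   # k-th nearest-neighbor distance
    stats[t] = n * 2.0 * r_k

print(stats.mean(), stats.var())   # Gamma(1,1) = Exp(1): mean 1, variance 1
```

Inverting this Gamma limit via the Laplace transform is what lets a single recipe produce unbiased plug-in corrections for many functionals $f$ at once.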
13. On Confidence Sequences for Bounded Random Processes via Universal Gambling Strategies
- Author
-
Ryu, J. Jon and Bhatt, Alankrita
- Abstract
This paper considers the problem of constructing a confidence sequence, which is a sequence of confidence intervals that hold uniformly over time, for estimating the mean of bounded real-valued random processes. This paper revisits the gambling-based approach established in the recent literature from a natural two-horse race perspective, and demonstrates new properties of the resulting algorithm induced by Cover (1991)’s universal portfolio. The main result of this paper is a new algorithm based on a mixture of lower bounds, which closely approximates the performance of Cover’s universal portfolio with constant per-round time complexity. A higher-order generalization of a lower bound on a logarithmic function in (Fan et al., 2015), which is developed as a key technique for the proposed algorithm, may be of independent interest.
- Published
- 2024
- Full Text
14. Improved Evidential Deep Learning via a Mixture of Dirichlet Distributions
- Author
-
Ryu, J. Jon, Shen, Maohao, Ghosh, Soumya, Bu, Yuheng, Sattigeri, Prasanna, Das, Subhro, and Wornell, Gregory W.
- Abstract
This paper explores a modern predictive uncertainty estimation approach, called evidential deep learning (EDL), in which a single neural network model is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite their strong empirical performance, recent studies by Bengs et al. identify a fundamental pitfall of the existing methods: the learned epistemic uncertainty may not vanish even in the infinite-sample limit. We corroborate the observation by providing a unifying view of a class of widely used objectives from the literature. Our analysis reveals that the EDL methods essentially train a meta distribution by minimizing a certain divergence measure between the distribution and a sample-size-independent target distribution, resulting in spurious epistemic uncertainty. Grounded in theoretical principles, we propose learning a consistent target distribution by modeling it with a mixture of Dirichlet distributions and learning via variational inference. Afterward, a final meta distribution model distills the learned uncertainty from the target model. Experimental results across various uncertainty-based downstream tasks demonstrate the superiority of our proposed method, and illustrate the practical implications arising from the consistency and inconsistency of learned epistemic uncertainty., Comment: 18 pages, 5 figures
- Published
- 2024
15. One-Nearest-Neighbor Search is All You Need for Minimax Optimal Regression and Classification
- Author
-
Ryu, J. Jon and Kim, Young-Han
- Abstract
Recently, Qiao, Duan, and Cheng (2019) proposed a distributed nearest-neighbor classification method, in which a massive dataset is split into smaller groups, each processed with a $k$-nearest-neighbor classifier, and the final class label is predicted by a majority vote among these groupwise class labels. This paper shows that the distributed algorithm with $k=1$ over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor under some regularity conditions, for both regression and classification problems. Roughly speaking, distributed 1-nearest-neighbor rules with $M$ groups have performance comparable to standard $\Theta(M)$-nearest-neighbor rules. In the analysis, alternative rules with a refined aggregation method are proposed and shown to attain exact minimax optimal rates., Comment: 27 pages, 2 figures. The technical content is almost identical compared to 2202.02464v1. We fixed a few typos, updated some part of the manuscript, added a missing reference, and revised some proofs to improve readability
- Published
- 2022
16. On the Role of Eigendecomposition in Kernel Embedding
- Author
-
Ryu, J. Jon, Huang, Jiun-Ting, and Kim, Young-Han
- Published
- 2021
- Full Text
17. Nearest Neighbor Density Functional Estimation From Inverse Laplace Transform.
- Author
-
Ryu, J. Jon, Ganguly, Shouvik, Kim, Young-Han, Noh, Yung-Kyun, and Lee, Daniel D.
- Subjects
RENYI'S entropy, GAMMA distributions, DENSITY functionals, LAPLACE distribution, DENSITY, BAYES' estimation, NEIGHBORS - Abstract
A new approach to $L_{2}$-consistent estimation of a general density functional using $k$-nearest neighbor distances is proposed, where the functional under consideration is in the form of the expectation of some function $f$ of the densities at each point. The estimator is designed to be asymptotically unbiased, using the convergence of the normalized volume of a $k$-nearest neighbor ball to a Gamma distribution in the large-sample limit, and naturally involves the inverse Laplace transform of a scaled version of the function $f$. Some instantiations of the proposed estimator recover existing $k$-nearest neighbor based estimators of Shannon and Rényi entropies and Kullback–Leibler and Rényi divergences, and discover new consistent estimators for many other functionals such as logarithmic entropies and divergences. The $L_{2}$-consistency of the proposed estimator is established for a broad class of densities for general functionals, and the convergence rate in mean squared error is established as a function of the sample size for smooth, bounded densities.
- Published
- 2022
- Full Text
18. Feedback Recurrent Autoencoder
- Author
-
Yang, Yang, Sautiere, Guillaume, Ryu, J. Jon, and Cohen, Taco S
- Published
- 2020
- Full Text
19. Variations on a Theme by Liu, Cuff, and Verdú: The Power of Posterior Sampling
- Author
-
Bhatt, Alankrita, Huang, Jiun-Ting, Kim, Young-Han, Ryu, J. Jon, and Sen, Pinar
- Published
- 2018
- Full Text
20. Monte Carlo Methods for Randomized Likelihood Decoding
- Author
-
Bhatt, Alankrita, Huang, Jiun-Ting, Kim, Young-Han, Ryu, J. Jon, and Sen, Pinar
- Published
- 2018
- Full Text
Discovery Service for Jio Institute Digital Library