Author: "Ryu, J. Jon" / Database: OAIster - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Ryu, J. Jon"' showing total 11 results

Start Over Author "Ryu, J. Jon" Database OAIster

11 results on '"Ryu, J. Jon"'

1. Improved Evidential Deep Learning via a Mixture of Dirichlet Distributions

Author: Ryu, J. Jon, Shen, Maohao, Ghosh, Soumya, Bu, Yuheng, Sattigeri, Prasanna, Das, Subhro, Wornell, Gregory W., Ryu, J. Jon, Shen, Maohao, Ghosh, Soumya, Bu, Yuheng, Sattigeri, Prasanna, Das, Subhro, and Wornell, Gregory W.
Abstract: This paper explores a modern predictive uncertainty estimation approach, called evidential deep learning (EDL), in which a single neural network model is trained to learn a meta distribution over the predictive distribution by minimizing a specific objective function. Despite their strong empirical performance, recent studies by Bengs et al. identify a fundamental pitfall of the existing methods: the learned epistemic uncertainty may not vanish even in the infinite-sample limit. We corroborate the observation by providing a unifying view of a class of widely used objectives from the literature. Our analysis reveals that the EDL methods essentially train a meta distribution by minimizing a certain divergence measure between the distribution and a sample-size-independent target distribution, resulting in spurious epistemic uncertainty. Grounded in theoretical principles, we propose learning a consistent target distribution by modeling it with a mixture of Dirichlet distributions and learning via variational inference. Afterward, a final meta distribution model distills the learned uncertainty from the target model. Experimental results across various uncertainty-based downstream tasks demonstrate the superiority of our proposed method, and illustrate the practical implications arising from the consistency and inconsistency of learned epistemic uncertainty., Comment: 18 pages, 5 figures
Published: 2024

2. Gambling-Based Confidence Sequences for Bounded Random Vectors

Author: Ryu, J. Jon, Wornell, Gregory W., Ryu, J. Jon, and Wornell, Gregory W.
Abstract: A confidence sequence (CS) is a sequence of confidence sets that contains a target parameter of an underlying stochastic process at any time step with high probability. This paper proposes a new approach to constructing CSs for means of bounded multivariate stochastic processes using a general gambling framework, extending the recently established coin toss framework for bounded random processes. The proposed gambling framework provides a general recipe for constructing CSs for categorical and probability-vector-valued observations, as well as for general bounded multidimensional observations through a simple reduction. This paper specifically explores the use of the mixture portfolio, akin to Cover's universal portfolio, in the proposed framework and investigates the properties of the resulting CSs. Simulations demonstrate the tightness of these confidence sequences compared to existing methods. When applied to the sampling without-replacement setting for finite categorical data, it is shown that the resulting CS based on a universal gambling strategy is provably tighter than that of the posterior-prior ratio martingale proposed by Waudby-Smith and Ramdas., Comment: 17 pages, 3 figures
Published: 2024

3. Operator SVD with Neural Networks via Nested Low-Rank Approximation

Author: Ryu, J. Jon, Xu, Xiangxiang, Erol, H. S. Melihcan, Bu, Yuheng, Zheng, Lizhong, Wornell, Gregory W., Ryu, J. Jon, Xu, Xiangxiang, Erol, H. S. Melihcan, Bu, Yuheng, Zheng, Lizhong, and Wornell, Gregory W.
Abstract: Computing eigenvalue decomposition (EVD) of a given linear operator, or finding its leading eigenvalues and eigenfunctions, is a fundamental task in many machine learning and scientific computing problems. For high-dimensional eigenvalue problems, training neural networks to parameterize the eigenfunctions is considered as a promising alternative to the classical numerical linear algebra techniques. This paper proposes a new optimization framework based on the low-rank approximation characterization of a truncated singular value decomposition, accompanied by new techniques called nesting for learning the top-$L$ singular values and singular functions in the correct order. The proposed method promotes the desired orthogonality in the learned functions implicitly and efficiently via an unconstrained optimization formulation, which is easy to solve with off-the-shelf gradient-based optimization algorithms. We demonstrate the effectiveness of the proposed optimization framework for use cases in computational physics and machine learning., Comment: 44 pages, 7 figures
Published: 2024

4. On Confidence Sequences for Bounded Random Processes via Universal Gambling Strategies

Author: Ryu, J. Jon, Bhatt, Alankrita, Ryu, J. Jon, and Bhatt, Alankrita
Abstract: This paper considers the problem of constructing a confidence sequence, which is a sequence of confidence intervals that hold uniformly over time, for estimating the mean of bounded real-valued random processes. This paper revisits the gambling-based approach established in the recent literature from a natural \emph{two-horse race} perspective, and demonstrates new properties of the resulting algorithm induced by Cover (1991)'s universal portfolio. The main result of this paper is a new algorithm based on a mixture of lower bounds, which closely approximates the performance of Cover's universal portfolio with constant per-round time complexity. A higher-order generalization of a lower bound on a logarithmic function in (Fan et al., 2015), which is developed as a key technique for the proposed algorithm, may be of independent interest., Comment: 32 pages, 3 figures
Published: 2022

5. An Information-Theoretic Proof of the Kac--Bernstein Theorem

Author: Ryu, J. Jon, Kim, Young-Han, Ryu, J. Jon, and Kim, Young-Han
Abstract: A short, information-theoretic proof of the Kac--Bernstein theorem, which is stated as follows, is presented: For any independent random variables $X$ and $Y$, if $X+Y$ and $X-Y$ are independent, then $X$ and $Y$ are normally distributed., Comment: 4 pages
Published: 2022

6. One-Nearest-Neighbor Search is All You Need for Minimax Optimal Regression and Classification

Author: Ryu, J. Jon, Kim, Young-Han, Ryu, J. Jon, and Kim, Young-Han
Abstract: Recently, Qiao, Duan, and Cheng~(2019) proposed a distributed nearest-neighbor classification method, in which a massive dataset is split into smaller groups, each processed with a $k$-nearest-neighbor classifier, and the final class label is predicted by a majority vote among these groupwise class labels. This paper shows that the distributed algorithm with $k=1$ over a sufficiently large number of groups attains a minimax optimal error rate up to a multiplicative logarithmic factor under some regularity conditions, for both regression and classification problems. Roughly speaking, distributed 1-nearest-neighbor rules with $M$ groups has a performance comparable to standard $\Theta(M)$-nearest-neighbor rules. In the analysis, alternative rules with a refined aggregation method are proposed and shown to attain exact minimax optimal rates., Comment: 27 pages, 2 figures. The technical content is almost identical compared to 2202.02464v1. We fixed a few typos, updated some part of the manuscript, added a missing reference, and revised some proofs to improve readability
Published: 2022

7. Parameter-free Online Linear Optimization with Side Information via Universal Coin Betting

Author: Ryu, J. Jon, Bhatt, Alankrita, Kim, Young-Han, Ryu, J. Jon, Bhatt, Alankrita, and Kim, Young-Han
Abstract: A class of parameter-free online linear optimization algorithms is proposed that harnesses the structure of an adversarial sequence by adapting to some side information. These algorithms combine the reduction technique of Orabona and P{\'a}l (2016) for adapting coin betting algorithms for online linear optimization with universal compression techniques in information theory for incorporating sequential side information to coin betting. Concrete examples are studied in which the side information has a tree structure and consists of quantized values of the previous symbols of the adversarial sequence, including fixed-order and variable-order Markov cases. By modifying the context-tree weighting technique of Willems, Shtarkov, and Tjalkens (1995), the proposed algorithm is further refined to achieve the best performance over all adaptive algorithms with tree-structured side information of a given maximum order in a computationally efficient manner., Comment: 23 pages, 5 figures, to appear at AISTATS 2022
Published: 2022

8. On Universal Portfolios with Continuous Side Information

Author: Bhatt, Alankrita, Ryu, J. Jon, Kim, Young-Han, Bhatt, Alankrita, Ryu, J. Jon, and Kim, Young-Han
Abstract: A new portfolio selection strategy that adapts to a continuous side-information sequence is presented, with a universal wealth guarantee against a class of state-constant rebalanced portfolios with respect to a state function that maps each side-information symbol to a finite set of states. In particular, given that a state function belongs to a collection of functions of finite Natarajan dimension, the proposed strategy is shown to achieve, asymptotically to first order in the exponent, the same wealth as the best state-constant rebalanced portfolio with respect to the best state function, chosen in hindsight from observed market. This result can be viewed as an extension of the seminal work of Cover and Ordentlich (1996) that assumes a single state function.
Published: 2022

9. Feedback Recurrent AutoEncoder

Author: Yang, Yang, Sautière, Guillaume, Ryu, J. Jon, Cohen, Taco S, Yang, Yang, Sautière, Guillaume, Ryu, J. Jon, and Cohen, Taco S
Abstract: In this work, we propose a new recurrent autoencoder architecture, termed Feedback Recurrent AutoEncoder (FRAE), for online compression of sequential data with temporal dependency. The recurrent structure of FRAE is designed to efficiently extract the redundancy along the time dimension and allows a compact discrete representation of the data to be learned. We demonstrate its effectiveness in speech spectrogram compression. Specifically, we show that the FRAE, paired with a powerful neural vocoder, can produce high-quality speech waveforms at a low, fixed bitrate. We further show that by adding a learned prior for the latent space and using an entropy coder, we can achieve an even lower variable bitrate.
Published: 2019

10. Learning with Succinct Common Representation Based on Wyner's Common Information

Author: Ryu, J. Jon, Choi, Yoojin, Kim, Young-Han, El-Khamy, Mostafa, Lee, Jungwon, Ryu, J. Jon, Choi, Yoojin, Kim, Young-Han, El-Khamy, Mostafa, and Lee, Jungwon
Abstract: A new bimodal generative model is proposed for generating conditional and joint samples, accompanied with a training method with learning a succinct bottleneck representation. The proposed model, dubbed as the variational Wyner model, is designed based on two classical problems in network information theory -- distributed simulation and channel synthesis -- in which Wyner's common information arises as the fundamental limit on the succinctness of the common representation. The model is trained by minimizing the symmetric Kullback--Leibler divergence between variational and model distributions with regularization terms for common information, reconstruction consistency, and latent space matching terms, which is carried out via an adversarial density ratio estimation technique. The utility of the proposed approach is demonstrated through experiments for joint and conditional generation with synthetic and real-world datasets, as well as a challenging zero-shot image retrieval task., Comment: 20 pages, 7 figures
Published: 2019

11. Nearest neighbor density functional estimation from inverse Laplace transform

Author: Ryu, J. Jon, Ganguly, Shouvik, Kim, Young-Han, Noh, Yung-Kyun, Lee, Daniel D., Ryu, J. Jon, Ganguly, Shouvik, Kim, Young-Han, Noh, Yung-Kyun, and Lee, Daniel D.
Abstract: A new approach to $L_2$-consistent estimation of a general density functional using $k$-nearest neighbor distances is proposed, where the functional under consideration is in the form of the expectation of some function $f$ of the densities at each point. The estimator is designed to be asymptotically unbiased, using the convergence of the normalized volume of a $k$-nearest neighbor ball to a Gamma distribution in the large-sample limit, and naturally involves the inverse Laplace transform of a scaled version of the function $f.$ Some instantiations of the proposed estimator recover existing $k$-nearest neighbor based estimators of Shannon and R\'enyi entropies and Kullback--Leibler and R\'enyi divergences, and discover new consistent estimators for many other functionals such as logarithmic entropies and divergences. The $L_2$-consistency of the proposed estimator is established for a broad class of densities for general functionals, and the convergence rate in mean squared error is established as a function of the sample size for smooth, bounded densities., Comment: 43 pages, 4 figures. IEEE Transactions on Information Theory (to appear)
Published: 2018
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

11 results on '"Ryu, J. Jon"'

1. Improved Evidential Deep Learning via a Mixture of Dirichlet Distributions

2. Gambling-Based Confidence Sequences for Bounded Random Vectors

3. Operator SVD with Neural Networks via Nested Low-Rank Approximation

4. On Confidence Sequences for Bounded Random Processes via Universal Gambling Strategies

5. An Information-Theoretic Proof of the Kac--Bernstein Theorem

6. One-Nearest-Neighbor Search is All You Need for Minimax Optimal Regression and Classification

7. Parameter-free Online Linear Optimization with Side Information via Universal Coin Betting

8. On Universal Portfolios with Continuous Side Information

9. Feedback Recurrent AutoEncoder

10. Learning with Succinct Common Representation Based on Wyner's Common Information

11. Nearest neighbor density functional estimation from inverse Laplace transform

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Publication Year Range

Publication Type

Database

11 results on '"Ryu, J. Jon"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources