Author: "Tan, Vincent" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Tan, Vincent"' showing total 1,459 results

1. Optimal Multi-Objective Best Arm Identification with Fixed Confidence

Author: Chen, Zhirui, Karthik, P. N., Chee, Yeow Meng, and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: We consider a multi-armed bandit setting with finitely many arms, in which each arm yields an $M$-dimensional vector reward upon selection. We assume that the reward of each dimension (a.k.a. {\em objective}) is generated independently of the others. The best arm of any given objective is the arm with the largest component of mean corresponding to the objective. The end goal is to identify the best arm of {\em every} objective in the shortest (expected) time subject to an upper bound on the probability of error (i.e., fixed-confidence regime). We establish a problem-dependent lower bound on the limiting growth rate of the expected stopping time, in the limit of vanishing error probabilities. This lower bound, we show, is characterised by a max-min optimisation problem that is computationally expensive to solve at each time step. We propose an algorithm that uses the novel idea of {\em surrogate proportions} to sample the arms at each time step, eliminating the need to solve the max-min optimisation problem at each step. We demonstrate theoretically that our algorithm is asymptotically optimal. In addition, we provide extensive empirical studies to substantiate the efficiency of our algorithm. While existing works on pure exploration with multi-objective multi-armed bandits predominantly focus on {\em Pareto frontier identification}, our work fills the gap in the literature by conducting a formal investigation of the multi-objective best arm identification problem., Comment: Accepted to AISTATS 2025
Published: 2025

2. Enhancing Multi-Text Long Video Generation Consistency without Tuning: Time-Frequency Analysis, Prompt Alignment, and Theory

Author: Li, Xingyao, Zhang, Fengzhuo, Pan, Jiachun, Hou, Yunlong, Tan, Vincent Y. F., and Yang, Zhuoran
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Despite the considerable progress achieved in the long video generation problem, there is still significant room to improve the consistency of the videos, particularly in terms of smoothness and transitions between scenes. We address these issues to enhance the consistency and coherence of videos generated with either single or multiple prompts. We propose the Time-frequency based temporal Attention Reweighting Algorithm (TiARA), which meticulously edits the attention score matrix based on the Discrete Short-Time Fourier Transform. Our method is supported by a theoretical guarantee, the first-of-its-kind for frequency-based methods in diffusion models. For videos generated by multiple prompts, we further investigate key factors affecting prompt interpolation quality and propose PromptBlend, an advanced prompt interpolation pipeline. The efficacy of our proposed method is validated via extensive experimental results, exhibiting consistent and impressive improvements over baseline methods. The code will be released upon acceptance., Comment: 34 pages, 11 figures
Published: 2024

3. p-Mean Regret for Stochastic Bandits

Author: Krishna, Anand, John, Philips George, Barik, Adarsh, and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Computer Science and Game Theory
Abstract: In this work, we extend the concept of the $p$-mean welfare objective from social choice theory (Moulin 2004) to study $p$-mean regret in stochastic multi-armed bandit problems. The $p$-mean regret, defined as the difference between the optimal mean among the arms and the $p$-mean of the expected rewards, offers a flexible framework for evaluating bandit algorithms, enabling algorithm designers to balance fairness and efficiency by adjusting the parameter $p$. Our framework encompasses both average cumulative regret and Nash regret as special cases. We introduce a simple, unified UCB-based algorithm (Explore-Then-UCB) that achieves novel $p$-mean regret bounds. Our algorithm consists of two phases: a carefully calibrated uniform exploration phase to initialize sample means, followed by the UCB1 algorithm of Auer, Cesa-Bianchi, and Fischer (2002). Under mild assumptions, we prove that our algorithm achieves a $p$-mean regret bound of $\tilde{O}\left(\sqrt{\frac{k}{T^{\frac{1}{2|p|}}}}\right)$ for all $p \leq -1$, where $k$ represents the number of arms and $T$ the time horizon. When $-1
Published: 2024
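
The two-phase structure described in the abstract above (uniform exploration, then UCB1) and the $p$-mean regret it is measured by can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the Bernoulli arms, the round-robin exploration schedule, the `explore_rounds` parameter, and the function names `p_mean` and `explore_then_ucb` are all assumptions introduced here, and the paper's calibrated exploration length is not reproduced.

```python
import math
import random


def p_mean(values, p):
    """Generalized (power) mean of positive values; p = 0 gives the geometric mean."""
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


def explore_then_ucb(means, horizon, explore_rounds, seed=0):
    """Hypothetical sketch: uniform exploration followed by UCB1 on Bernoulli arms.

    Returns the expected reward of the arm pulled in each round.
    """
    rng = random.Random(seed)
    k = len(means)
    # Assumption: every arm is pulled at least once before UCB1 starts.
    explore_rounds = max(explore_rounds, k)
    counts = [0] * k
    sums = [0.0] * k
    expected = []
    for t in range(horizon):
        if t < explore_rounds:
            # Round-robin stands in for the uniform exploration phase.
            arm = t % k
        else:
            # UCB1 index: empirical mean plus sqrt(2 ln t / n_a).
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t + 1) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        expected.append(means[arm])
    return expected


if __name__ == "__main__":
    arm_means = [0.3, 0.5, 0.8]  # illustrative values; must be positive for p < 0
    pulls = explore_then_ucb(arm_means, horizon=2000, explore_rounds=60)
    # p-mean regret: optimal mean minus the p-mean of the expected rewards.
    print(max(arm_means) - p_mean(pulls, p=-1))
```

Setting `p = 1` in `p_mean` gives the arithmetic mean (average regret) and `p = 0` the geometric mean (Nash regret), the two special cases the abstract mentions.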

4. Towards Understanding Why FixMatch Generalizes Better Than Supervised Learning

Author: Li, Jingyang, Pan, Jiachun, Tan, Vincent Y. F., Toh, Kim-Chuan, and Zhou, Pan
Subjects: Computer Science - Machine Learning
Abstract: Semi-supervised learning (SSL), exemplified by FixMatch (Sohn et al., 2020), has shown significant generalization advantages over supervised learning (SL), particularly in the context of deep neural networks (DNNs). However, it is still unclear, from a theoretical standpoint, why FixMatch-like SSL algorithms generalize better than SL on DNNs. In this work, we present the first theoretical justification for the enhanced test accuracy observed in FixMatch-like SSL applied to DNNs by taking convolutional neural networks (CNNs) on classification tasks as an example. Our theoretical analysis reveals that the semantic feature learning processes in FixMatch and SL are rather different. In particular, FixMatch learns all the discriminative features of each semantic class, while SL only randomly captures a subset of features due to the well-known lottery ticket hypothesis. Furthermore, we show that our analysis framework can be applied to other FixMatch-like SSL methods, e.g., FlexMatch, FreeMatch, Dash, and SoftMatch. Inspired by our theoretical analysis, we develop an improved variant of FixMatch, termed Semantic-Aware FixMatch (SA-FixMatch). Experimental results corroborate our theoretical findings and the enhanced generalization capability of SA-FixMatch.
Published: 2024

5. On the Convergence of (Stochastic) Gradient Descent for Kolmogorov--Arnold Networks

Author: Gao, Yihang and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Optimization and Control
Abstract: Kolmogorov--Arnold Networks (KANs), a recently proposed neural network architecture, have gained significant attention in the deep learning community, due to their potential as a viable alternative to multi-layer perceptrons (MLPs) and their broad applicability to various scientific tasks. Empirical investigations demonstrate that KANs optimized via stochastic gradient descent (SGD) are capable of achieving near-zero training loss in various machine learning (e.g., regression, classification, and time series forecasting, etc.) and scientific tasks (e.g., solving partial differential equations). In this paper, we provide a theoretical explanation for the empirical success by conducting a rigorous convergence analysis of gradient descent (GD) and SGD for two-layer KANs in solving both regression and physics-informed tasks. For regression problems, we establish using the neural tangent kernel perspective that GD achieves global linear convergence of the objective function when the hidden dimension of KANs is sufficiently large. We further extend these results to SGD, demonstrating a similar global convergence in expectation. Additionally, we analyze the global convergence of GD and SGD for physics-informed KANs, which unveils additional challenges due to the more complex loss structure. This is the first work establishing the global convergence guarantees for GD and SGD applied to optimize KANs and physics-informed KANs.
Published: 2024

6. Almost Minimax Optimal Best Arm Identification in Piecewise Stationary Linear Bandits

Author: Hou, Yunlong, Tan, Vincent Y. F., and Zhong, Zixin
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: We propose a {\em novel} piecewise stationary linear bandit (PSLB) model, where the environment randomly samples a context from an unknown probability distribution at each changepoint, and the quality of an arm is measured by its return averaged over all contexts. The contexts and their distribution, as well as the changepoints are unknown to the agent. We design {\em Piecewise-Stationary $\varepsilon$-Best Arm Identification$^+$} (PS$\varepsilon$BAI$^+$), an algorithm that is guaranteed to identify an $\varepsilon$-optimal arm with probability $\ge 1-\delta$ and with a minimal number of samples. PS$\varepsilon$BAI$^+$ consists of two subroutines, PS$\varepsilon$BAI and {\sc Na\"ive $\varepsilon$-BAI} (N$\varepsilon$BAI), which are executed in parallel. PS$\varepsilon$BAI actively detects changepoints and aligns contexts to facilitate the arm identification process. When PS$\varepsilon$BAI and N$\varepsilon$BAI are utilized judiciously in parallel, PS$\varepsilon$BAI$^+$ is shown to have a finite expected sample complexity. By proving a lower bound, we show the expected sample complexity of PS$\varepsilon$BAI$^+$ is optimal up to a logarithmic factor. We compare PS$\varepsilon$BAI$^+$ to baseline algorithms using numerical experiments which demonstrate its efficiency. Both our analytical and numerical results corroborate that the efficacy of PS$\varepsilon$BAI$^+$ is due to the delicate change detection and context alignment procedures embedded in PS$\varepsilon$BAI., Comment: 69 pages. Accepted to NeurIPS 2024
Published: 2024

7. Stochastic Bandits for Egalitarian Assignment

Author: Lim, Eugene, Tan, Vincent Y. F., and Soh, Harold
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We study EgalMAB, an egalitarian assignment problem in the context of stochastic multi-armed bandits. In EgalMAB, an agent is tasked with assigning a set of users to arms. At each time step, the agent must assign exactly one arm to each user such that no two users are assigned to the same arm. Subsequently, each user obtains a reward drawn from the unknown reward distribution associated with its assigned arm. The agent's objective is to maximize the minimum expected cumulative reward among all users over a fixed horizon. This problem has applications in areas such as fairness in job and resource allocations, among others. We design and analyze a UCB-based policy EgalUCB and establish upper bounds on the cumulative regret. In complement, we establish an almost-matching policy-independent impossibility result.
Published: 2024

8. Best Arm Identification with Minimal Regret

Author: Yang, Junwen, Tan, Vincent Y. F., and Jin, Tianyuan
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: Motivated by real-world applications that necessitate responsible experimentation, we introduce the problem of best arm identification (BAI) with minimal regret. This innovative variant of the multi-armed bandit problem elegantly amalgamates two of its most ubiquitous objectives: regret minimization and BAI. More precisely, the agent's goal is to identify the best arm with a prescribed confidence level $\delta$, while minimizing the cumulative regret up to the stopping time. Focusing on single-parameter exponential families of distributions, we leverage information-theoretic techniques to establish an instance-dependent lower bound on the expected cumulative regret. Moreover, we present an intriguing impossibility result that underscores the tension between cumulative regret and sample complexity in fixed-confidence BAI. Complementarily, we design and analyze the Double KL-UCB algorithm, which achieves asymptotic optimality as the confidence level tends to zero. Notably, this algorithm employs two distinct confidence bounds to guide arm selection in a randomized manner. Our findings elucidate a fresh perspective on the inherent connections between regret minimization and BAI., Comment: Preprint
Published: 2024

9. A General Framework for Clustering and Distribution Matching with Bandit Feedback

Author: Yavas, Recep Can, Huang, Yuqi, Tan, Vincent Y. F., and Scarlett, Jonathan
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning, 68T05, I.2.6
Abstract: We develop a general framework for clustering and distribution matching problems with bandit feedback. We consider a $K$-armed bandit model where some subset of $K$ arms is partitioned into $M$ groups. Within each group, the random variable associated with each arm follows the same distribution on a finite alphabet. At each time step, the decision maker pulls an arm and observes its outcome from the random variable associated with that arm. Subsequent arm pulls depend on the history of arm pulls and their outcomes. The decision maker has no knowledge of the distributions of the arms or the underlying partitions. The task is to devise an online algorithm to learn the underlying partition of arms with the least number of arm pulls on average and with an error probability not exceeding a pre-determined value~$\delta$. Several existing problems fall under our general framework, including finding $M$ pairs of arms, odd arm identification, and $N$-ary clustering of $K$ arms. We derive a non-asymptotic lower bound on the average number of arm pulls for any online algorithm with an error probability not exceeding $\delta$. Furthermore, we develop a computationally-efficient online algorithm based on the Track-and-Stop method and Frank--Wolfe algorithm, and show that the average number of arm pulls of our algorithm asymptotically matches that of the lower bound. Our refined analysis also uncovers a novel bound on the speed at which the average number of arm pulls of our algorithm converges to the fundamental limit as $\delta$ vanishes., Comment: 24 pages
Published: 2024

10. A Sample Efficient Alternating Minimization-based Algorithm For Robust Phase Retrieval

Author: Barik, Adarsh, Krishna, Anand, and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning
Abstract: In this work, we study the robust phase retrieval problem where the task is to recover an unknown signal $\theta^* \in \mathbb{R}^d$ in the presence of potentially arbitrarily corrupted magnitude-only linear measurements. We propose an alternating minimization approach that incorporates an oracle solver for a non-convex optimization problem as a subroutine. Our algorithm guarantees convergence to $\theta^*$ and provides an explicit polynomial dependence of the convergence rate on the fraction of corrupted measurements. We then provide an efficient construction of the aforementioned oracle under a sparse arbitrary outliers model and offer valuable insights into the geometric properties of the loss landscape in phase retrieval with corrupted measurements. Our proposed oracle avoids the need for computationally intensive spectral initialization, using a simple gradient descent algorithm with a constant step size and random initialization instead. Additionally, our overall algorithm achieves nearly linear sample complexity, $\mathcal{O}(d \, \mathrm{polylog}(d))$.
Published: 2024

11. LEARN: An Invex Loss for Outlier Oblivious Robust Online Optimization

Author: Barik, Adarsh, Krishna, Anand, and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control
Abstract: We study a robust online convex optimization framework, where an adversary can introduce outliers by corrupting loss functions in an arbitrary number of rounds k, unknown to the learner. Our focus is on a novel setting allowing unbounded domains and large gradients for the losses without relying on a Lipschitz assumption. We introduce the Log Exponential Adjusted Robust and iNvex (LEARN) loss, a non-convex (invex) robust loss function to mitigate the effects of outliers and develop a robust variant of the online gradient descent algorithm by leveraging the LEARN loss. We establish tight regret guarantees (up to constants), in a dynamic setting, with respect to the uncorrupted rounds and conduct experiments to validate our theory. Furthermore, we present a unified analysis framework for developing online optimization algorithms for non-convex (invex) losses, utilizing it to provide regret bounds with respect to the LEARN loss, which may be of independent interest.
Published: 2024

12. A Mirror Descent-Based Algorithm for Corruption-Tolerant Distributed Gradient Descent

Author: Wang, Shuche and Tan, Vincent Y. F.
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: Distributed gradient descent algorithms have come to the fore in modern machine learning, especially in parallelizing the handling of large datasets that are distributed across several workers. However, scant attention has been paid to analyzing the behavior of distributed gradient descent algorithms in the presence of adversarial corruptions instead of random noise. In this paper, we formulate a novel problem in which adversarial corruptions are present in a distributed learning system. We show how to use ideas from (lazy) mirror descent to design a corruption-tolerant distributed optimization algorithm. Extensive convergence analysis for (strongly) convex loss functions is provided for different choices of the stepsize. We carefully optimize the stepsize schedule to accelerate the convergence of the algorithm, while at the same time amortizing the effect of the corruption over time. Experiments based on linear regression, support vector classification, and softmax classification on the MNIST dataset corroborate our theoretical findings.
Published: 2024

13. Influence Maximization via Graph Neural Bandits

Author: Feng, Yuting, Tan, Vincent Y. F., and Cautis, Bogdan
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval, Computer Science - Social and Information Networks
Abstract: We consider a ubiquitous scenario in the study of Influence Maximization (IM), in which there is limited knowledge about the topology of the diffusion network. We set the IM problem in a multi-round diffusion campaign, aiming to maximize the number of distinct users that are influenced. Leveraging the capability of bandit algorithms to effectively balance the objectives of exploration and exploitation, as well as the expressivity of neural networks, our study explores the application of neural bandit algorithms to the IM problem. We propose the framework IM-GNB (Influence Maximization with Graph Neural Bandits), where we provide an estimate of the users' probabilities of being influenced by influencers (also known as diffusion seeds). This initial estimate forms the basis for constructing both an exploitation graph and an exploration one. Subsequently, IM-GNB handles the exploration-exploitation tradeoff, by selecting seed nodes in real-time using Graph Convolutional Networks (GCN), in which the pre-estimated graphs are employed to refine the influencers' estimated rewards in each contextual setting. Through extensive experiments on two large real-world datasets, we demonstrate the effectiveness of IM-GNB compared with other baseline methods, significantly improving the spread outcome of such diffusion campaigns, when the underlying network is unknown., Comment: To appear at the 2024 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)
Published: 2024

14. Order-Optimal Instance-Dependent Bounds for Offline Reinforcement Learning with Preference Feedback

Author: Chen, Zhirui and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Information Theory, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: We consider offline reinforcement learning (RL) with preference feedback in which the implicit reward is a linear function of an unknown parameter. Given an offline dataset, our objective consists in ascertaining the optimal action for each state, with the ultimate goal of minimizing the {\em simple regret}. We propose an algorithm, \underline{RL} with \underline{L}ocally \underline{O}ptimal \underline{W}eights or {\sc RL-LOW}, which yields a simple regret of $\exp ( - \Omega(n/H) )$ where $n$ is the number of data samples and $H$ denotes an instance-dependent hardness quantity that depends explicitly on the suboptimality gap of each action. Furthermore, we derive a first-of-its-kind instance-dependent lower bound in offline RL with preference feedback. Interestingly, we observe that the lower and upper bounds on the simple regret match order-wise in the exponent, demonstrating order-wise optimality of {\sc RL-LOW}. In view of privacy considerations in practical applications, we also extend {\sc RL-LOW} to the setting of $(\varepsilon,\delta)$-differential privacy and show, somewhat surprisingly, that the hardness parameter $H$ is unchanged in the asymptotic regime as $n$ tends to infinity; this underscores the inherent efficiency of {\sc RL-LOW} in terms of preserving the privacy of the observed rewards. Given our focus on establishing instance-dependent bounds, our work stands in stark contrast to previous works that focus on establishing worst-case regrets for offline RL with preference feedback., Comment: Accepted to Models of Human Feedback for AI Alignment Workshop, ICML 2024
Published: 2024

15. MIMO Capacity Analysis and Channel Estimation for Electromagnetic Information Theory

Author: Zhu, Jieao, Tan, Vincent Y. F., and Dai, Linglong
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: Electromagnetic information theory (EIT) is an interdisciplinary subject that serves to integrate deterministic electromagnetic theory with stochastic Shannon's information theory. Existing EIT analysis operates in the continuous space domain, which is not aligned with the practical algorithms working in the discrete space domain. This mismatch leads to a significant difficulty in applying EIT methodologies to practical discrete space systems, which is called the discrete-continuous gap in this paper. To bridge this gap, we establish the discrete-continuous correspondence with a prolate spheroidal wave function (PSWF)-based ergodic capacity analysis framework. Specifically, we state and prove some discrete-continuous correspondence lemmas to establish a firm theoretical connection between discrete information-theoretic quantities and their continuous counterparts. With these lemmas, we apply the PSWF ergodic capacity bound to advanced MIMO architectures such as continuous-aperture MIMO (CAP-MIMO) and extremely large-scale MIMO (XL-MIMO). From this PSWF capacity bound, we discover the capacity saturation phenomenon both theoretically and empirically. Although the growth of MIMO performance is fundamentally limited in this EIT-based analysis framework, we reveal new opportunities in MIMO channel estimation by exploiting the EIT knowledge about the channel. Inspired by the PSWF capacity bound, we utilize continuous PSWFs to improve the pilot design of discrete MIMO channel estimators, which is called the PSWF channel estimator (PSWF-CE). Simulation results demonstrate improved performances of the proposed PSWF-CE, compared to traditional minimum mean squared error (MMSE) and compressed sensing-based estimators., Comment: Submitted to the IEEE TWC. In this paper, we established the discrete-continuous correspondence for electromagnetic information theory (EIT), thus enabling analytical tools in the continuous space domain to be applied to discrete space MIMO architectures. Simulation codes will be provided at http://oa.ee.tsinghua.edu.cn/dailinglong/publications/publications.html
Published: 2024

16. Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits

Author: Bian, Jie and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory
Abstract: The Indexed Minimum Empirical Divergence (IMED) algorithm is a highly effective approach that offers a stronger theoretical guarantee of the asymptotic optimality compared to the Kullback--Leibler Upper Confidence Bound (KL-UCB) algorithm for the multi-armed bandit problem. Additionally, it has been observed to empirically outperform UCB-based algorithms and Thompson Sampling. Despite its effectiveness, the generalization of this algorithm to contextual bandits with linear payoffs has remained elusive. In this paper, we present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms. We demonstrate that LinIMED provides a $\widetilde{O}(d\sqrt{T})$ upper regret bound where $d$ is the dimension of the context and $T$ is the time horizon. Furthermore, extensive empirical studies reveal that LinIMED and its variants outperform widely-used linear bandit algorithms such as LinUCB and Linear Thompson Sampling in some regimes., Comment: Accepted to the Transactions on Machine Learning Research (TMLR)
Published: 2024

17. LightningDrag: Lightning Fast and Accurate Drag-based Image Editing Emerging from Videos

Author: Shi, Yujun, Liew, Jun Hao, Yan, Hanshu, Tan, Vincent Y. F., and Feng, Jiashi
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Accuracy and speed are critical in image editing tasks. Pan et al. introduced a drag-based image editing framework that achieves pixel-level control using Generative Adversarial Networks (GANs). A flurry of subsequent studies enhanced this framework's generality by leveraging large-scale diffusion models. However, these methods often suffer from inordinately long processing times (exceeding 1 minute per edit) and low success rates. Addressing these issues head on, we present LightningDrag, a rapid approach enabling high quality drag-based image editing in ~1 second. Unlike most previous methods, we redefine drag-based editing as a conditional generation task, eliminating the need for time-consuming latent optimization or gradient-based guidance during inference. In addition, the design of our pipeline allows us to train our model on large-scale paired video frames, which contain rich motion information such as object translations, changing poses and orientations, zooming in and out, etc. By learning from videos, our approach can significantly outperform previous methods in terms of accuracy and consistency. Despite being trained solely on videos, our model generalizes well to perform local shape deformations not presented in the training data (e.g., lengthening of hair, twisting rainbows, etc.). Extensive qualitative and quantitative evaluations on benchmark datasets corroborate the superiority of our approach. The code and model will be released at https://github.com/magic-research/LightningDrag., Comment: Project page: https://lightning-drag.github.io/
Published: 2024

18. Adversarial Combinatorial Bandits with Switching Costs

Author: Dong, Yanyan and Tan, Vincent Y. F.
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We study the problem of adversarial combinatorial bandit with a switching cost $\lambda$ for a switch of each selected arm in each round, considering both the bandit feedback and semi-bandit feedback settings. In the oblivious adversarial case with $K$ base arms and time horizon $T$, we derive lower bounds for the minimax regret and design algorithms to approach them. To prove these lower bounds, we design stochastic loss sequences for both feedback settings, building on an idea from previous work in Dekel et al. (2014). The lower bound for bandit feedback is $ \tilde{\Omega}\big( (\lambda K)^{\frac{1}{3}} (TI)^{\frac{2}{3}}\big)$ while that for semi-bandit feedback is $ \tilde{\Omega}\big( (\lambda K I)^{\frac{1}{3}} T^{\frac{2}{3}}\big)$ where $I$ is the number of base arms in the combinatorial arm played in each round. To approach these lower bounds, we design algorithms that operate in batches by dividing the time horizon into batches to restrict the number of switches between actions. For the bandit feedback setting, where only the total loss of the combinatorial arm is observed, we introduce the Batched-Exp2 algorithm which achieves a regret upper bound of $\tilde{O}\big((\lambda K)^{\frac{1}{3}}T^{\frac{2}{3}}I^{\frac{4}{3}}\big)$ as $T$ tends to infinity. In the semi-bandit feedback setting, where all losses for the combinatorial arm are observed, we propose the Batched-BROAD algorithm which achieves a regret upper bound of $\tilde{O}\big( (\lambda K)^{\frac{1}{3}} (TI)^{\frac{2}{3}}\big)$., Comment: The work has been accepted in IEEE Transactions on Information Theory. https://ieeexplore.ieee.org/document/10487974
Published: 2024

19. Multi-Armed Bandits with Abstention

Author: Yang, Junwen, Jin, Tianyuan, and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: We introduce a novel extension of the canonical multi-armed bandit problem that incorporates an additional strategic element: abstention. In this enhanced framework, the agent is not only tasked with selecting an arm at each time step, but also has the option to abstain from accepting the stochastic instantaneous reward before observing it. When opting for abstention, the agent either suffers a fixed regret or gains a guaranteed reward. Given this added layer of complexity, we ask whether we can develop efficient algorithms that are both asymptotically and minimax optimal. We answer this question affirmatively by designing and analyzing algorithms whose regrets meet their corresponding information-theoretic lower bounds. Our results offer valuable quantitative insights into the benefits of the abstention option, laying the groundwork for further exploration in other online decision-making problems with such an option. Numerical results further corroborate our theoretical findings., Comment: Preprint
Published: 2024

20. Variable-Length Feedback Codes over Known and Unknown Channels with Non-vanishing Error Probabilities

Author: Yavas, Recep Can and Tan, Vincent Y. F.
Subjects: Computer Science - Information Theory, E.4
Abstract: We study variable-length feedback (VLF) codes with noiseless feedback for discrete memoryless channels. We present a novel non-asymptotic bound, which analyzes the average error probability and average decoding time of our modified Yamamoto--Itoh scheme. We then optimize the parameters of our code in the asymptotic regime where the average error probability $\epsilon$ remains a constant as the average decoding time $N$ approaches infinity. Our second-order achievability bound is an improvement of Polyanskiy et al.'s (2011) achievability bound. We also universalize our code by employing the empirical mutual information in our decoding metric and derive a second-order achievability bound for universal VLF codes. Our results for both VLF and universal VLF codes are extended to the additive white Gaussian noise channel with an average power constraint. The former yields an improvement over Truong and Tan's (2017) achievability bound. The proof of our results for universal VLF codes uses a refined version of the method of types and an asymptotic expansion from the nonlinear renewal theory literature., Comment: Submitted to ISIT 2024, 14 pages
Published: 2024

21. Towards Accurate Guided Diffusion Sampling through Symplectic Adjoint Method

Author: Pan, Jiachun, Yan, Hanshu, Liew, Jun Hao, Feng, Jiashi, and Tan, Vincent Y. F.
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: Training-free guided sampling in diffusion models leverages off-the-shelf pre-trained networks, such as an aesthetic evaluation model, to guide the generation process. Current training-free guided sampling algorithms obtain the guidance energy function based on a one-step estimate of the clean image. However, since the off-the-shelf pre-trained networks are trained on clean images, the one-step estimation procedure of the clean image may be inaccurate, especially in the early stages of the generation process in diffusion models. This causes the guidance in the early time steps to be inaccurate. To overcome this problem, we propose Symplectic Adjoint Guidance (SAG), which calculates the gradient guidance in two inner stages. Firstly, SAG estimates the clean image via $n$ function calls, where $n$ serves as a flexible hyperparameter that can be tailored to meet specific image quality requirements. Secondly, SAG uses the symplectic adjoint method to obtain the gradients accurately and efficiently in terms of the memory requirements. Extensive experiments demonstrate that SAG generates images with higher qualities compared to the baselines in both guided image and video generation tasks.
Published: 2023

22. Fixed-Budget Best-Arm Identification in Sparse Linear Bandits

Author: Yavas, Recep Can and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning, I.2.6
Abstract: We study the best-arm identification problem in sparse linear bandits under the fixed-budget setting. In sparse linear bandits, the unknown feature vector $\theta^*$ may be of large dimension $d$, but only a few, say $s \ll d$ of these features have non-zero values. We design a two-phase algorithm, Lasso and Optimal-Design- (Lasso-OD) based linear best-arm identification. The first phase of Lasso-OD leverages the sparsity of the feature vector by applying the thresholded Lasso introduced by Zhou (2009), which estimates the support of $\theta^*$ correctly with high probability using rewards from the selected arms and a judicious choice of the design matrix. The second phase of Lasso-OD applies the OD-LinBAI algorithm by Yang and Tan (2022) on that estimated support. We derive a non-asymptotic upper bound on the error probability of Lasso-OD by carefully choosing hyperparameters (such as Lasso's regularization parameter) and balancing the error probabilities of both phases. For fixed sparsity $s$ and budget $T$, the exponent in the error probability of Lasso-OD depends on $s$ but not on the dimension $d$, yielding a significant performance improvement for sparse and high-dimensional linear bandits. Furthermore, we show that Lasso-OD is almost minimax optimal in the exponent. Finally, we provide numerical examples to demonstrate the significant performance improvement over the existing algorithms for non-sparse linear bandits such as OD-LinBAI, BayesGap, Peace, LinearExploration, and GSE., Comment: 28 pages, Submitted to TMLR
Published: 2023

23. Learning Regularized Graphon Mean-Field Games with Unknown Graphons

Author: Zhang, Fengzhuo, Tan, Vincent Y. F., Wang, Zhaoran, and Yang, Zhuoran
Subjects: Computer Science - Computer Science and Game Theory, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We design and analyze reinforcement learning algorithms for Graphon Mean-Field Games (GMFGs). In contrast to previous works that require the precise values of the graphons, we aim to learn the Nash Equilibrium (NE) of the regularized GMFGs when the graphons are unknown. Our contributions are threefold. First, we propose the Proximal Policy Optimization for GMFG (GMFG-PPO) algorithm and show that it converges at a rate of $O(T^{-1/3})$ after $T$ iterations with an estimation oracle, improving on a previous work by Xie et al. (ICML, 2021). Second, using kernel embedding of distributions, we design efficient algorithms to estimate the transition kernels, reward functions, and graphons from sampled agents. Convergence rates are then derived when the positions of the agents are either known or unknown. Results for the combination of the optimization algorithm GMFG-PPO and the estimation algorithm are then provided. These algorithms are the first specifically designed for learning graphons from sampled agents. Finally, the efficacy of the proposed algorithms is corroborated through simulations. These simulations demonstrate that learning the unknown graphons reduces the exploitability effectively.
Published: 2023

24. Provable Benefits of Multi-task RL under Non-Markovian Decision Making Processes

Author: Huang, Ruiquan, Cheng, Yuan, Yang, Jing, Tan, Vincent, and Liang, Yingbin
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In multi-task reinforcement learning (RL) under Markov decision processes (MDPs), the presence of shared latent structures among multiple MDPs has been shown to yield significant benefits to the sample efficiency compared to single-task RL. In this paper, we investigate whether such a benefit can extend to more general sequential decision making problems, such as partially observable MDPs (POMDPs) and more general predictive state representations (PSRs). The main challenge here is that the large and complex model space makes it hard to identify what types of common latent structure of multi-task PSRs can reduce the model complexity and improve sample efficiency. To this end, we posit a joint model class for tasks and use the notion of $\eta$-bracketing number to quantify its complexity; this number also serves as a general metric to capture the similarity of tasks and thus determines the benefit of multi-task over single-task RL. We first study upstream multi-task learning over PSRs, in which all tasks share the same observation and action spaces. We propose a provably efficient algorithm UMT-PSR for finding near-optimal policies for all PSRs, and demonstrate that the advantage of multi-task learning manifests if the joint model class of PSRs has a smaller $\eta$-bracketing number compared to that of individual single-task learning. We also provide several example multi-task PSRs with small $\eta$-bracketing numbers, which reap the benefits of multi-task learning. We further investigate downstream learning, in which the agent needs to learn a new target task that shares some commonalities with the upstream tasks via a similarity constraint. By exploiting the learned PSRs from the upstream, we develop a sample-efficient algorithm that provably finds a near-optimal policy.
Published: 2023

25. Optimal Best Arm Identification with Fixed Confidence in Restless Bandits

Author: Karthik, P. N., Tan, Vincent Y. F., Mukherjee, Arpan, and Tajer, Ali
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: We study best arm identification in a restless multi-armed bandit setting with finitely many arms. The discrete-time data generated by each arm forms a homogeneous Markov chain taking values in a common, finite state space. The state transitions in each arm are captured by an ergodic transition probability matrix (TPM) that is a member of a single-parameter exponential family of TPMs. The real-valued parameters of the arm TPMs are unknown and belong to a given space. Given a function $f$ defined on the common state space of the arms, the goal is to identify the best arm -- the arm with the largest average value of $f$ evaluated under the arm's stationary distribution -- with the fewest number of samples, subject to an upper bound on the decision's error probability (i.e., the fixed-confidence regime). A lower bound on the growth rate of the expected stopping time is established in the asymptote of a vanishing error probability. Furthermore, a policy for best arm identification is proposed, and its expected stopping time is proved to have an asymptotic growth rate that matches the lower bound. It is demonstrated that tracking the long-term behavior of a certain Markov decision process and its state-action visitation proportions are the key ingredients in analyzing the converse and achievability bounds. It is shown that under every policy, the state-action visitation proportions satisfy a specific approximate flow conservation constraint and that these proportions match the optimal proportions dictated by the lower bound under any asymptotically optimal policy. The prior studies on best arm identification in restless bandits focus on independent observations from the arms, rested Markov arms, and restless Markov arms with known arm TPMs. In contrast, this work is the first to study best arm identification in restless bandits with unknown arm TPMs., Comment: Accepted to the IEEE Transactions on Information Theory
Published: 2023

26. Optimal Private Discrete Distribution Estimation with One-bit Communication

Author: Nam, Seung-Hyun, Tan, Vincent Y. F., and Lee, Si-Hyeon
Subjects: Computer Science - Information Theory, Computer Science - Cryptography and Security
Abstract: We consider a private discrete distribution estimation problem with one-bit communication constraint. The privacy constraints are imposed with respect to the local differential privacy and the maximal leakage. The estimation error is quantified by the worst-case mean squared error. We completely characterize the first-order asymptotics of this privacy-utility trade-off under the one-bit communication constraint for both types of privacy constraints by using ideas from local asymptotic normality and the resolution of a block design mechanism. These results demonstrate the optimal dependence of the privacy-utility trade-off under the one-bit communication constraint in terms of the parameters of the privacy constraint and the size of the alphabet of the discrete distribution., Comment: 13 pages, 5 figures, and 1 page of supplementary material
Published: 2023

27. Learning Regularized Monotone Graphon Mean-Field Games

Author: Zhang, Fengzhuo, Tan, Vincent Y. F., Wang, Zhaoran, and Yang, Zhuoran
Subjects: Computer Science - Computer Science and Game Theory, Electrical Engineering and Systems Science - Systems and Control, Statistics - Machine Learning
Abstract: This paper studies two fundamental problems in regularized Graphon Mean-Field Games (GMFGs). First, we establish the existence of a Nash Equilibrium (NE) of any $\lambda$-regularized GMFG (for $\lambda\geq 0$). This result relies on weaker conditions than those in previous works for analyzing both unregularized GMFGs ($\lambda=0$) and $\lambda$-regularized MFGs, which are special cases of GMFGs. Second, we propose provably efficient algorithms to learn the NE in weakly monotone GMFGs, motivated by Lasry and Lions [2007]. Previous literature either only analyzed continuous-time algorithms or required extra conditions to analyze discrete-time algorithms. In contrast, we design a discrete-time algorithm and derive its convergence rate solely under weakly monotone conditions. Furthermore, we develop and analyze the action-value function estimation procedure during the online learning process, which is absent from algorithms for monotone GMFGs. This serves as a sub-module in our optimization algorithm. The efficiency of the designed algorithm is corroborated by empirical evaluations.
Published: 2023

28. Blink: Link Local Differential Privacy in Graph Neural Networks via Bayesian Estimation

Author: Zhu, Xiaochen, Tan, Vincent Y. F., and Xiao, Xiaokui
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Graph neural networks (GNNs) have gained an increasing amount of popularity due to their superior capability in learning node embeddings for various graph inference tasks, but training them can raise privacy concerns. To address this, we propose using link local differential privacy over decentralized nodes, enabling collaboration with an untrusted server to train GNNs without revealing the existence of any link. Our approach spends the privacy budget separately on links and degrees of the graph for the server to better denoise the graph topology using Bayesian estimation, alleviating the negative impact of LDP on the accuracy of the trained GNNs. We bound the mean absolute error of the inferred link probabilities against the ground truth graph topology. We then propose two variants of our LDP mechanism complementing each other in different privacy settings, one of which estimates fewer links under lower privacy budgets to avoid false positive link estimates when the uncertainty is high, while the other utilizes more information and performs better given relatively higher privacy budgets. Furthermore, we propose a hybrid variant that combines both strategies and is able to perform better across different privacy budgets. Extensive experiments show that our approach outperforms existing methods in terms of accuracy under varying privacy budgets., Comment: 17 pages, accepted by ACM CCS 2023 as a conference paper
Published: 2023

29. AdjointDPM: Adjoint Sensitivity Method for Gradient Backpropagation of Diffusion Probabilistic Models

Author: Pan, Jiachun, Liew, Jun Hao, Tan, Vincent Y. F., Feng, Jiashi, and Yan, Hanshu
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Existing customization methods require access to multiple reference examples to align pre-trained diffusion probabilistic models (DPMs) with user-provided concepts. This paper aims to address the challenge of DPM customization when the only available supervision is a differentiable metric defined on the generated contents. Since the sampling procedure of DPMs involves recursive calls to the denoising UNet, na\"ive gradient backpropagation requires storing the intermediate states of all iterations, resulting in extremely high memory consumption. To overcome this issue, we propose a novel method AdjointDPM, which first generates new samples from diffusion models by solving the corresponding probability-flow ODEs. It then uses the adjoint sensitivity method to backpropagate the gradients of the loss to the models' parameters (including conditioning signals, network weights, and initial noises) by solving another augmented ODE. To reduce numerical errors in both the forward generation and gradient backpropagation processes, we further reparameterize the probability-flow ODE and augmented ODE as simple non-stiff ODEs using exponential integration. Finally, we demonstrate the effectiveness of AdjointDPM on three interesting tasks: converting visual effects into identification text embeddings, finetuning DPMs for specific types of stylization, and optimizing initial noise to generate adversarial samples for security auditing.
Published: 2023

30. Deep Unrolling for Nonconvex Robust Principal Component Analysis

Author: Tan, Elizabeth Z. C., Chaux, Caroline, Soubies, Emmanuel, and Tan, Vincent Y. F.
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Machine Learning
Abstract: We design algorithms for Robust Principal Component Analysis (RPCA) which consists in decomposing a matrix into the sum of a low rank matrix and a sparse matrix. We propose a deep unrolled algorithm based on an accelerated alternating projection algorithm which aims to solve RPCA in its nonconvex form. The proposed procedure combines benefits of deep neural networks and the interpretability of the original algorithm and it automatically learns hyperparameters. We demonstrate the unrolled algorithm's effectiveness on synthetic datasets and also on a face modeling problem, where it leads to both better numerical and visual performances., Comment: 7 pages, 3 figures; Accepted to the 2023 IEEE International Workshop on Machine Learning for Signal Processing
Published: 2023

31. DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing

Author: Shi, Yujun, Xue, Chuhui, Liew, Jun Hao, Pan, Jiachun, Yan, Hanshu, Zhang, Wenqing, Tan, Vincent Y. F., and Bai, Song
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Machine Learning
Abstract: Accurate and controllable image editing is a challenging task that has attracted significant attention recently. Notably, DragGAN is an interactive point-based image editing framework that achieves impressive editing results with pixel-level precision. However, due to its reliance on generative adversarial networks (GANs), its generality is limited by the capacity of pretrained GAN models. In this work, we extend this editing framework to diffusion models and propose a novel approach DragDiffusion. By harnessing large-scale pretrained diffusion models, we greatly enhance the applicability of interactive point-based editing on both real and diffusion-generated images. Our approach involves optimizing the diffusion latents to achieve precise spatial control. The supervision signal of this optimization process is from the diffusion model's UNet features, which are known to contain rich semantic and geometric information. Moreover, we introduce two additional techniques, namely LoRA fine-tuning and latent-MasaCtrl, to further preserve the identity of the original image. Lastly, we present a challenging benchmark dataset called DragBench -- the first benchmark to evaluate the performance of interactive point-based image editing methods. Experiments across a wide range of challenging cases (e.g., images with multiple objects, diverse object categories, various styles, etc.) demonstrate the versatility and generality of DragDiffusion. Code: https://github.com/Yujun-Shi/DragDiffusion., Comment: Code is released at https://github.com/Yujun-Shi/DragDiffusion
Published: 2023

32. Communication-Constrained Bandits under Additive Gaussian Noise

Author: Mayekar, Prathamesh, Scarlett, Jonathan, and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: We study a distributed stochastic multi-armed bandit where a client supplies the learner with communication-constrained feedback based on the rewards for the corresponding arm pulls. In our setup, the client must encode the rewards such that the second moment of the encoded rewards is no more than $P$, and this encoded reward is further corrupted by additive Gaussian noise of variance $\sigma^2$; the learner only has access to this corrupted reward. For this setting, we derive an information-theoretic lower bound of $\Omega\left(\sqrt{\frac{KT}{\mathtt{SNR} \wedge1}} \right)$ on the minimax regret of any scheme, where $ \mathtt{SNR} := \frac{P}{\sigma^2}$, and $K$ and $T$ are the number of arms and time horizon, respectively. Furthermore, we propose a multi-phase bandit algorithm, $\mathtt{UE\text{-}UCB++}$, which matches this lower bound to a minor additive factor. $\mathtt{UE\text{-}UCB++}$ performs uniform exploration in its initial phases and then utilizes the {\em upper confidence bound }(UCB) bandit algorithm in its final phase. An interesting feature of $\mathtt{UE\text{-}UCB++}$ is that the coarser estimates of the mean rewards formed during a uniform exploration phase help to refine the encoding protocol in the next phase, leading to more accurate mean estimates of the rewards in the subsequent phase. This positive reinforcement cycle is critical to reducing the number of uniform exploration rounds and closely matching our lower bound.
Published: 2023
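
To get a feel for the lower bound quoted in the abstract of entry 32, the scaling $\sqrt{KT/(\mathtt{SNR} \wedge 1)}$ with $\mathtt{SNR} = P/\sigma^2$ can be evaluated numerically. The following is a minimal sketch and is not taken from the paper: the function name `regret_lower_bound_scale` is hypothetical, and the constant hidden by $\Omega(\cdot)$ is simply treated as 1.

```python
import math


def regret_lower_bound_scale(num_arms, horizon, power, noise_var):
    """Return sqrt(K * T / min(SNR, 1)) with SNR = P / sigma^2.

    Hypothetical helper for illustration only: it reproduces the
    order of growth stated in the abstract and omits the constant
    factor hidden by the Omega(.) notation.
    """
    if min(num_arms, horizon) <= 0 or power <= 0 or noise_var <= 0:
        raise ValueError("all arguments must be positive")
    snr = power / noise_var
    # SNR above 1 does not shrink the bound further, hence min(snr, 1).
    return math.sqrt(num_arms * horizon / min(snr, 1.0))
```

Under this scaling, any SNR of at least 1 gives the usual $\sqrt{KT}$ order, while an SNR of 0.25 doubles it.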

33. Codes for Correcting $t$ Limited-Magnitude Sticky Deletions

Author: Wang, Shuche, Vu, Van Khu, and Tan, Vincent Y. F.
Subjects: Computer Science - Information Theory
Abstract: Codes for correcting sticky insertions/deletions and limited-magnitude errors have attracted significant attention due to their applications in flash memories, racetrack memories, and DNA data storage systems. In this paper, we first consider the error type of $t$-sticky deletions with $\ell$-limited-magnitude and propose a non-systematic code for correcting this type of error with redundancy $2t(1-1/p)\cdot\log(n+1)+O(1)$, where $p$ is the smallest prime larger than $\ell+1$. Next, we present a systematic code construction with an efficient encoding and decoding algorithm with redundancy $\frac{\lceil2t(1-1/p)\rceil\cdot\lceil\log p\rceil}{\log p} \log(n+1)+O(\log\log n)$, where $p$ is the smallest prime larger than $\ell+1$., Comment: arXiv admin note: substantial text overlap with arXiv:2301.11680
Published: 2023
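
The leading term of the non-systematic redundancy quoted in entry 33, $2t(1-1/p)\cdot\log(n+1)$, can be worked out for concrete parameters. The sketch below is illustrative only and rests on two assumptions not stated in the abstract: logarithms are taken to base 2, and the $O(1)$ term is dropped. The helper names `smallest_prime_above` and `nonsystematic_redundancy` are hypothetical.

```python
import math


def smallest_prime_above(m):
    """Return the smallest prime strictly larger than m."""
    def is_prime(k):
        if k < 2:
            return False
        # Trial division up to floor(sqrt(k)) is enough for small inputs.
        return all(k % d for d in range(2, math.isqrt(k) + 1))

    candidate = m + 1
    while not is_prime(candidate):
        candidate += 1
    return candidate


def nonsystematic_redundancy(t, ell, n):
    """Leading term 2t(1 - 1/p) * log2(n + 1), ignoring the O(1) term.

    p is the smallest prime larger than ell + 1, as in the abstract.
    Base-2 logarithm is an assumption made for this illustration.
    """
    p = smallest_prime_above(ell + 1)
    return 2 * t * (1 - 1 / p) * math.log2(n + 1)
```

For example, with $t=1$, $\ell=1$, and $n=7$, the prime is $p=3$ and the leading term is $\frac{4}{3}\cdot 3 = 4$ bits.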

34. Probably Anytime-Safe Stochastic Combinatorial Semi-Bandits

Author: Hou, Yunlong, Tan, Vincent Y. F., and Zhong, Zixin
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: Motivated by concerns about making online decisions that incur an undue amount of risk at each time step, in this paper, we formulate the probably anytime-safe stochastic combinatorial semi-bandits problem. In this problem, the agent is given the option to select a subset of size at most $K$ from a set of $L$ ground items. Each item is associated with a certain mean reward as well as a variance that represents its risk. To mitigate the risk that the agent incurs, we require that with probability at least $1-\delta$, over the entire horizon of time $T$, each of the choices that the agent makes should contain items whose sum of variances does not exceed a certain variance budget. We call this the probably anytime-safe constraint. Under this constraint, we design and analyze an algorithm {\sc PASCombUCB} that minimizes the regret over the horizon of time $T$. By developing accompanying information-theoretic lower bounds, we show that under both the problem-dependent and problem-independent paradigms, {\sc PASCombUCB} is almost asymptotically optimal. Experiments are conducted to corroborate our theoretical findings. Our problem setup, the proposed {\sc PASCombUCB} algorithm, and novel analyses are applicable to domains such as recommendation systems and transportation in which an agent is allowed to choose multiple items at a single time step and wishes to control the risk over the whole time horizon., Comment: To be presented at ICML 2023. 57 pages, 6 figures
Published: 2023

35. Codes for Correcting Asymmetric Adjacent Transpositions and Deletions

Author: Wang, Shuche, Vu, Van Khu, and Tan, Vincent Y. F.
Subjects: Computer Science - Information Theory
Abstract: Codes in the Damerau--Levenshtein metric have been extensively studied recently owing to their applications in DNA-based data storage. In particular, Gabrys, Yaakobi, and Milenkovic (2017) designed a length-$n$ code correcting a single deletion and $s$ adjacent transpositions with at most $(1+2s)\log n$ bits of redundancy. In this work, we consider a new setting where both asymmetric adjacent transpositions (also known as right-shifts or left-shifts) and deletions may occur. We present several constructions of the codes correcting these errors in various cases. In particular, we design a code correcting a single deletion, $s^+$ right-shift, and $s^-$ left-shift errors with at most $(1+s)\log (n+s+1)+1$ bits of redundancy where $s=s^{+}+s^{-}$. In addition, we investigate codes correcting $t$ $0$-deletions, $s^+$ right-shift, and $s^-$ left-shift errors with both uniquely-decoding and list-decoding algorithms. Our main contribution here is the construction of a list-decodable code with list size $O(n^{\min\{s+1,t\}})$ and with at most $(\max \{t,s+1\}) \log n+O(1)$ bits of redundancy, where $s=s^{+}+s^{-}$. Finally, we construct both non-systematic and systematic codes for correcting blocks of $0$-deletions with $\ell$-limited-magnitude and $s$ adjacent transpositions.
Published: 2023

36. Minimizing the Accumulated Trajectory Error to Improve Dataset Distillation

Author: Du, Jiawei, Jiang, Yidi, Tan, Vincent Y. F., Zhou, Joey Tianyi, and Li, Haizhou
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Model-based deep learning has achieved astounding successes due in part to the availability of large-scale real-world data. However, processing such massive amounts of data comes at a considerable cost in terms of computations, storage, training and the search for good neural architectures. Dataset distillation has thus recently come to the fore. This paradigm involves distilling information from large real-world datasets into tiny and compact synthetic datasets such that processing the latter ideally yields similar performances as the former. State-of-the-art methods primarily rely on learning the synthetic dataset by matching the gradients obtained during training between the real and synthetic data. However, these gradient-matching methods suffer from the so-called accumulated trajectory error caused by the discrepancy between the distillation and subsequent evaluation. To mitigate the adverse impact of this accumulated trajectory error, we propose a novel approach that encourages the optimization algorithm to seek a flat trajectory. We show that, with the regularization towards the flat trajectory, the weights trained on synthetic data are robust against perturbations caused by the accumulated errors. Our method, called Flat Trajectory Distillation (FTD), is shown to boost the performance of gradient-matching methods by up to 4.7% on a subset of images of the ImageNet dataset with higher resolution images. We also validate the effectiveness and generalizability of our method with datasets of different resolutions and demonstrate its applicability to neural architecture search. Code is available at https://github.com/AngusDujw/FTD-distillation.
Published: 2022

37. Common Information, Noise Stability, and Their Extensions

Author: Yu, Lei and Tan, Vincent Y. F.
Subjects: Computer Science - Information Theory, Mathematics - Probability
Abstract: Common information (CI) is ubiquitous in information theory and related areas such as theoretical computer science and discrete probability. However, because there are multiple notions of CI, a unified understanding of the deep interconnections between them is lacking. This monograph seeks to fill this gap by leveraging a small set of mathematical techniques that are applicable across seemingly disparate problems. In Part I, we review the operational tasks and properties associated with Wyner's and G\'acs-K\"orner-Witsenhausen's (GKW's) CI. In Part II, we discuss extensions of the former from the perspective of distributed source simulation. This includes the R\'enyi CI which forms a bridge between Wyner's CI and the exact CI. Via a surprising equivalence between the R\'enyi CI of order~$\infty$ and the exact CI, we demonstrate the existence of a joint source in which the exact CI strictly exceeds Wyner's CI. Other closely related topics discussed in Part II include the channel synthesis problem and the connection of Wyner's and exact CI to the nonnegative rank of matrices. In Part III, we examine GKW's CI with a more refined lens via the noise stability or NICD problem in which we quantify the agreement probability of extracted bits from a bivariate source. We then extend this to the $k$-user NICD and $q$-stability problems, and discuss various conjectures in information theory and discrete probability, such as the Courtade-Kumar, Li-M\'edard and Mossell-O'Donnell conjectures. Finally, we consider hypercontractivity and Brascamp-Lieb inequalities, which further generalize noise stability via replacing the Boolean functions therein by nonnegative functions. The key ideas behind the proofs in Part III can be presented in a pedagogically coherent manner and unified via information-theoretic and Fourier-analytic methods., Comment: Lei Yu and Vincent Y. F. Tan (2022), "Common Information, Noise Stability, and Their Extensions", Foundations and Trends in Communications and Information Theory: Vol. 19, No. 3, pp 264--546. DOI: 10.1561/0100000122
Published: 2022

38. Fast Beam Alignment via Pure Exploration in Multi-armed Bandits

Author: Wei, Yi, Zhong, Zixin, and Tan, Vincent Y. F.
Subjects: Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: The beam alignment (BA) problem consists in accurately aligning the transmitter and receiver beams to establish a reliable communication link in wireless communication systems. Existing BA methods search the entire beam space to identify the optimal transmit-receive beam pair. This incurs a significant latency when the number of antennas is large. In this work, we develop a bandit-based fast BA algorithm to reduce BA latency for millimeter-wave (mmWave) communications. Our algorithm is named Two-Phase Heteroscedastic Track-and-Stop (2PHT\&S). We first formulate the BA problem as a pure exploration problem in multi-armed bandits in which the objective is to minimize the required number of time steps given a certain fixed confidence level. By taking advantage of the correlation structure among beams that the information from nearby beams is similar and the heteroscedastic property that the variance of the reward of an arm (beam) is related to its mean, the proposed algorithm groups all beams into several beam sets such that the optimal beam set is first selected and the optimal beam is identified in this set after that. Theoretical analysis and simulation results on synthetic and semi-practical channel data demonstrate the clear superiority of the proposed algorithm vis-\`a-vis other baseline competitors., Comment: 16 pages, 9 figures; Accepted to the IEEE Transactions on Wireless Communications
Published: 2022

39. How Does Pseudo-Labeling Affect the Generalization Error of the Semi-Supervised Gibbs Algorithm?

Author: He, Haiyun, Aminian, Gholamali, Bu, Yuheng, Rodrigues, Miguel, and Tan, Vincent Y. F.
Subjects: Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: We provide an exact characterization of the expected generalization error (gen-error) for semi-supervised learning (SSL) with pseudo-labeling via the Gibbs algorithm. The gen-error is expressed in terms of the symmetrized KL information between the output hypothesis, the pseudo-labeled dataset, and the labeled dataset. Distribution-free upper and lower bounds on the gen-error can also be obtained. Our findings offer new insights that the generalization performance of SSL with pseudo-labeling is affected not only by the information between the output hypothesis and input training data but also by the information {\em shared} between the {\em labeled} and {\em pseudo-labeled} data samples. This serves as a guideline to choose an appropriate pseudo-labeling method from a given family of methods. To deepen our understanding, we further explore two examples -- mean estimation and logistic regression. In particular, we analyze how the ratio of the number of unlabeled to labeled data $\lambda$ affects the gen-error under both scenarios. As $\lambda$ increases, the gen-error for mean estimation decreases and then saturates at a value larger than when all the samples are labeled, and the gap can be quantified {\em exactly} with our analysis, and is dependent on the \emph{cross-covariance} between the labeled and pseudo-labeled data samples. For logistic regression, the gen-error and the variance component of the excess risk also decrease as $\lambda$ increases., Comment: 30 pages, 4 figures
Published: 2022

40. Federated Best Arm Identification with Heterogeneous Clients

Author: Chen, Zhirui, Karthik, P. N., Tan, Vincent Y. F., and Chee, Yeow Meng
Subjects: Computer Science - Machine Learning, Mathematics - Statistics Theory
Abstract: We study best arm identification in a federated multi-armed bandit setting with a central server and multiple clients, when each client has access to a {\em subset} of arms and each arm yields independent Gaussian observations. The goal is to identify the best arm of each client subject to an upper bound on the error probability; here, the best arm is one that has the largest {\em average} value of the means averaged across all clients having access to the arm. Our interest is in the asymptotics as the error probability vanishes. We provide an asymptotic lower bound on the growth rate of the expected stopping time of any algorithm. Furthermore, we show that for any algorithm whose upper bound on the expected stopping time matches with the lower bound up to a multiplicative constant ({\em almost-optimal} algorithm), the ratio of any two consecutive communication time instants must be {\em bounded}, a result that is of independent interest. We thereby infer that an algorithm can communicate no more sparsely than at exponential time instants in order to be almost-optimal. For the class of almost-optimal algorithms, we present the first-of-its-kind asymptotic lower bound on the expected number of {\em communication rounds} until stoppage. We propose a novel algorithm that communicates at exponential time instants, and demonstrate that it is asymptotically almost-optimal.
Published: 2022
Full Text: View/download PDF

41. Towards Understanding and Mitigating Dimensional Collapse in Heterogeneous Federated Learning

Author: Shi, Yujun, Liang, Jian, Zhang, Wenqing, Tan, Vincent Y. F., and Bai, Song
Subjects: Computer Science - Machine Learning
Abstract: Federated learning aims to train models collaboratively across different clients without the sharing of data for privacy considerations. However, one major challenge for this learning paradigm is the {\em data heterogeneity} problem, which refers to the discrepancies between the local data distributions among various clients. To tackle this problem, we first study how data heterogeneity affects the representations of the globally aggregated models. Interestingly, we find that heterogeneous data results in the global model suffering from severe {\em dimensional collapse}, in which representations tend to reside in a lower-dimensional space instead of the ambient space. Moreover, we observe a similar phenomenon on models locally trained on each client and deduce that the dimensional collapse on the global model is inherited from local models. In addition, we theoretically analyze the gradient flow dynamics to shed light on how data heterogeneity results in dimensional collapse for local models. To remedy this problem caused by the data heterogeneity, we propose {\sc FedDecorr}, a novel method that can effectively mitigate dimensional collapse in federated learning. Specifically, {\sc FedDecorr} applies a regularization term during local training that encourages different dimensions of representations to be uncorrelated. {\sc FedDecorr}, which is implementation-friendly and computationally-efficient, yields consistent improvements over baselines on standard benchmark datasets. Code: https://github.com/bytedance/FedDecorr., Comment: camera ready version of ICLR 2023
Published: 2022

42. Relational Reasoning via Set Transformers: Provable Efficiency and Applications to MARL

Author: Zhang, Fengzhuo, Liu, Boyi, Wang, Kaixin, Tan, Vincent Y. F., Yang, Zhuoran, and Wang, Zhaoran
Subjects: Computer Science - Machine Learning, Computer Science - Multiagent Systems, Statistics - Machine Learning
Abstract: The cooperative Multi-Agent Reinforcement Learning (MARL) with permutation invariant agents framework has achieved tremendous empirical successes in real-world applications. Unfortunately, the theoretical understanding of this MARL problem is lacking due to the curse of many agents and the limited exploration of the relational reasoning in existing works. In this paper, we verify that the transformer implements complex relational reasoning, and we propose and analyze model-free and model-based offline MARL algorithms with the transformer approximators. We prove that the suboptimality gaps of the model-free and model-based algorithms are independent of and logarithmic in the number of agents respectively, which mitigates the curse of many agents. These results are consequences of a novel generalization error bound of the transformer and a novel analysis of the Maximum Likelihood Estimate (MLE) of the system dynamics with the transformer. Our model-based algorithm is the first provably efficient MARL algorithm that explicitly exploits the permutation invariance of the agents. Our improved generalization bound may be of independent interest and is applicable to other regression problems related to the transformer beyond MARL.
Published: 2022

43. Almost Cost-Free Communication in Federated Best Arm Identification

Author: Reddy, Kota Srinivas, Karthik, P. N., and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: We study the problem of best arm identification in a federated learning multi-armed bandit setup with a central server and multiple clients. Each client is associated with a multi-armed bandit in which each arm yields {\em i.i.d.}\ rewards following a Gaussian distribution with an unknown mean and known variance. The set of arms is assumed to be the same at all the clients. We define two notions of best arm -- local and global. The local best arm at a client is the arm with the largest mean among the arms local to the client, whereas the global best arm is the arm with the largest average mean across all the clients. We assume that each client can only observe the rewards from its local arms and thereby estimate its local best arm. The clients communicate with a central server on uplinks that entail a cost of $C\ge0$ units per usage per uplink. The global best arm is estimated at the server. The goal is to identify the local best arms and the global best arm with minimal total cost, defined as the sum of the total number of arm selections at all the clients and the total communication cost, subject to an upper bound on the error probability. We propose a novel algorithm {\sc FedElim} that is based on successive elimination and communicates only in exponential time steps and obtain a high probability instance-dependent upper bound on its total cost. The key takeaway from our paper is that for any $C\geq 0$ and error probabilities sufficiently small, the total number of arm selections (resp.\ the total cost) under {\sc FedElim} is at most~$2$ (resp.~$3$) times the maximum total number of arm selections under its variant that communicates in every time step. Additionally, we show that the latter is optimal in expectation up to a constant factor, thereby demonstrating that communication is almost cost-free in {\sc FedElim}. We numerically validate the efficacy of {\sc FedElim}., Comment: Accepted to AAAI 2023
Published: 2022

44. Asymptotic Nash Equilibrium for the $M$-ary Sequential Adversarial Hypothesis Testing Game

Author: Pan, Jiachun, Li, Yonglong, and Tan, Vincent Y. F.
Subjects: Computer Science - Information Theory, Electrical Engineering and Systems Science - Signal Processing
Abstract: In this paper, we consider a novel $M$-ary sequential hypothesis testing problem in which an adversary is present and perturbs the distributions of the samples before the decision maker observes them. This problem is formulated as a sequential adversarial hypothesis testing game played between the decision maker and the adversary. This game is a zero-sum and strategic one. We assume the adversary is active under \emph{all} hypotheses and knows the underlying distribution of observed samples. We adopt this framework as it is the worst-case scenario from the perspective of the decision maker. The goal of the decision maker is to minimize the expectation of the stopping time to ensure that the test is as efficient as possible; the adversary's goal is, instead, to maximize the stopping time. We derive a pair of strategies under which the asymptotic Nash equilibrium of the game is attained. We also consider the case in which the adversary is not aware of the underlying hypothesis and hence is constrained to apply the same strategy regardless of which hypothesis is in effect. Numerical results corroborate our theoretical findings., Comment: The paper was presented in part at the 2022 International Symposium on Information Theory (ISIT). It has been submitted to IEEE Transactions on Information Forensics and Security
Published: 2022

45. Sharpness-Aware Training for Free

Author: Du, Jiawei, Zhou, Daquan, Feng, Jiashi, Tan, Vincent Y. F., and Zhou, Joey Tianyi
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Computer Vision and Pattern Recognition
Abstract: Modern deep neural networks (DNNs) have achieved state-of-the-art performances but are typically over-parameterized. The over-parameterization may result in undesirably large generalization error in the absence of other customized training strategies. Recently, a line of research under the name of Sharpness-Aware Minimization (SAM) has shown that minimizing a sharpness measure, which reflects the geometry of the loss landscape, can significantly reduce the generalization error. However, SAM-like methods incur a two-fold computational overhead of the given base optimizer (e.g. SGD) for approximating the sharpness measure. In this paper, we propose Sharpness-Aware Training for Free, or SAF, which mitigates the sharp landscape at almost zero additional computational cost over the base optimizer. Intuitively, SAF achieves this by avoiding sudden drops in the loss in the sharp local minima throughout the trajectory of the updates of the weights. Specifically, we suggest a novel trajectory loss, based on the KL-divergence between the outputs of DNNs with the current weights and past weights, as a replacement of the SAM's sharpness measure. This loss captures the rate of change of the training loss along the model's update trajectory. By minimizing it, SAF ensures the convergence to a flat minimum with improved generalization capabilities. Extensive empirical results show that SAF minimizes the sharpness in the same way that SAM does, yielding better results on the ImageNet dataset with essentially the same computational cost as the base optimizer.
Published: 2022

46. A Survey of Risk-Aware Multi-Armed Bandits

Author: Tan, Vincent Y. F., A., Prashanth L., and Jagannathan, Krishna
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: In several applications such as clinical trials and financial portfolio optimization, the expected value (or the average reward) does not satisfactorily capture the merits of a drug or a portfolio. In such applications, risk plays a crucial role, and a risk-aware performance measure is preferable, so as to capture losses in the case of adverse events. This survey aims to consolidate and summarise the existing research on risk measures, specifically in the context of multi-armed bandits. We review various risk measures of interest, and comment on their properties. Next, we review existing concentration inequalities for various risk measures. Then, we proceed to defining risk-aware bandit problems. We consider algorithms for the regret minimization setting, where the exploration-exploitation trade-off manifests, as well as the best-arm identification setting, which is a pure exploration problem -- both in the context of risk-sensitive measures. We conclude by commenting on persisting challenges and fertile areas for future research., Comment: 11 pages; Unabridged version of a survey paper of the same title accepted to IJCAI-ECAI, 2022
Published: 2022

47. A review of efficient thermal application for ice detection and anti/de-icing technology

Author: Li, Qingying, Yao, Rao, Tan, Vincent Beng Chye, He, Fajiang, Zhao, Huanyu, and Bai, Tian
Published: 2025
Full Text: View/download PDF

48. Best Arm Identification in Restless Markov Multi-Armed Bandits

Author: Karthik, P. N., Reddy, Kota Srinivas, and Tan, Vincent Y. F.
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: We study the problem of identifying the best arm in a multi-armed bandit environment when each arm is a time-homogeneous and ergodic discrete-time Markov process on a common, finite state space. The state evolution on each arm is governed by the arm's transition probability matrix (TPM). A decision entity that knows the set of arm TPMs but not the exact mapping of the TPMs to the arms, wishes to find the index of the best arm as quickly as possible, subject to an upper bound on the error probability. The decision entity selects one arm at a time sequentially, and all the unselected arms continue to undergo state evolution ({\em restless} arms). For this problem, we derive the first-known problem instance-dependent asymptotic lower bound on the growth rate of the expected time required to find the index of the best arm, where the asymptotics is as the error probability vanishes. Further, we propose a sequential policy that, for an input parameter $R$, forcibly selects an arm that has not been selected for $R$ consecutive time instants. We show that this policy achieves an upper bound that depends on $R$ and is monotonically non-increasing as $R\to\infty$. The question of whether, in general, the limiting value of the upper bound as $R\to\infty$ matches with the lower bound, remains open. We identify a special case in which the upper and the lower bounds match. Prior works on best arm identification have dealt with (a) independent and identically distributed observations from the arms, and (b) rested Markov arms, whereas our work deals with the more difficult setting of restless Markov arms., Comment: 41 pages
Published: 2022

49. Canonical Portfolios: Optimal Asset and Signal Combination

Author: Firoozye, Nikan, Tan, Vincent, and Zohren, Stefan
Subjects: Quantitative Finance - Portfolio Management
Abstract: This paper presents a novel framework for analyzing the optimal asset and signal combination problem. Our approach builds upon the dynamic portfolio selection problem introduced by Brandt and Santa-Clara (2006) and consists of two stages. First, we reformulate their original investment problem into a tractable one that allows us to derive a closed-form expression for the optimal portfolio policy that is scalable to large cross-sectional financial applications. Second, we recast the problem of selecting a portfolio of correlated assets and signals into selecting a set of uncorrelated managed portfolios through the lens of Canonical Correlation Analysis of Hotelling (1936). The new investment environment of uncorrelated managed portfolios offers unique economic insights into the joint correlation structure of our optimal portfolio policy. We also operationalize our theoretical framework to bridge the gap between theory and practice, showcasing the improved performance of our proposed method over natural competing benchmarks.
Published: 2022

50. Optimal Clustering with Bandit Feedback

Author: Yang, Junwen, Zhong, Zixin, and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: This paper considers the problem of online clustering with bandit feedback. A set of arms (or items) can be partitioned into various groups that are unknown. Within each group, the observations associated to each of the arms follow the same distribution with the same mean vector. At each time step, the agent queries or pulls an arm and obtains an independent observation from the distribution it is associated to. Subsequent pulls depend on previous ones as well as the previously obtained samples. The agent's task is to uncover the underlying partition of the arms with the least number of arm pulls and with a probability of error not exceeding a prescribed constant $\delta$. The problem proposed finds numerous applications from clustering of variants of viruses to online market segmentation. We present an instance-dependent information-theoretic lower bound on the expected sample complexity for this task, and design a computationally efficient and asymptotically optimal algorithm, namely Bandit Online Clustering (BOC). The algorithm includes a novel stopping rule for adaptive sequential testing that circumvents the need to exactly solve any NP-hard weighted clustering problem as its subroutines. We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower bound asymptotically, and significantly outperforms a non-adaptive baseline algorithm., Comment: 54 pages, 4 figures
Published: 2022
