Author: "Jin, Tianyuan" - Searchworks@Jio Institute Digital Library Search Results

Your search for "Jin, Tianyuan" returned 19 results

1. Best Arm Identification with Minimal Regret

Author: Yang, Junwen, Tan, Vincent Y. F., and Jin, Tianyuan
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: Motivated by real-world applications that necessitate responsible experimentation, we introduce the problem of best arm identification (BAI) with minimal regret. This innovative variant of the multi-armed bandit problem elegantly amalgamates two of its most ubiquitous objectives: regret minimization and BAI. More precisely, the agent's goal is to identify the best arm with a prescribed confidence level $\delta$, while minimizing the cumulative regret up to the stopping time. Focusing on single-parameter exponential families of distributions, we leverage information-theoretic techniques to establish an instance-dependent lower bound on the expected cumulative regret. Moreover, we present an intriguing impossibility result that underscores the tension between cumulative regret and sample complexity in fixed-confidence BAI. Complementarily, we design and analyze the Double KL-UCB algorithm, which achieves asymptotic optimality as the confidence level tends to zero. Notably, this algorithm employs two distinct confidence bounds to guide arm selection in a randomized manner. Our findings elucidate a fresh perspective on the inherent connections between regret minimization and BAI.
Comment: Preprint
Published: 2024

2. Optimal Batched Linear Bandits

Author: Ren, Xuanfei, Jin, Tianyuan, and Xu, Pan
Subjects: Computer Science - Machine Learning, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: We introduce the E$^4$ algorithm for the batched linear bandit problem, incorporating an Explore-Estimate-Eliminate-Exploit framework. With a proper choice of exploration rate, we prove E$^4$ achieves the finite-time minimax optimal regret with only $O(\log\log T)$ batches, and the asymptotically optimal regret with only $3$ batches as $T\rightarrow\infty$, where $T$ is the time horizon. We further prove a lower bound on the batch complexity of linear contextual bandits showing that any asymptotically optimal algorithm must require at least $3$ batches in expectation as $T\rightarrow\infty$, which indicates E$^4$ achieves the asymptotic optimality in regret and batch complexity simultaneously. To the best of our knowledge, E$^4$ is the first algorithm for linear bandits that simultaneously achieves the minimax and asymptotic optimality in regret with the corresponding optimal batch complexities. In addition, we show that with another choice of exploration rate E$^4$ achieves an instance-dependent regret bound requiring at most $O(\log T)$ batches, and maintains the minimax optimality and asymptotic optimality. We conduct thorough experiments to evaluate our algorithm on randomly generated instances and the challenging End of Optimism instances (Lattimore and Szepesvári, 2017), which were shown to be hard to learn for optimism-based algorithms. Empirical results show that E$^4$ consistently outperforms baseline algorithms with respect to regret minimization, batch complexity, and computational efficiency.
Comment: 26 pages, 6 figures, 4 tables. To appear in the proceedings of the 41st International Conference on Machine Learning (ICML 2024)
Published: 2024

3. Sparsity-Agnostic Linear Bandits with Adaptive Adversaries

Author: Jin, Tianyuan, Jang, Kyoungseok, and Cesa-Bianchi, Nicolò
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We study stochastic linear bandits where, in each round, the learner receives a set of actions (i.e., feature vectors), from which it chooses an element and obtains a stochastic reward. The expected reward is a fixed but unknown linear function of the chosen action. We study sparse regret bounds that depend on the number $S$ of non-zero coefficients in the linear reward function. Previous works focused on the case where $S$ is known, or the action sets satisfy additional assumptions. In this work, we obtain the first sparse regret bounds that hold when $S$ is unknown and the action sets are adversarially generated. Our techniques combine online-to-confidence-set conversions with a novel randomized model selection approach over a hierarchy of nested confidence sets. When $S$ is known, our analysis recovers state-of-the-art bounds for adversarial action sets. We also show that a variant of our approach, using Exp3 to dynamically select the confidence sets, can be used to improve the empirical performance of stochastic linear bandits while enjoying a regret bound with optimal dependence on the time horizon.
Comment: 25 pages
Published: 2024

4. Multi-Armed Bandits with Abstention

Author: Yang, Junwen, Jin, Tianyuan, and Tan, Vincent Y. F.
Subjects: Computer Science - Machine Learning, Computer Science - Information Theory, Statistics - Machine Learning
Abstract: We introduce a novel extension of the canonical multi-armed bandit problem that incorporates an additional strategic element: abstention. In this enhanced framework, the agent is not only tasked with selecting an arm at each time step, but also has the option to abstain from accepting the stochastic instantaneous reward before observing it. When opting for abstention, the agent either suffers a fixed regret or gains a guaranteed reward. Given this added layer of complexity, we ask whether we can develop efficient algorithms that are both asymptotically and minimax optimal. We answer this question affirmatively by designing and analyzing algorithms whose regrets meet their corresponding information-theoretic lower bounds. Our results offer valuable quantitative insights into the benefits of the abstention option, laying the groundwork for further exploration in other online decision-making problems with such an option. Numerical results further corroborate our theoretical findings.
Comment: Preprint
Published: 2024

5. Finite-Time Frequentist Regret Bounds of Multi-Agent Thompson Sampling on Sparse Hypergraphs

Author: Jin, Tianyuan, Hsu, Hao-Lun, Chang, William, and Xu, Pan
Subjects: Computer Science - Machine Learning, Computer Science - Multiagent Systems, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: We study the multi-agent multi-armed bandit (MAMAB) problem, where $m$ agents are factored into $\rho$ overlapping groups. Each group represents a hyperedge, forming a hypergraph over the agents. At each round of interaction, the learner pulls a joint arm (composed of individual arms for each agent) and receives a reward according to the hypergraph structure. Specifically, we assume there is a local reward for each hyperedge, and the reward of the joint arm is the sum of these local rewards. Previous work introduced the multi-agent Thompson sampling (MATS) algorithm (Verstraeten et al., 2020) and derived a Bayesian regret bound. However, it remains an open problem how to derive a frequentist regret bound for Thompson sampling in this multi-agent setting. To address this issue, we propose an efficient variant of MATS, the $\epsilon$-exploring Multi-Agent Thompson Sampling ($\epsilon$-MATS) algorithm, which performs MATS exploration with probability $\epsilon$ and adopts a greedy policy otherwise. We prove that $\epsilon$-MATS achieves a worst-case frequentist regret bound that is sublinear in both the time horizon and the local arm size. We also derive a lower bound for this setting, which implies our frequentist regret upper bound is optimal up to constant and logarithmic terms when the hypergraph is sufficiently sparse. Thorough experiments on standard MAMAB problems demonstrate the superior performance and the improved computational efficiency of $\epsilon$-MATS compared with existing algorithms in the same setting.
Comment: 22 pages, 7 figures, 2 tables. To appear in the proceedings of the 38th Annual AAAI Conference on Artificial Intelligence (AAAI'2024)
Published: 2023

6. Optimal Batched Best Arm Identification

Author: Jin, Tianyuan, Yang, Yu, Tang, Jing, Xiao, Xiaokui, and Xu, Pan
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We study the batched best arm identification (BBAI) problem, where the learner's goal is to identify the best arm while switching the policy as little as possible. In particular, we aim to find the best arm with probability $1-\delta$ for some small constant $\delta>0$ while minimizing both the sample complexity (total number of arm pulls) and the batch complexity (total number of batches). We propose the three-batch best arm identification (Tri-BBAI) algorithm, which is the first batched algorithm that achieves the optimal sample complexity in the asymptotic setting (i.e., $\delta\rightarrow 0$) and runs in at most $3$ batches. Based on Tri-BBAI, we further propose the almost optimal batched best arm identification (Opt-BBAI) algorithm, which is the first algorithm that achieves the near-optimal sample and batch complexity in the non-asymptotic setting (i.e., $\delta>0$ is arbitrarily fixed), while enjoying the same batch and sample complexity as Tri-BBAI when $\delta$ tends to zero. Moreover, in the non-asymptotic setting, the complexity of previous batch algorithms is usually conditioned on the event that the best arm is returned (with a probability of at least $1-\delta$), which is potentially unbounded in cases where a sub-optimal arm is returned. In contrast, the complexity of Opt-BBAI does not rely on such an event. This is achieved through a novel procedure that we design for checking whether the best arm is eliminated, which is of independent interest.
Comment: 32 pages, 1 figure, 3 tables
Published: 2023

7. Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits

Author: Jin, Tianyuan, Xu, Pan, Xiao, Xiaokui, and Anandkumar, Anima
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We study the regret of Thompson sampling (TS) algorithms for exponential family bandits, where the reward distribution is from a one-dimensional exponential family, which covers many common reward distributions including Bernoulli, Gaussian, Gamma, Exponential, etc. We propose a Thompson sampling algorithm, termed ExpTS, which uses a novel sampling distribution to avoid the under-estimation of the optimal arm. We provide a tight regret analysis for ExpTS, which simultaneously yields both the finite-time regret bound and the asymptotic regret bound. In particular, for a $K$-armed bandit with exponential family rewards, ExpTS over a horizon $T$ is sub-UCB (a strong criterion for the finite-time regret that is problem-dependent), minimax optimal up to a factor $\sqrt{\log K}$, and asymptotically optimal. Moreover, we propose ExpTS$^+$, by adding a greedy exploitation step in addition to the sampling distribution used in ExpTS, to avoid the over-estimation of sub-optimal arms. ExpTS$^+$ is an anytime bandit algorithm and achieves the minimax optimality and asymptotic optimality simultaneously for exponential family reward distributions. Our proof techniques are general and conceptually simple and can be easily applied to analyze standard Thompson sampling with specific reward distributions.
Comment: 49 pages
Published: 2022

8. MOTS: Minimax Optimal Thompson Sampling

Author: Jin, Tianyuan, Xu, Pan, Shi, Jieming, Xiao, Xiaokui, and Gu, Quanquan
Subjects: Computer Science - Machine Learning, Mathematics - Statistics Theory, Statistics - Machine Learning
Abstract: Thompson sampling is one of the most widely used algorithms for many online decision problems, due to its simplicity in implementation and superior empirical performance over other state-of-the-art methods. Despite its popularity and empirical success, it has remained an open problem whether Thompson sampling can match the minimax lower bound $\Omega(\sqrt{KT})$ for $K$-armed bandit problems, where $T$ is the total time horizon. In this paper, we solve this long-standing open problem by proposing a variant of Thompson sampling called MOTS that adaptively clips the sampling instance of the chosen arm at each time step. We prove that this simple variant of Thompson sampling achieves the minimax optimal regret bound $O(\sqrt{KT})$ for finite time horizon $T$, as well as the asymptotically optimal regret bound for Gaussian rewards when $T$ approaches infinity. To our knowledge, MOTS is the first Thompson sampling type algorithm that achieves the minimax optimality for multi-armed bandit problems.
Comment: 27 pages, 1 table, 2 figures. This version improves the presentation in V2
Published: 2020

9. Double Explore-then-Commit: Asymptotic Optimality and Beyond

Author: Jin, Tianyuan, Xu, Pan, Xiao, Xiaokui, and Gu, Quanquan
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: We study the multi-armed bandit problem with subgaussian rewards. The explore-then-commit (ETC) strategy, which consists of an exploration phase followed by an exploitation phase, is one of the most widely used algorithms in a variety of online decision applications. Nevertheless, it has been shown in Garivier et al. (2016) that ETC is suboptimal in the asymptotic sense as the horizon grows, and thus is worse than fully sequential strategies such as Upper Confidence Bound (UCB). In this paper, we show that a variant of the ETC algorithm can actually achieve the asymptotic optimality for multi-armed bandit problems as UCB-type algorithms do and extend it to the batched bandit setting. Specifically, we propose a double explore-then-commit (DETC) algorithm that has two exploration and exploitation phases and prove that DETC achieves the asymptotically optimal regret bound. To our knowledge, DETC is the first non-fully-sequential algorithm that achieves such asymptotic optimality. In addition, we extend DETC to batched bandit problems, where (i) the exploration process is split into a small number of batches and (ii) the round complexity is of central interest. We prove that a batched version of DETC can achieve the asymptotic optimality with only a constant round complexity. This is the first batched bandit algorithm that can attain the optimal asymptotic regret bound and optimal round complexity simultaneously.
Comment: 46 pages. This version improves the presentation, and adds new algorithms and theoretical results: an anytime algorithm with asymptotic optimality guarantee, and an extension to K-armed bandits
Published: 2020

10. Realtime Index-Free Single Source SimRank Processing on Web-Scale Graphs

Author: Shi, Jieming, Jin, Tianyuan, Yang, Renchi, Xiao, Xiaokui, and Yang, Yin
Subjects: Computer Science - Databases, Computer Science - Artificial Intelligence, Computer Science - Social and Information Networks, H.2
Abstract: Given a graph G and a node u in G, a single source SimRank query evaluates the similarity between u and every node v in G. Existing approaches to single source SimRank computation incur either long query response time, or expensive pre-computation, which needs to be performed again whenever the graph G changes. Consequently, to our knowledge none of them is ideal for scenarios in which (i) query processing must be done in realtime, and (ii) the underlying graph G is massive, with frequent updates. Motivated by this, we propose SimPush, a novel algorithm that answers single source SimRank queries without any pre-computation, and at the same time achieves significantly higher query processing speed than even the fastest known index-based solutions. Further, SimPush provides rigorous result quality guarantees, and its high performance does not rely on any strong assumption of the underlying graph. Specifically, compared to existing methods, SimPush employs a radically different algorithmic design that focuses on (i) identifying a small number of nodes relevant to the query, and subsequently (ii) computing statistics and performing residue push from these nodes only. We prove the correctness of SimPush, analyze its time complexity, and compare its asymptotic performance with that of existing methods. Meanwhile, we evaluate the practical performance of SimPush through extensive experiments on 8 real datasets. The results demonstrate that SimPush consistently outperforms all existing solutions, often by over an order of magnitude. In particular, on a commodity machine, SimPush answers a single source SimRank query on a web graph containing over 133 million nodes and 5.4 billion edges in under 62 milliseconds, with 0.00035 empirical error, while the fastest index-based competitor needs 1.18 seconds.
Comment: To appear in PVLDB 2020
Published: 2020

11. Tracking Top-K Influential Vertices in Dynamic Networks

Author: Yang, Yu, Wang, Zhefeng, Jin, Tianyuan, Pei, Jian, and Chen, Enhong
Subjects: Computer Science - Social and Information Networks
Abstract: Influence propagation in networks has enjoyed fruitful applications and has been extensively studied in the literature. However, only very limited preliminary studies tackled the challenges in handling highly dynamic changes in real networks. In this paper, we tackle the problem of tracking top-$k$ influential vertices in dynamic networks, where the dynamic changes are modeled as a stream of edge weight updates. Under the popularly adopted linear threshold (LT) model and the independent cascade (IC) model, we address two essential versions of the problem: tracking the top-$k$ influential individuals and finding the best $k$-seed set to maximize the influence spread (Influence Maximization). We adopt the polling-based method and maintain a sample of random RR sets so that we can approximate the influence of vertices with provable quality guarantees. It is known that updating RR sets over dynamic changes of a network can be easily done by a reservoir sampling method, so the key challenge is to efficiently decide how many RR sets are needed to achieve good quality guarantees. We use two simple signals, both of which can be accessed in $O(1)$ time, to decide a proper number of RR sets. We prove the effectiveness of our methods. For both tasks the error incurred in our method is only a multiplicative factor to the ground truth. For influence maximization, we also propose an efficient query algorithm for finding the $k$ seeds, which is one order of magnitude faster than the state-of-the-art query algorithm in practice. In addition to the thorough theoretical results, our experimental results on large real networks clearly demonstrate the effectiveness and efficiency of our algorithms.
Published: 2018

12. A Markov Chain Monte Carlo Approach for Source Detection in Networks

Author: Zhang, Le, Jin, Tianyuan, Xu, Tong, Chang, Biao, Wang, Zhefeng, and Chen, Enhong
Editor: Cheng, Xueqi, Ma, Weiying, Liu, Huan, Shen, Huawei, Feng, Shizheng, and Xie, Xing
Series editor: Barbosa, Simone Diniz Junqueira, Chen, Phoebe, Filipe, Joaquim, Kotenko, Igor, Sivalingam, Krishna M., Washio, Takashi, Yuan, Junsong, and Zhou, Lizhu
Published: 2017

13. Maximizing the Effect of Information Adoption: A General Framework

Author: Jin, Tianyuan, Xu, Tong, Zhong, Hui, Chen, Enhong, Wang, Zhefeng, and Liu, Qi
Published: 2018

14. Unconstrained submodular maximization with modular costs

Author: Jin, Tianyuan, Yang, Yu, Yang, Renchi, Shi, Jieming, Huang, Keke, and Xiao, Xiaokui
Published: 2021

15. Almost Optimal Anytime Algorithm for Batched Multi-Armed Bandits

Author: Jin, Tianyuan, Tang, Jing, Xu, Pan, Huang, Keke, Xiao, Xiaokui, and Gu, Quanquan
Abstract: In batched multi-armed bandit problems, the learner can adaptively pull arms and adjust strategy in batches. In many real applications, not only the regret but also the batch complexity needs to be optimized. Existing batched bandit algorithms usually assume that the time horizon $T$ is known in advance. However, many applications involve an unpredictable stopping time. In this paper, we study the anytime batched multi-armed bandit problem. We propose an anytime algorithm that achieves the asymptotically optimal regret for exponential families of reward distributions with $O(\log\log T \cdot \mathrm{ilog}^{\alpha}(T))$ batches, where $\alpha \in O_T(1)$. Moreover, we prove that for any constant $c > 0$, no algorithm can achieve the asymptotically optimal regret within $c \log\log T$ batches.
Published: 2021

16. Optimal Streaming Algorithms for Multi-Armed Bandits

Author: Jin, Tianyuan, Huang, Keke, Tang, Jing, and Xiao, Xiaokui
Abstract: This paper studies two variants of the best arm identification (BAI) problem under the streaming model, where we have a stream of $n$ arms with reward distributions supported on $[0, 1]$ with unknown means. The arms in the stream arrive one by one, and the algorithm cannot access an arm unless it is stored in a limited-size memory. We first study the streaming $\epsilon$-top-$k$ arms identification problem, which asks for $k$ arms whose reward means are lower than that of the $k$-th best arm by at most $\epsilon$ with probability at least $1-\delta$. For general $\epsilon \in (0, 1)$, the existing solution for this problem assumes $k = 1$ and achieves the optimal sample complexity $O(\frac{n}{\epsilon^2}\log\frac{1}{\delta})$ using $O(\log^*(n))$ memory and a single pass of the stream. We propose an algorithm that works for any $k$ and achieves the optimal sample complexity $O(\frac{n}{\epsilon^2}\log\frac{k}{\delta})$ using a single-arm memory and a single pass of the stream. Second, we study the streaming BAI problem, where the objective is to identify the arm with the maximum reward mean with probability at least $1-\delta$, using a single-arm memory and as few passes of the input stream as possible. We present a single-arm-memory algorithm that achieves a near instance-dependent optimal sample complexity within $O(\log \Delta_2^{-1})$ passes, where $\Delta_2$ is the gap between the mean of the best arm and that of the second best arm.
Published: 2021

17. Realtime index-free single source SimRank processing on web-scale graphs

Author: Shi, Jieming, Jin, Tianyuan, Yang, Renchi, Xiao, Xiaokui, and Yang, Yin
Published: 2020

18. Tracking Top-k Influential Users with Relative Errors

Author: Yang, Yu, Wang, Zhefeng, Jin, Tianyuan, Pei, Jian, and Chen, Enhong
Published: 2019

19. Realtime top-k personalized pagerank over large graphs on GPUs

Author: Shi, Jieming, Yang, Renchi, Jin, Tianyuan, Xiao, Xiaokui, and Yang, Yin
Published: 2019