Author: "Lai, Lifeng" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Lai, Lifeng"' showing total 532 results


1. Provable In-context Learning for Mixture of Linear Regressions using Transformers

Author: Jin, Yanhao, Balasubramanian, Krishnakumar, and Lai, Lifeng
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We theoretically investigate the in-context learning capabilities of transformers in the context of learning mixtures of linear regression models. For the case of two mixtures, we demonstrate the existence of transformers that can achieve an accuracy, relative to the oracle predictor, of order $\mathcal{\tilde{O}}((d/n)^{1/4})$ in the low signal-to-noise ratio (SNR) regime and $\mathcal{\tilde{O}}(\sqrt{d/n})$ in the high SNR regime, where $n$ is the length of the prompt, and $d$ is the dimension of the problem. Additionally, we derive in-context excess risk bounds of order $\mathcal{O}(L/\sqrt{B})$, where $B$ denotes the number of (training) prompts, and $L$ represents the number of attention layers. The order of $L$ depends on whether the SNR is low or high. In the high SNR regime, we extend the results to $K$-component mixture models for finite $K$. Extensive simulations also highlight the advantages of transformers for this task, outperforming other baselines such as the Expectation-Maximization algorithm.
Published: 2024

2. Transformers Handle Endogeneity in In-Context Linear Regression

Author: Liang, Haodong, Balasubramanian, Krishnakumar, and Lai, Lifeng
Subjects: Statistics - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Economics - Econometrics, Mathematics - Statistics Theory
Abstract: We explore the capability of transformers to address endogeneity in in-context linear regression. Our main finding is that transformers inherently possess a mechanism to handle endogeneity effectively using instrumental variables (IV). First, we demonstrate that the transformer architecture can emulate a gradient-based bi-level optimization procedure that converges to the widely used two-stage least squares $(\textsf{2SLS})$ solution at an exponential rate. Next, we propose an in-context pretraining scheme and provide theoretical guarantees showing that the global minimizer of the pre-training loss achieves a small excess loss. Our extensive experiments validate these theoretical findings, showing that the trained transformer provides more robust and reliable in-context predictions and coefficient estimates than the $\textsf{2SLS}$ method, in the presence of endogeneity.
Comment: 30 pages
Published: 2024
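
For orientation, the two-stage least squares (2SLS) baseline named in the abstract above can be sketched in a few lines. This is a minimal sketch, assuming a single regressor, a single instrument, and no intercept; the helper names `dot` and `two_stage_least_squares` and the toy data are hypothetical and are not taken from the paper.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


def two_stage_least_squares(z, x, y):
    """Scalar 2SLS estimate of beta in y = beta * x + noise, using instrument z.

    Hypothetical helper for illustration only: one regressor, one
    instrument, no intercept.
    """
    # Stage 1: regress x on z and keep the fitted values.
    gamma = dot(z, x) / dot(z, z)
    x_hat = [gamma * zi for zi in z]
    # Stage 2: regress y on the fitted values from stage 1.
    return dot(x_hat, y) / dot(x_hat, x_hat)


# Toy data (assumed): u is orthogonal to z but enters both x and y,
# so x is endogenous and plain least squares is biased.
z = [1.0, 2.0, 3.0, 4.0]
u = [1.0, -1.0, -1.0, 1.0]
x = [2.0 * zi + ui for zi, ui in zip(z, u)]
y = [3.0 * xi + ui for xi, ui in zip(x, u)]

beta_2sls = two_stage_least_squares(z, x, y)
beta_ols = dot(x, y) / dot(x, x)
```

On this toy data the 2SLS estimate recovers the true coefficient 3, while the ordinary least squares estimate is pulled upward by the shared term `u`.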

3. A Huber Loss Minimization Approach to Mean Estimation under User-level Differential Privacy

Author: Zhao, Puning, Lai, Lifeng, Shen, Li, Li, Qingming, Wu, Jiafei, and Liu, Zhe
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security
Abstract: Privacy protection of users' entire contribution of samples is important in distributed systems. The most effective approach is the two-stage scheme, which finds a small interval first and then gets a refined estimate by clipping samples into the interval. However, the clipping operation induces bias, which is serious if the sample distribution is heavy-tailed. Besides, users with large local sample sizes can make the sensitivity much larger, thus the method is not suitable for imbalanced users. Motivated by these challenges, we propose a Huber loss minimization approach to mean estimation under user-level differential privacy. The connecting points of Huber loss can be adaptively adjusted to deal with imbalanced users. Moreover, it avoids the clipping operation, thus significantly reducing the bias compared with the two-stage approach. We provide a theoretical analysis of our approach, which gives the noise strength needed for privacy protection, as well as the bound of mean squared error. The result shows that the new method is much less sensitive to the imbalance of user-wise sample sizes and the tail of sample distributions. Finally, we perform numerical experiments to validate our theoretical analysis.
Published: 2024
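
As background for the abstract above, the following is a minimal sketch of plain, non-private Huber-loss mean estimation. It omits the user-level privacy noise and the adaptive connecting points the abstract describes; the function name `huber_mean`, the fixed `threshold`, and the bisection solver are assumptions made for illustration.

```python
def huber_mean(samples, threshold, iterations=200):
    """Minimize the sum of Huber losses over mu by bisection.

    The derivative of the Huber loss in the residual r is r clipped to
    [-threshold, threshold], so a minimizer is a root of the clipped
    residual sum, which is non-increasing in mu.
    """
    def clipped_residual_sum(mu):
        return sum(max(-threshold, min(threshold, s - mu)) for s in samples)

    lo, hi = min(samples), max(samples)
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if clipped_residual_sum(mid) > 0:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid  # the root lies at or to the left of mid
    return (lo + hi) / 2.0


# One outlier barely moves the Huber estimate, unlike the sample mean.
data = [1.0, 2.0, 3.0, 100.0]
robust_estimate = huber_mean(data, threshold=1.0)
plain_mean = sum(data) / len(data)
```

With this data the sample mean is 26.5, while the Huber estimate stays near the bulk of the samples.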

4. Robust Risk-Sensitive Reinforcement Learning with Conditional Value-at-Risk

Author: Ni, Xinyi and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Robust Markov Decision Processes (RMDPs) have received significant research interest, offering an alternative to standard Markov Decision Processes (MDPs) that often assume fixed transition probabilities. RMDPs address this by optimizing for the worst-case scenarios within ambiguity sets. While earlier studies on RMDPs have largely centered on risk-neutral reinforcement learning (RL), with the goal of minimizing expected total discounted costs, in this paper, we analyze the robustness of CVaR-based risk-sensitive RL under RMDP. Firstly, we consider predetermined ambiguity sets. Based on the coherency of CVaR, we establish a connection between robustness and risk sensitivity, thus, techniques in risk-sensitive RL can be adopted to solve the proposed problem. Furthermore, motivated by the existence of decision-dependent uncertainty in real-world problems, we study problems with state-action-dependent ambiguity sets. To solve this, we define a new risk measure named NCVaR and build the equivalence of NCVaR optimization and robust CVaR optimization. We further propose value iteration algorithms and validate our approach in simulation experiments.
Published: 2024
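
The risk measure at the center of the abstract above, conditional value-at-risk (CVaR), can be illustrated on a finite sample of costs. This is a minimal static sketch, assuming the convention that larger costs are worse and using the Rockafellar-Uryasev variational formula; it does not implement the robust MDP setting or the NCVaR measure from the paper, and `empirical_cvar` is a hypothetical name.

```python
def empirical_cvar(costs, alpha):
    """Empirical CVaR of costs at level alpha (upper tail).

    Uses CVaR_alpha(X) = min over t of t + E[max(X - t, 0)] / (1 - alpha).
    For an empirical distribution the objective is convex and piecewise
    linear in t with kinks at the sample values, so it suffices to
    evaluate it at each sample.
    """
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    n = len(costs)

    def objective(t):
        return t + sum(max(c - t, 0.0) for c in costs) / (n * (1.0 - alpha))

    return min(objective(t) for t in costs)


costs = [float(c) for c in range(1, 11)]  # costs 1, 2, ..., 10
tail_risk = empirical_cvar(costs, alpha=0.8)
average_cost = empirical_cvar(costs, alpha=0.0)
```

At level 0.8 the value is the average of the two largest costs, and at level 0 it reduces to the ordinary mean.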

5. Camouflage Adversarial Attacks on Multiple Agent Systems

Author: Lu, Ziqing, Liu, Guanlin, Lai, Lifeng, and Xu, Weiyu
Subjects: Computer Science - Multiagent Systems
Abstract: Multi-agent reinforcement learning (MARL) systems based on the Markov decision process (MDP) have emerged in many critical applications. To improve the robustness/defense of MARL systems against adversarial attacks, the study of various adversarial attacks on reinforcement learning systems is very important. Previous works on adversarial attacks considered some possible features to attack in MDP, such as action poisoning attacks, reward poisoning attacks, and state perception attacks. In this paper, we propose a brand-new form of attack called the camouflage attack in MARL systems. In the camouflage attack, the attackers change the appearances of some objects without changing the actual objects themselves, and the camouflaged appearances may look the same to all the targeted recipient (victim) agents. The camouflaged appearances can mislead the recipient agents into misguided actions. We design algorithms that give the optimal camouflage attacks minimizing the rewards of recipient agents. Our numerical and theoretical results show that camouflage attacks can rival the more conventional, but likely more difficult, state perception attacks. We also investigate cost-constrained camouflage attacks and show numerically how cost budgets affect the attack performance.
Comment: arXiv admin note: text overlap with arXiv:2311.00859
Published: 2024

6. Optimal Cost Constrained Adversarial Attacks For Multiple Agent Systems

Author: Lu, Ziqing, Liu, Guanlin, Lai, Lifeng, and Xu, Weiyu
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security, Computer Science - Multiagent Systems
Abstract: Finding optimal adversarial attack strategies is an important topic in reinforcement learning and the Markov decision process. Previous studies usually assume one all-knowing coordinator (attacker) for whom attacking different recipient (victim) agents incurs uniform costs. However, in reality, instead of using one limitless central attacker, the attacks often need to be performed by distributed attack agents. We formulate the problem of performing optimal adversarial agent-to-agent attacks using distributed attack agents, in which we impose distinct cost constraints on each different attacker-victim pair. We propose an optimal method integrating within-step static constrained attack-resource allocation optimization and between-step dynamic programming to achieve the optimal adversarial attack in a multi-agent system. Our numerical results show that the proposed attacks can significantly reduce the rewards received by the attacked agents.
Comment: Submitted to ICASSP 2024
Published: 2023

7. Distributed Dual Coordinate Ascent with Imbalanced Data on a General Tree Network

Author: Cho, Myung, Lai, Lifeng, and Xu, Weiyu
Subjects: Computer Science - Machine Learning, Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Information Theory
Abstract: In this paper, we investigate the impact of imbalanced data on the convergence of distributed dual coordinate ascent in a tree network for solving an empirical loss minimization problem in distributed machine learning. To address this issue, we propose a method called delayed generalized distributed dual coordinate ascent that takes into account the information of the imbalanced data, and provide the analysis of the proposed algorithm. Numerical experiments confirm the effectiveness of our proposed method in improving the convergence speed of distributed dual coordinate ascent in a tree network.
Comment: To be published in IEEE 2023 Workshop on Machine Learning for Signal Processing (MLSP)
Published: 2023

8. Minimax Optimal Q Learning with Nearest Neighbors

Author: Zhao, Puning and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Analyzing the Markov decision process (MDP) with continuous state spaces is generally challenging. A recent interesting work \cite{shah2018q} solves MDP with bounded continuous state space by a nearest neighbor $Q$ learning approach, which has a sample complexity of $\tilde{O}(\frac{1}{\epsilon^{d+3}(1-\gamma)^{d+7}})$ for $\epsilon$-accurate $Q$ function estimation with discount factor $\gamma$. In this paper, we propose two new nearest neighbor $Q$ learning methods, one for the offline setting and the other for the online setting. We show that the sample complexities of these two methods are $\tilde{O}(\frac{1}{\epsilon^{d+2}(1-\gamma)^{d+2}})$ and $\tilde{O}(\frac{1}{\epsilon^{d+2}(1-\gamma)^{d+3}})$ for offline and online methods respectively, which significantly improve over existing results and have minimax optimal dependence over $\epsilon$. We achieve such improvement by utilizing the samples more efficiently. In particular, the method in \cite{shah2018q} clears up all samples after each iteration, thus these samples are somewhat wasted. On the other hand, our offline method does not remove any samples, and our online method only removes samples with time earlier than $\beta t$ at time $t$ with $\beta$ being a tunable parameter, thus our methods significantly reduce the loss of information. Apart from the sample complexity, our methods also have additional advantages of better computational complexity, as well as suitability to unbounded state spaces.
Published: 2023

9. Efficient Adversarial Attacks on Online Multi-agent Reinforcement Learning

Author: Liu, Guanlin and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security, Mathematics - Optimization and Control
Abstract: Due to the broad range of applications of multi-agent reinforcement learning (MARL), understanding the effects of adversarial attacks against MARL model is essential for the safe applications of this model. Motivated by this, we investigate the impact of adversarial attacks on MARL. In the considered setup, there is an exogenous attacker who is able to modify the rewards before the agents receive them or manipulate the actions before the environment receives them. The attacker aims to guide each agent into a target policy or maximize the cumulative rewards under some specific reward function chosen by the attacker, while minimizing the amount of manipulation on feedback and action. We first show the limitations of the action poisoning only attacks and the reward poisoning only attacks. We then introduce a mixed attack strategy with both the action poisoning and the reward poisoning. We show that the mixed attack strategy can efficiently attack MARL agents even if the attacker has no prior information about the underlying environment and the agents' algorithms.
Published: 2023

10. Efficient Action Robust Reinforcement Learning with Probabilistic Policy Execution Uncertainty

Author: Liu, Guanlin, Zhou, Zhihan, Liu, Han, and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Mathematics - Optimization and Control
Abstract: Robust reinforcement learning (RL) aims to find a policy that optimizes the worst-case performance in the face of uncertainties. In this paper, we focus on action robust RL with the probabilistic policy execution uncertainty, in which, instead of always carrying out the action specified by the policy, the agent will take the action specified by the policy with probability $1-\rho$ and an alternative adversarial action with probability $\rho$. We establish the existence of an optimal policy on the action robust MDPs with probabilistic policy execution uncertainty and provide the action robust Bellman optimality equation for its solution. Furthermore, we develop Action Robust Reinforcement Learning with Certificates (ARRLC) algorithm that achieves minimax optimal regret and sample complexity. Furthermore, we conduct numerical experiments to validate our approach's robustness, demonstrating that ARRLC outperforms non-robust RL algorithms and converges faster than the robust TD algorithm in the presence of action perturbations.
Published: 2023

11. Fairness-aware Regression Robust to Adversarial Attacks

Author: Jin, Yulu and Lai, Lifeng
Subjects: Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: In this paper, we take a first step towards answering the question of how to design fair machine learning algorithms that are robust to adversarial attacks. Using a minimax framework, we aim to design an adversarially robust fair regression model that achieves optimal performance in the presence of an attacker who is able to add a carefully designed adversarial data point to the dataset or perform a rank-one attack on the dataset. By solving the proposed nonsmooth nonconvex-nonconcave minimax problem, the optimal adversary as well as the robust fairness-aware regression model are obtained. For both synthetic data and real-world datasets, numerical results illustrate that the proposed adversarially robust fair models have better performance on poisoned datasets than other fair machine learning models in both prediction accuracy and group-based fairness measure.
Published: 2022

12. Efficiently Escaping Saddle Points in Bilevel Optimization

Author: Huang, Minhui, Chen, Xuxing, Ji, Kaiyi, Ma, Shiqian, and Lai, Lifeng
Subjects: Computer Science - Machine Learning
Abstract: Bilevel optimization is one of the fundamental problems in machine learning and optimization. Recent theoretical developments in bilevel optimization focus on finding the first-order stationary points for nonconvex-strongly-convex cases. In this paper, we analyze algorithms that can escape saddle points in nonconvex-strongly-convex bilevel optimization. Specifically, we show that the perturbed approximate implicit differentiation (AID) with a warm start strategy finds $\epsilon$-approximate local minimum of bilevel optimization in $\tilde{O}(\epsilon^{-2})$ iterations with high probability. Moreover, we propose an inexact NEgative-curvature-Originated-from-Noise Algorithm (iNEON), a pure first-order algorithm that can escape saddle point and find local minimum of stochastic bilevel optimization. As a by-product, we provide the first nonasymptotic analysis of perturbed multi-step gradient descent ascent (GDmax) algorithm that converges to local minimax point for minimax problems.
Published: 2022

13. Efficient Action Poisoning Attacks on Linear Contextual Bandits

Author: Liu, Guanlin and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Contextual bandit algorithms have many applications in a variety of scenarios. In order to develop trustworthy contextual bandit systems, understanding the impacts of various adversarial attacks on contextual bandit algorithms is essential. In this paper, we propose a new class of attacks: action poisoning attacks, where an adversary can change the action signal selected by the agent. We design action poisoning attack schemes against linear contextual bandit algorithms in both white-box and black-box settings. We further analyze the cost of the proposed attack strategies for a very popular and widely used bandit algorithm: LinUCB. We show that, in both white-box and black-box settings, the proposed attack schemes can force the LinUCB agent to pull a target arm very frequently by spending only logarithmic cost.
Published: 2021

14. Provably Efficient Black-Box Action Poisoning Attacks Against Reinforcement Learning

Author: Liu, Guanlin and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Cryptography and Security, Mathematics - Optimization and Control
Abstract: Due to the broad range of applications of reinforcement learning (RL), understanding the effects of adversarial attacks against RL models is essential for the safe application of these models. Prior theoretical works on adversarial attacks against RL mainly focus on either observation poisoning attacks or environment poisoning attacks. In this paper, we introduce a new class of attacks named action poisoning attacks, where an adversary can change the action signal selected by the agent. Compared with existing attack models, the attacker's ability in the proposed action poisoning attack model is more restricted, which brings some design challenges. We study the action poisoning attack in both white-box and black-box settings. We introduce an adaptive attack scheme called LCB-H, which works for most RL agents in the black-box setting. We prove that the LCB-H attack can force any efficient RL agent, whose dynamic regret scales sublinearly with the total number of steps taken, to choose actions according to a policy selected by the attacker very frequently, with only sublinear cost. In addition, we apply the LCB-H attack against a popular model-free RL algorithm: UCB-H. We show that, even in the black-box setting, by spending only logarithmic cost, the proposed LCB-H attack scheme can force the UCB-H agent to choose actions according to the policy selected by the attacker very frequently.
Published: 2021

15. On the Convergence of Projected Alternating Maximization for Equitable and Optimal Transport

Author: Huang, Minhui, Ma, Shiqian, and Lai, Lifeng
Subjects: Mathematics - Optimization and Control, Computer Science - Machine Learning
Abstract: This paper studies the equitable and optimal transport (EOT) problem, which has many applications such as fair division problems and optimal transport with multiple agents. In the discrete distributions case, the EOT problem can be formulated as a linear program (LP). Since this LP is prohibitively large for general LP solvers, Scetbon et al. \cite{scetbon2021equitable} suggest perturbing the problem by adding an entropy regularization. They proposed a projected alternating maximization algorithm (PAM) to solve the dual of the entropy regularized EOT. In this paper, we provide the first convergence analysis of PAM. A novel rounding procedure is proposed to help construct the primal solution for the original EOT problem. We also propose a variant of PAM by incorporating the extrapolation technique that can numerically improve the performance of PAM. Results in this paper may shed light on block coordinate (gradient) descent methods for general optimization problems.
Published: 2021

16. Optimal Accuracy-Privacy Trade-Off of Inference as Service

Author: Jin, Yulu and Lai, Lifeng
Subjects: Information and Computing Sciences, Cybersecurity and Privacy, Bioengineering, Privacy, Convergence, Optimization, Servers, Inference algorithms, Data privacy, Signal processing algorithms, ADMM, inference, privacy, Networking & Telecommunications
Abstract: In this paper, we propose a general framework to provide a desirable trade-off between inference accuracy and privacy protection in the inference as service scenario (IAS). Instead of sending data directly to the server, the user will preprocess the data through a privacy-preserving mapping, which will increase privacy protection but reduce inference accuracy. To properly address the trade-off between privacy protection and inference accuracy, we formulate an optimization problem to find the privacy-preserving mapping. Even though the problem is non-convex in general, we characterize nice structures of the problem and develop an iterative algorithm to find the desired privacy-preserving mapping, with convergence analysis provided under certain assumptions. From numerical examples, we observe that the proposed method has better performance than gradient ascent method in the convergence speed, solution quality and algorithm stability.
Published: 2022

17. Optimal Stochastic Nonconvex Optimization with Bandit Feedback

Author: Zhao, Puning and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In this paper, we analyze the continuous armed bandit problems for nonconvex cost functions under certain smoothness and sublevel set assumptions. We first derive an upper bound on the expected cumulative regret of a simple bin splitting method. We then propose an adaptive bin splitting method, which can significantly improve the performance. Furthermore, a minimax lower bound is derived, which shows that our new adaptive method achieves locally minimax optimal expected cumulative regret.
Published: 2021

18. Projection Robust Wasserstein Barycenters

Author: Huang, Minhui, Ma, Shiqian, and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: Collecting and aggregating information from several probability measures or histograms is a fundamental task in machine learning. One of the popular solution methods for this task is to compute the barycenter of the probability measures under the Wasserstein metric. However, approximating the Wasserstein barycenter is numerically challenging because of the curse of dimensionality. This paper proposes the projection robust Wasserstein barycenter (PRWB) that has the potential to mitigate the curse of dimensionality. Since PRWB is numerically very challenging to solve, we further propose a relaxed PRWB (RPRWB) model, which is more tractable. The RPRWB projects the probability measures onto a lower-dimensional subspace that maximizes the Wasserstein barycenter objective. The resulting problem is a max-min problem over the Stiefel manifold. By combining the iterative Bregman projection algorithm and Riemannian optimization, we propose two new algorithms for computing the RPRWB. The complexity of arithmetic operations of the proposed algorithms for obtaining an $\epsilon$-stationary solution is analyzed. We incorporate the RPRWB into a discrete distribution clustering algorithm, and the numerical results on real text datasets confirm that our RPRWB model helps improve the clustering performance significantly.
Published: 2021

19. Resource curse, public crisis, and the road to sustainable development in emerging Asia

Author: Lai, Lifeng and Li, Xin
Published: 2024
Full Text: View/download PDF

20. A Riemannian Block Coordinate Descent Method for Computing the Projection Robust Wasserstein Distance

Author: Huang, Minhui, Ma, Shiqian, and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: The Wasserstein distance has become increasingly important in machine learning and deep learning. Despite its popularity, the Wasserstein distance is hard to approximate because of the curse of dimensionality. A recently proposed approach to alleviate the curse of dimensionality is to project the sampled data from the high dimensional probability distribution onto a lower-dimensional subspace, and then compute the Wasserstein distance between the projected data. However, this approach requires solving a max-min problem over the Stiefel manifold, which is very challenging in practice. The only existing work that solves this problem directly is the RGAS (Riemannian Gradient Ascent with Sinkhorn Iteration) algorithm, which requires solving an entropy-regularized optimal transport problem in each iteration, and thus can be costly for large-scale problems. In this paper, we propose a Riemannian block coordinate descent (RBCD) method to solve this problem, which is based on a novel reformulation of the regularized max-min problem over the Stiefel manifold. We show that the complexity of arithmetic operations for RBCD to obtain an $\epsilon$-stationary point is $O(\epsilon^{-3})$. This significantly improves the corresponding complexity of RGAS, which is $O(\epsilon^{-12})$. Moreover, our RBCD has very low per-iteration complexity, and hence is suitable for large-scale problems. Numerical results on both synthetic and real datasets demonstrate that our method is more efficient than existing methods, especially when the number of sampled data is very large.
Published: 2020

21. On the Adversarial Robustness of LASSO Based Feature Selection

Author: Li, Fuwei, Lai, Lifeng, and Cui, Shuguang
Subjects: Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: In this paper, we investigate the adversarial robustness of feature selection based on the $\ell_1$ regularized linear regression model, namely LASSO. In the considered model, there is a malicious adversary who can observe the whole dataset, and then will carefully modify the response values or the feature matrix in order to manipulate the selected features. We formulate the modification strategy of the adversary as a bi-level optimization problem. Due to the difficulty of the non-differentiability of the $\ell_1$ norm at the zero point, we reformulate the $\ell_1$ norm regularizer as linear inequality constraints. We employ the interior-point method to solve this reformulated LASSO problem and obtain the gradient information. Then we use the projected gradient descent method to design the modification strategy. In addition, we demonstrate that this method can be extended to other $\ell_1$ based feature selection methods, such as group LASSO and sparse group LASSO. Numerical examples with synthetic and real data illustrate that our method is efficient and effective.
Published: 2020
Full Text: View/download PDF

22. Analysis of KNN Density Estimation

Author: Zhao, Puning and Lai, Lifeng
Subjects: Statistics - Machine Learning, Computer Science - Machine Learning
Abstract: We analyze the $\ell_1$ and $\ell_\infty$ convergence rates of k nearest neighbor density estimation method. Our analysis includes two different cases depending on whether the support set is bounded or not. In the first case, the probability density function has a bounded support and is bounded away from zero. We show that kNN density estimation is minimax optimal under both $\ell_1$ and $\ell_\infty$ criteria, if the support set is known. If the support set is unknown, then the convergence rate of $\ell_1$ error is not affected, while $\ell_\infty$ error does not converge. In the second case, the probability density function can approach zero and is smooth everywhere. Moreover, the Hessian is assumed to decay with the density values. For this case, our result shows that the $\ell_\infty$ error of kNN density estimation is nearly minimax optimal. The $\ell_1$ error does not reach the minimax lower bound, but is better than kernel density estimation.
Published: 2020
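
The estimator analyzed in the abstract above can be illustrated in one dimension. This is a minimal sketch, assuming the common textbook form k / (n * volume of the k-th nearest neighbor ball), where the ball in one dimension is an interval of length 2 * r_k; the paper may use a different normalization, and `knn_density_1d` is a hypothetical name.

```python
def knn_density_1d(samples, x, k):
    """kNN density estimate at point x from one-dimensional samples.

    Assumed form: k / (n * 2 * r_k), where r_k is the distance from x to
    its k-th nearest sample and 2 * r_k is the length of the interval
    that just contains those k samples.
    """
    n = len(samples)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(samples)")
    distances = sorted(abs(s - x) for s in samples)
    r_k = distances[k - 1]
    if r_k == 0:
        raise ValueError("k-th neighbor distance is zero; choose a larger k")
    return k / (n * 2.0 * r_k)


# Five evenly spaced samples: the third nearest neighbor of 2.0 is at distance 1.
estimate = knn_density_1d([0.0, 1.0, 2.0, 3.0, 4.0], x=2.0, k=3)
```

Here the estimate is 3 / (5 * 2 * 1) = 0.3.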

23. Robust Low-rank Matrix Completion via an Alternating Manifold Proximal Gradient Continuation Method

Author: Huang, Minhui, Ma, Shiqian, and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Robust low-rank matrix completion (RMC), or robust principal component analysis with partially observed data, has been studied extensively for computer vision, signal processing and machine learning applications. This problem aims to decompose a partially observed matrix into the superposition of a low-rank matrix and a sparse matrix, where the sparse matrix captures the grossly corrupted entries of the matrix. A widely used approach to tackle RMC is to consider a convex formulation, which minimizes the nuclear norm of the low-rank matrix (to promote low-rankness) and the l1 norm of the sparse matrix (to promote sparsity). In this paper, motivated by some recent works on low-rank matrix completion and Riemannian optimization, we formulate this problem as a nonsmooth Riemannian optimization problem over Grassmann manifold. This new formulation is scalable because the low-rank matrix is factorized to the multiplication of two much smaller matrices. We then propose an alternating manifold proximal gradient continuation (AManPGC) method to solve the proposed new formulation. The convergence rate of the proposed algorithm is rigorously analyzed. Numerical results on both synthetic data and real data on background extraction from surveillance videos are reported to demonstrate the advantages of the proposed new formulation and algorithm over several popular existing approaches.
Published: 2020
Full Text: View/download PDF

24. Optimal Feature Manipulation Attacks Against Linear Regression

Author: Li, Fuwei, Lai, Lifeng, and Cui, Shuguang
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Cryptography and Security, Computer Science - Machine Learning
Abstract: In this paper, we investigate how to manipulate the coefficients obtained via linear regression by adding carefully designed poisoning data points to the dataset or modifying the original data points. Given the energy budget, we first provide the closed-form solution of the optimal poisoning data point when our target is modifying one designated regression coefficient. We then extend the analysis to the more challenging scenario where the attacker aims to change one particular regression coefficient while changing the others as little as possible. For this scenario, we introduce a semidefinite relaxation method to design the best attack scheme. Finally, we study a more powerful adversary who can perform a rank-one modification on the feature matrix. We propose an alternating optimization method to find the optimal rank-one modification matrix. Numerical examples are provided to illustrate the analytical results obtained in this paper.
Published: 2020
Full Text: View/download PDF

25. Minimax Optimal Estimation of KL Divergence for Continuous Distributions

Author: Zhao, Puning and Lai, Lifeng
Subjects: Computer Science - Information Theory, Statistics - Machine Learning
Abstract: Estimating Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains. One simple and effective estimator is based on the k nearest neighbor distances between these samples. In this paper, we analyze the convergence rates of the bias and variance of this estimator. Furthermore, we derive a lower bound of the minimax mean square error and show that the kNN method is asymptotically rate optimal.
Published: 2020

26. Action-Manipulation Attacks Against Stochastic Bandits: Attacks and Defense

Author: Liu, Guanlin and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Computer Science - Cryptography and Security, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Due to the broad range of applications of the stochastic multi-armed bandit model, understanding the effects of adversarial attacks and designing bandit algorithms robust to attacks are essential for the safe application of this model. In this paper, we introduce a new class of attack named action-manipulation attack. In this attack, an adversary can change the action signal selected by the user. We show that without knowledge of mean rewards of arms, our proposed attack can manipulate the Upper Confidence Bound (UCB) algorithm, a widely used bandit algorithm, into pulling a target arm very frequently by spending only logarithmic cost. To defend against this class of attacks, we introduce a novel algorithm that is robust to action-manipulation attacks when an upper bound for the total attack cost is given. We prove that our algorithm has a pseudo-regret upper bounded by $\mathcal{O}(\max\{\log T,A\})$, where $T$ is the total number of rounds and $A$ is the upper bound of the total attack cost., Comment: 13 pages, 7 figures, submitted to IEEE Transactions on Signal Processing
Published: 2020
Full Text: View/download PDF

27. On the Adversarial Robustness of LASSO Based Feature Selection

Author: Li, Fuwei, Lai, Lifeng, and Cui, Shuguang; Series Editor: Shen, Xuemin Sherman
Published: 2022
Full Text: View/download PDF

28. On the Adversarial Robustness of Subspace Learning

Author: Li, Fuwei, Lai, Lifeng, and Cui, Shuguang; Series Editor: Shen, Xuemin Sherman
Published: 2022
Full Text: View/download PDF

29. Optimal Feature Manipulation Attacks Against Linear Regression

Author: Li, Fuwei, Lai, Lifeng, and Cui, Shuguang; Series Editor: Shen, Xuemin Sherman
Published: 2022
Full Text: View/download PDF

30. Introduction

Author: Li, Fuwei, Lai, Lifeng, and Cui, Shuguang; Series Editor: Shen, Xuemin Sherman
Published: 2022
Full Text: View/download PDF

31. On the Adversarial Robustness of Hypothesis Testing

Author: Jin, Yulu and Lai, Lifeng
Subjects: Error probability, Testing, Robustness, Inference algorithms, Neural networks, Measurement, Random variables, Minimax problem, hypothesis testing, adversarial robustness, Networking & Telecommunications
Abstract: In this paper, we investigate the adversarial robustness of hypothesis testing rules. In the considered model, after a sample is generated, it will be modified by an adversary before being observed by the decision maker. The decision maker needs to decide the underlying hypothesis that generates the sample from the adversarially-modified data. We formulate this problem as a minimax hypothesis testing problem, in which the goal of the adversary is to design an attack strategy that maximizes the error probability while the decision maker aims to design decision rules that minimize the error probability. We consider both the hypothesis-aware case, in which the attacker knows the true underlying hypothesis, and the hypothesis-unaware case, in which the attacker does not know the true underlying hypothesis. We solve this minimax problem and characterize the corresponding optimal strategies for both cases.
Published: 2021

32. Minimax Rate Optimal Adaptive Nearest Neighbor Classification and Regression

Author: Zhao, Puning and Lai, Lifeng
Subjects: Mathematics - Statistics Theory, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: The k nearest neighbor (kNN) method is a simple and popular statistical method for classification and regression. For both classification and regression problems, existing works have shown that, if the distribution of the feature vector has bounded support and the probability density function is bounded away from zero in its support, the convergence rate of the standard kNN method, in which k is the same for all test samples, is minimax optimal. On the contrary, if the distribution has unbounded support, we show that there is a gap between the convergence rate achieved by the standard kNN method and the minimax bound. To close this gap, we propose an adaptive kNN method, in which a different k is selected for different samples. Our selection rule does not require precise knowledge of the underlying distribution of features. The newly proposed method significantly outperforms the standard one. We characterize the convergence rate of the proposed adaptive method, and show that it matches the minimax lower bound.
Published: 2019

33. On the Adversarial Robustness of Subspace Learning

Author: Li, Fuwei, Lai, Lifeng, and Cui, Shuguang
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Cryptography and Security, Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In this paper, we study the adversarial robustness of subspace learning problems. Different from the assumptions made in existing work on robust subspace learning where data samples are contaminated by gross sparse outliers or small dense noises, we consider a more powerful adversary who can first observe the data matrix and then intentionally modify the whole data matrix. We first characterize the optimal rank-one attack strategy that maximizes the subspace distance between the subspace learned from the original data matrix and that learned from the modified data matrix. We then generalize the study to the scenario without the rank constraint and characterize the corresponding optimal attack strategy. Our analysis shows that the optimal strategies depend on the singular values of the original data matrix and the adversary's energy budget. Finally, we provide numerical experiments and practical applications to demonstrate the efficiency of the attack strategies.
Published: 2019
Full Text: View/download PDF

34. On the Adversarial Robustness of Multivariate Robust Estimation

Author: Bayraktar, Erhan and Lai, Lifeng
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning, Mathematics - Statistics Theory
Abstract: In this paper, we investigate the adversarial robustness of multivariate $M$-Estimators. In the considered model, after observing the whole dataset, an adversary can modify all data points with the goal of maximizing inference errors. We use adversarial influence function (AIF) to measure the asymptotic rate at which the adversary can change the inference result. We first characterize the adversary's optimal modification strategy and its corresponding AIF. From the defender's perspective, we would like to design an estimator that has a small AIF. For the case of joint location and scale estimation problem, we characterize the optimal $M$-estimator that has the smallest AIF. We further identify a tradeoff between robustness against adversarial modifications and robustness against outliers, and derive the optimal $M$-estimator that achieves the best tradeoff.
Published: 2019

35. Towards Ultra-Reliable Low-Latency Communications: Typical Scenarios, Possible Solutions, and Open Issues

Author: Feng, Daquan, She, Changyang, Ying, Kai, Lai, Lifeng, Hou, Zhanwei, Quek, Tony Q. S., Li, Yonghui, and Vucetic, Branka
Subjects: Electrical Engineering and Systems Science - Signal Processing, Computer Science - Information Theory
Abstract: Ultra-reliable low-latency communications (URLLC) has been considered as one of the three new application scenarios in the 5th Generation (5G) New Radio (NR), where the physical layer design aspects have been specified. With the 5G NR, we can guarantee the reliability and latency in radio access networks. However, for communication scenarios where the transmission involves both radio access and wide area core networks, the delay in radio access networks only contributes to part of the end-to-end (E2E) delay. In this paper, we outline the delay components and packet loss probabilities in typical communication scenarios of URLLC, and formulate the constraints on E2E delay and overall packet loss probability. Then, we summarize possible solutions in the physical layer, the link layer, the network layer, and the cross-layer design, respectively. Finally, we discuss the open issues in prediction and communication co-design for URLLC in wide area large scale networks., Comment: 8 pages, 7 figures. Accepted by IEEE Vehicular Technology Magazine
Published: 2019

36. Summary and Extensions

Author: Li, Fuwei, Lai, Lifeng, and Cui, Shuguang; Series Editor: Shen, Xuemin Sherman
Published: 2022
Full Text: View/download PDF

37. How did Donald Trump Surprisingly Win the 2016 United States Presidential Election? An Information-Theoretic Perspective (Clean Sensing for Big Data Analytics: Optimal Strategies, Estimation Error Bounds Tighter than the Cramér-Rao Bound)

Author: Xu, Weiyu, Lai, Lifeng, and Khajehnejad, Amin
Subjects: Computer Science - Information Theory, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Signal Processing, Mathematics - Optimization and Control, Statistics - Machine Learning
Abstract: Donald Trump was lagging behind in nearly all opinion polls leading up to the 2016 US presidential election, but he surprisingly won the election. This raises the following important questions: 1) why were most opinion polls inaccurate in 2016? and 2) how can the accuracy of opinion polls be improved? In this paper, we study the inaccuracies of opinion polls in the 2016 election through the lens of information theory. We first propose a general framework of parameter estimation, called clean sensing (polling), which performs optimal parameter estimation with sensing cost constraints, from heterogeneous and potentially distorted data sources. We then cast the opinion polling as a problem of parameter estimation from potentially distorted heterogeneous data sources, and derive the optimal polling strategy using heterogeneous and possibly distorted data under cost constraints. Our results show that a larger number of data samples does not necessarily lead to better polling accuracy, which gives a possible explanation of the inaccuracies of opinion polls in 2016. The optimal sensing strategy should instead optimally allocate sensing resources over heterogeneous data sources according to several factors including data quality, and, moreover, for a particular data source, it should strike an optimal balance between the quality of data samples and the quantity of data samples. As a byproduct of this research, in a general setting, we derive a group of new lower bounds on the mean-squared errors of general unbiased and biased parameter estimators. These new lower bounds can be tighter than the classical Cramér-Rao bound (CRB) and Chapman-Robbins bound. Our derivations are via studying the Lagrange dual problems of certain convex programs. The classical Cramér-Rao bound and Chapman-Robbins bound follow naturally from our results for special cases of these convex programs., Comment: 45 pages
Published: 2018

38. Quick Best Action Identification in Linear Bandit Problems

Author: Geng, Jun and Lai, Lifeng
Subjects: Computer Science - Machine Learning, Statistics - Machine Learning
Abstract: In this paper, we consider a best action identification problem in the stochastic linear bandit setup with a fixed confidence constraint. In the considered best action identification problem, instead of minimizing the accumulative regret as done in existing works, the learner aims to obtain an accurate estimate of the underlying parameter based on his action and reward sequences. To improve the estimation efficiency, the learner is allowed to select his action based on his historical information; hence the whole procedure is designed in a sequential adaptive manner. We first show that the existing algorithms designed to minimize the accumulative regret are not consistent estimators and hence are not good policies for our problem. We then characterize a lower bound on the estimation error for any policy. We further design a simple policy and show that the estimation error of the designed policy achieves the same scaling order as that of the derived lower bound., Comment: 8 pages, 2 figures. Submitted to Asilomar 2018
Published: 2018

39. Analysis of KNN Information Estimators for Smooth Distributions

Author: Zhao, Puning and Lai, Lifeng
Subjects: Statistics - Machine Learning, Computer Science - Information Theory, Computer Science - Machine Learning, Mathematics - Statistics Theory
Abstract: KSG mutual information estimator, which is based on the distances of each sample to its k-th nearest neighbor, is widely used to estimate mutual information between two continuous random variables. Existing work has analyzed the convergence rate of this estimator for random variables whose densities are bounded away from zero in its support. In practice, however, KSG estimator also performs well for a much broader class of distributions, including not only those with bounded support and densities bounded away from zero, but also those with bounded support but densities approaching zero, and those with unbounded support. In this paper, we analyze the convergence rate of the error of KSG estimator for smooth distributions, whose support of density can be both bounded and unbounded. As KSG mutual information estimator can be viewed as an adaptive recombination of KL entropy estimators, in our analysis, we also provide convergence analysis of KL entropy estimator for a broad class of distributions.
Published: 2018

40. On the adversarial robustness of robust estimators

Author: Lai, Lifeng and Bayraktar, Erhan
Subjects: Mathematics - Statistics Theory
Abstract: Motivated by recent data analytics applications, we study the adversarial robustness of robust estimators. Instead of assuming that only a fraction of the data points are outliers as considered in the classic robust estimation setup, in this paper, we consider an adversarial setup in which an attacker can observe the whole dataset and can modify all data samples in an adversarial manner so as to maximize the estimation error caused by his attack. We characterize the attacker's optimal attack strategy, and further introduce adversarial influence function (AIF) to quantify an estimator's sensitivity to such adversarial attacks. We provide an approach to characterize AIF for any given robust estimator, and then design optimal estimator that minimizes AIF, which implies it is least sensitive to adversarial attacks and hence is most robust against adversarial attacks. From this characterization, we identify a tradeoff between AIF (i.e., robustness against adversarial attack) and influence function, a quantity used in classic robust estimators to measure robustness against outliers, and design estimators that strike a desirable tradeoff between these two quantities.
Published: 2018

41. Multi-Chart Detection Procedure for Bayesian Quickest Change-Point Detection with Unknown Post-Change Parameters

Author: Geng, Jun, Bayraktar, Erhan, and Lai, Lifeng
Subjects: Computer Science - Information Theory
Abstract: In this paper, the problem of quickly detecting an abrupt change in a stochastic process under a Bayesian framework is considered. Different from the classic Bayesian quickest change-point detection problem, this paper considers the case where there is uncertainty about the post-change distribution. Specifically, the observer only knows that the post-change distribution belongs to a parametric distribution family but he does not know the true value of the post-change parameter. In this scenario, we propose two multi-chart detection procedures, termed the M-SR procedure and the modified M-SR procedure, respectively, and show that these two procedures are asymptotically optimal when the post-change parameter belongs to a finite set and are asymptotically $\epsilon$-optimal when the post-change parameter belongs to a compact set with finite measure. Both algorithms can be calculated efficiently as their detection statistics can be updated recursively. We then extend the study to consider the multi-source monitoring problem with unknown post-change parameters. When those monitored sources are mutually independent, we propose a window-based modified M-SR detection procedure and show that the proposed detection method is first-order asymptotically optimal when post-change parameters belong to finite sets. We show that both computation and space complexities of the proposed algorithm increase only linearly with respect to the number of sources., Comment: 30 pages, 5 figures
Published: 2017

42. Distributed Dual Coordinate Ascent in General Tree Networks and Communication Network Effect on Synchronous Machine Learning

Author: Cho, Myung, Lai, Lifeng, and Xu, Weiyu
Subjects: Computer Science - Distributed, Parallel, and Cluster Computing, Computer Science - Information Theory, Computer Science - Machine Learning
Abstract: Due to the large size of data and limited data storage volume of a single computer or a single server, data are often stored in a distributed manner. Thus, performing large-scale machine learning operations with the distributed datasets through communication networks is often required. In this paper, we study the convergence rate of the distributed dual coordinate ascent for distributed machine learning problems in a general tree-structured network. Since a tree network model can be understood as the generalization of a star network model, our algorithm can be thought of as the generalization of the distributed dual coordinate ascent in a star network model. We provide the convergence rate of the distributed dual coordinate ascent over a general tree network in a recursive manner and analyze the network effect on the convergence rate. Secondly, by considering network communication delays, we optimize the distributed dual coordinate ascent algorithm to maximize its convergence speed. From our analytical result, we can choose the optimal number of local iterations depending on the communication delay severity to achieve the fastest convergence speed. In numerical experiments, we consider machine learning scenarios over communication networks, where local workers cannot directly reach a central node due to constraints in communication, and demonstrate the usability of our distributed dual coordinate ascent algorithm in tree networks. Additionally, we show that adapting the number of local and global iterations to network communication delays in the distributed dual coordinate ascent algorithm can improve its convergence speed., Comment: 34 pages, 18 figures
Published: 2017

43. Tree Network Design for Faster Distributed Machine Learning Process with Distributed Dual Coordinate Ascent

Author: Cho, Myung, primary, Chikkam, Meghana, additional, Xu, Weiyu, additional, and Lai, Lifeng, additional
Published: 2024
Full Text: View/download PDF

44. Compressed Hypothesis Testing: To Mix or Not to Mix?

Author: Cho, Myung, Xu, Weiyu, and Lai, Lifeng
Subjects: Computer Science - Information Theory
Abstract: In this paper, we study the problem of determining $k$ anomalous random variables that have different probability distributions from the rest $(n-k)$ random variables. Instead of sampling each individual random variable separately as in the conventional hypothesis testing, we propose to perform hypothesis testing using mixed observations that are functions of multiple random variables. We characterize the error exponents for correctly identifying the $k$ anomalous random variables under fixed time-invariant mixed observations, random time-varying mixed observations, and deterministic time-varying mixed observations. For our error exponent characterization, we introduce the notions of inner conditional Chernoff information and outer conditional Chernoff information. It is demonstrated that mixed observations can strictly improve the error exponents of hypothesis testing, over separate observations of individual random variables. We further characterize the optimal sensing vector maximizing the error exponents, which leads to explicit constructions of the optimal mixed observations in special cases of hypothesis testing for Gaussian random variables. These results show that mixed observations of random variables can reduce the number of required samples in hypothesis testing applications. In order to solve large-scale hypothesis testing problems, we also propose efficient algorithms - LASSO based and message passing based hypothesis testing algorithms., Comment: compressed sensing, hypothesis testing, Chernoff information, anomaly detection, anomalous random variable, quickest detection. arXiv admin note: substantial text overlap with arXiv:1208.2311
Published: 2016

45. Degraded Broadcast Channel with Secrecy Outside a Bounded Range

Author: Zou, Shaofeng, Liang, Yingbin, Lai, Lifeng, Poor, H. Vincent, and Shamai, Shlomo
Subjects: Computer Science - Information Theory
Abstract: The $K$-receiver degraded broadcast channel with secrecy outside a bounded range is studied, in which a transmitter sends $K$ messages to $K$ receivers, and the channel quality gradually degrades from receiver $K$ to receiver 1. Each receiver $k$ is required to decode messages $W_1,\ldots,W_k$, for $1\leq k\leq K$, and to be kept ignorant of $W_{k+2},\ldots,W_K$, for $k=1,\ldots, K-2$. Thus, each message $W_k$ is kept secure from receivers with at least two-level worse channel quality, i.e., receivers 1, $\ldots$, $k-2$. The secrecy capacity region is fully characterized. The achievable scheme designates one superposition layer to each message with binning employed for each layer. Joint embedded coding and binning are employed to protect all upper-layer messages from lower-layer receivers. Furthermore, the scheme allows adjacent layers to share rates so that part of the rate of each message can be shared with its immediate upper-layer message to enlarge the rate region. More importantly, an induction approach is developed to perform Fourier-Motzkin elimination of $2K$ variables from the order of $K^2$ bounds to obtain a closed-form achievable rate region. An outer bound is developed that matches the achievable rate region, whose proof involves recursive construction of the rate bounds and exploits the intuition gained from the achievable scheme., Comment: submitted to IEEE Transactions on Information Theory
Published: 2016
Full Text: View/download PDF

46. On Randomized Distributed Coordinate Descent with Quantized Updates

Author: Gamal, Mostafa El and Lai, Lifeng
Subjects: Statistics - Machine Learning, Computer Science - Learning
Abstract: In this paper, we study the randomized distributed coordinate descent algorithm with quantized updates. In the literature, the iteration complexity of the randomized distributed coordinate descent algorithm has been characterized under the assumption that machines can exchange updates with infinite precision. We consider a practical scenario in which the message exchange occurs over channels with finite capacity, and hence the updates have to be quantized. We derive sufficient conditions on the quantization error such that the algorithm with quantized updates still converges. We further verify our theoretical results by running an experiment, where we apply the algorithm with quantized updates to solve a linear regression problem., Comment: Accepted at CISS 2017
Published: 2016

47. Efficient Byzantine Sequential Change Detection

Author: Fellouris, Georgios, Bayraktar, Erhan, and Lai, Lifeng
Subjects: Mathematics - Statistics Theory, Computer Science - Information Theory, Statistics - Methodology, 62L10, 60G40
Abstract: In the multisensor sequential change detection problem, a disruption occurs in an environment monitored by multiple sensors. This disruption induces a change in the observations of an unknown subset of sensors. In the Byzantine version of this problem, which is the focus of this work, it is further assumed that the postulated change-point model may be misspecified for an unknown subset of sensors. The problem then is to detect the change quickly and reliably, for any possible subset of affected sensors, even if the misspecified sensors are controlled by an adversary. Given a user-specified upper bound on the number of compromised sensors, we propose and study three families of sequential change-detection rules for this problem. These are designed and evaluated under a generalization of Lorden's criterion, where conditional expected detection delay and expected time to false alarm are both computed in the worst-case scenario for the compromised sensors. The first-order asymptotic performance of these procedures is characterized as the worst-case false alarm rate goes to 0. The insights from these theoretical results are corroborated by a simulation study., Comment: 36 pages, 4 figures
Published: 2016

48. Summary and Extensions

Author: Li, Fuwei, primary, Lai, Lifeng, additional, and Cui, Shuguang, additional
Published: 2022
Full Text: View/download PDF

49. Machine Learning Algorithms

Author: Li, Fuwei, primary, Lai, Lifeng, additional, and Cui, Shuguang, additional
Published: 2022
Full Text: View/download PDF

50. On the Adversarial Robustness of Subspace Learning

Author: Li, Fuwei, primary, Lai, Lifeng, additional, and Cui, Shuguang, additional
Published: 2022
Full Text: View/download PDF
