207 results for "Gu, Quanquan"
Search Results
2. Differentially private federated learning with Laplacian smoothing
- Author
Liang, Zhicong, Wang, Bao, Gu, Quanquan, Osher, Stanley, and Yao, Yuan
- Published
- 2024
- Full Text
- View/download PDF
3. Cholinergic relevant functional reactivity is associated with dopamine responsiveness of tremor in Parkinson’s disease
- Author
Wu, Jingjing, Zhou, Cheng, Guo, Tao, Guan, Xiaojun, Gao, Ting, Bai, Xueqin, Wu, Haoting, Chen, Jingwen, Wen, Jiaqi, Liu, Xiaocao, Gu, Luyan, Song, Zhe, Xuan, Min, Gu, Quanquan, Huang, Peiyu, Pu, Jiali, Zhang, Baorong, Xu, Xiaojun, and Zhang, Minming
- Published
- 2022
- Full Text
- View/download PDF
4. Symptomology following mRNA vaccination against SARS-CoV-2
- Author
Ebinger, Joseph E., Lan, Roy, Sun, Nancy, Wu, Min, Joung, Sandy, Botwin, Gregory J., Botting, Patrick, Al-Amili, Daniah, Aronow, Harriet, Beekley, James, Coleman, Bernice, Contreras, Sandra, Cozen, Wendy, Davis, Jennifer, Debbas, Philip, Diaz, Jacqueline, Driver, Matthew, Fert-Bober, Justyna, Gu, Quanquan, Heath, Mallory, Herrera, Ergueen, Hoang, Amy, Hussain, Shehnaz K., Huynh, Carissa, Kim, Linda, Kittleson, Michelle, Liu, Yunxian, Lloyd, John, Luong, Eric, Malladi, Bhavya, Merchant, Akil, Merin, Noah, Mujukian, Angela, Nguyen, Nathalie, Nguyen, Trevor-Trung, Pozdnyakova, Valeriya, Rashid, Mohamad, Raedschelders, Koen, Reckamp, Karen L., Rhoades, Kylie, Sternbach, Sarah, Vallejo, Rocío, White, Shane, Tompkins, Rose, Wong, Melissa, Arditi, Moshe, Figueiredo, Jane C., Van Eyk, Jennifer E., Miles, Peggy B., Chavira, Cynthia, Shane, Rita, Sobhani, Kimia, Melmed, Gil Y., McGovern, Dermot P.B., Braun, Jonathan G., Cheng, Susan, and Minissian, Margo B.
- Published
- 2021
- Full Text
- View/download PDF
5. Progressive microstructural alterations in subcortical nuclei in Parkinson's disease: A diffusion magnetic resonance imaging study
- Author
Bai, Xueqin, Zhou, Cheng, Guo, Tao, Guan, Xiaojun, Wu, Jingjing, Liu, Xiaocao, Gao, Ting, Gu, Luyan, Xuan, Min, Gu, Quanquan, Huang, Peiyu, Song, Zhe, Yan, Yaping, Pu, Jiali, Zhang, Baorong, Xu, Xiaojun, and Zhang, Minming
- Published
- 2021
- Full Text
- View/download PDF
6. Abnormal white matter tracts of insula in smokers
- Author
Wang, Chao, Wang, Shuyue, Huang, Peiyu, Shen, Zhujing, Qian, Wei, Luo, Xiao, Li, Kaicheng, Zeng, Qingze, Gu, Quanquan, Yu, Hualiang, Yang, Yihong, and Zhang, Minming
- Published
- 2021
- Full Text
- View/download PDF
7. Brain structural correlates of depressive symptoms in Parkinson's disease patients at different disease stage
- Author
Li, Yanxuan, Huang, Peiyu, Guo, Tao, Guan, Xiaojun, Gao, Ting, Sheng, Wenshuang, Zhou, Cheng, Wu, Jingjing, Song, Zhe, Xuan, Min, Gu, Quanquan, Xu, Xiaojun, Yang, Yunjun, and Zhang, Minming
- Published
- 2020
- Full Text
- View/download PDF
8. Increased interregional functional connectivity of anterior insula is associated with improved smoking cessation outcome
- Author
Wang, Chao, Shen, Zhujing, Huang, Peiyu, Qian, Wei, Zhou, Cheng, Li, Kaicheng, Zeng, Qingze, Luo, Xiao, Gu, Quanquan, Yu, Hualiang, Yang, Yihong, and Zhang, Minming
- Published
- 2020
- Full Text
- View/download PDF
9. Gradient descent optimizes over-parameterized deep ReLU networks
- Author
Zou, Difan, Cao, Yuan, Zhou, Dongruo, and Gu, Quanquan
- Published
- 2020
- Full Text
- View/download PDF
10. Iron-related nigral degeneration influences functional topology mediated by striatal dysfunction in Parkinson's disease
- Author
Guan, Xiaojun, Zhang, Yuyao, Wei, Hongjiang, Guo, Tao, Zeng, Qiaoling, Zhou, Cheng, Wang, Jiaqiu, Gao, Ting, Xuan, Min, Gu, Quanquan, Xu, Xiaojun, Huang, Peiyu, Pu, Jiali, Zhang, Baorong, Liu, Chunlei, and Zhang, Minming
- Published
- 2019
- Full Text
- View/download PDF
11. Alteration of Brain Functional Connectivity in Parkinson’s Disease Patients with Dysphagia
- Author
Gao, Jixiang, Guan, Xiaojun, Cen, Zhidong, Chen, You, Ding, Xueping, Lou, Yuting, Wu, Sheng, Wang, Bo, Ouyang, Zhiyuan, Xuan, Min, Gu, Quanquan, Xu, Xiaojun, Huang, Peiyu, Zhang, Minming, and Luo, Wei
- Published
- 2019
- Full Text
- View/download PDF
12. Quantitative susceptibility mapping as a biomarker for evaluating white matter alterations in Parkinson’s disease
- Author
Guan, Xiaojun, Huang, Peiyu, Zeng, Qiaoling, Liu, Chunlei, Wei, Hongjiang, Xuan, Min, Gu, Quanquan, Xu, Xiaojun, Wang, Nian, Yu, Xinfeng, Luo, Xiao, and Zhang, Minming
- Published
- 2019
- Full Text
- View/download PDF
13. Different patterns of gray matter density in early- and middle-late-onset Parkinson’s disease: a voxel-based morphometry study
- Author
Xuan, Min, Guan, Xiaojun, Huang, Peiyu, Shen, Zhujing, Gu, Quanquan, Yu, Xinfeng, Xu, Xiaojun, Luo, Wei, and Zhang, Minming
- Published
- 2019
- Full Text
- View/download PDF
14. Different iron deposition patterns in early- and middle-late-onset Parkinson's disease
- Author
Xuan, Min, Guan, Xiaojun, Gu, Quanquan, Shen, Zhujing, Yu, Xinfeng, Qiu, Tiantian, Luo, Xiao, Song, Ruirui, Jiaerken, Yerfan, Xu, Xiaojun, Huang, Peiyu, Luo, Wei, and Zhang, Minming
- Published
- 2017
- Full Text
- View/download PDF
15. Oscillation-specific nodal alterations in early to middle stages Parkinson’s disease
- Author
Guan, Xiaojun, Guo, Tao, Zeng, Qiaoling, Wang, Jiaqiu, Zhou, Cheng, Liu, Chunlei, Wei, Hongjiang, Zhang, Yuyao, Xuan, Min, Gu, Quanquan, Xu, Xiaojun, Huang, Peiyu, Pu, Jiali, Zhang, Baorong, and Zhang, Min-Ming
- Published
- 2019
- Full Text
- View/download PDF
16. Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey
- Author
Ling, Chen, Zhao, Xujiang, Lu, Jiaying, Deng, Chengyuan, Zheng, Can, Wang, Junxiang, Chowdhury, Tanmoy, Li, Yun, Cui, Hejie, Zhang, Xuchao, Zhao, Tianjiao, Panalkar, Amit, Cheng, Wei, Wang, Haoyu, Liu, Yanchi, Chen, Zhengzhang, Chen, Haifeng, White, Chris, Gu, Quanquan, Pei, Jian, Yang, Carl, and Zhao, Liang
- Subjects
FOS: Computer and information sciences, Artificial Intelligence (cs.AI), Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computation and Language (cs.CL)
- Abstract
Large language models (LLMs) have significantly advanced the field of natural language processing (NLP), providing a highly useful, task-agnostic foundation for a wide range of applications. However, directly applying LLMs to solve sophisticated problems in specific domains faces many hurdles caused by the heterogeneity of domain data, the sophistication of domain knowledge, the uniqueness of domain objectives, and the diversity of the constraints (e.g., various social norms, cultural conformity, religious beliefs, and ethical standards in the domain applications). Domain specialization techniques are key to making large language models disruptive in many applications. To overcome these hurdles, there has been a notable increase in recent years in research and practice on the domain specialization of LLMs. This emerging field of study, with its substantial potential for impact, necessitates a comprehensive and systematic review to better summarize and guide ongoing work in this area. In this article, we present a comprehensive survey of domain specialization techniques for large language models, an emerging direction critical for large language model applications. First, we propose a systematic taxonomy that categorizes LLM domain-specialization techniques based on the level of access to the LLM and summarizes the framework for all the subcategories as well as their relations to and differences from each other. Second, we present an extensive taxonomy of critical application domains that can benefit dramatically from specialized LLMs, discussing their practical significance and open challenges. Last, we offer our insights into the current research status and future trends in this area.
- Published
- 2023
17. Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs
- Author
Ji, Kaixuan, Zhao, Qingyue, He, Jiafan, Zhang, Weitong, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
Recent studies have shown that episodic reinforcement learning (RL) is no harder than bandits when the total reward is bounded by $1$, and proved regret bounds that have a polylogarithmic dependence on the planning horizon $H$. However, it remains an open question whether such results can be carried over to adversarial RL, where the reward is adversarially chosen at each episode. In this paper, we answer this question affirmatively by proposing the first horizon-free policy search algorithm. To tackle the challenges caused by exploration and adversarially chosen reward, our algorithm employs (1) a variance-uncertainty-aware weighted least squares estimator for the transition kernel; and (2) an occupancy measure-based technique for the online search of a \emph{stochastic} policy. We show that our algorithm achieves an $\tilde{O}\big((d+\log (|\mathcal{S}|^2 |\mathcal{A}|))\sqrt{K}\big)$ regret with full-information feedback, where $d$ is the dimension of a known feature mapping linearly parametrizing the unknown transition kernel of the MDP, $K$ is the number of episodes, $|\mathcal{S}|$ and $|\mathcal{A}|$ are the cardinalities of the state and action spaces. We also provide hardness results and regret lower bounds to justify the near optimality of our algorithm and the unavoidability of $\log|\mathcal{S}|$ and $\log|\mathcal{A}|$ in the regret bound., 34 pages
- Published
- 2023
18. Personalized Federated Learning under Mixture of Distributions
- Author
Wu, Yue, Zhang, Shuaicheng, Yu, Wenchao, Liu, Yanchi, Gu, Quanquan, Zhou, Dawei, Chen, Haifeng, and Cheng, Wei
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Machine Learning (cs.LG)
- Abstract
The recent trend towards Personalized Federated Learning (PFL) has garnered significant attention as it allows for the training of models that are tailored to each client while maintaining data privacy. However, current PFL techniques primarily focus on modeling the conditional distribution heterogeneity (i.e. concept shift), which can result in suboptimal performance when the distribution of input data across clients diverges (i.e. covariate shift). Additionally, these techniques often lack the ability to adapt to unseen data, further limiting their effectiveness in real-world scenarios. To address these limitations, we propose a novel approach, FedGMM, which utilizes Gaussian mixture models (GMM) to effectively fit the input data distributions across diverse clients. The model parameters are estimated by maximum likelihood estimation utilizing a federated Expectation-Maximization algorithm, which is solved in closed form and does not assume gradient similarity. Furthermore, FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification. Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection., International Conference on Machine Learning (ICML'23)
- Published
- 2023
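The FedGMM entry above rests on fitting Gaussian mixture models with a closed-form Expectation-Maximization algorithm. As a rough illustration of that building block only (not the paper's federated implementation; the data, initialization, and variable names below are invented for this sketch), a centralized EM loop for a two-component one-dimensional mixture looks like:

```python
import numpy as np

# Synthetic data: two well-separated Gaussian clusters.
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(2.0, 0.5, 200)])

mu = np.array([-1.0, 1.0])     # component means (initial guess)
var = np.array([1.0, 1.0])     # component variances
pi_k = np.array([0.5, 0.5])    # mixture weights

for _ in range(50):
    # E-step: posterior responsibility of each component for each point.
    dens = pi_k * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) \
        / np.sqrt(2 * np.pi * var)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: closed-form weighted updates -- no gradients needed,
    # which is what makes the federated aggregation in FedGMM tractable.
    nk = resp.sum(axis=0)
    mu = (resp * x[:, None]).sum(axis=0) / nk
    var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    pi_k = nk / len(x)
```

In the federated setting described in the abstract, each client would compute its local E-step statistics and the server would aggregate the M-step sums; the sketch above shows only the centralized computation.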
19. Cortical abnormalities in Parkinson’s disease patients and relationship to depression: A surface-based morphometry study
- Author
Huang, Peiyu, Lou, Yuting, Xuan, Min, Gu, Quanquan, Guan, Xiaojun, Xu, Xiaojun, Song, Zhe, Luo, Wei, and Zhang, Minming
- Published
- 2016
- Full Text
- View/download PDF
20. The Benefits of Mixup for Feature Learning
- Author
Zou, Difan, Cao, Yuan, Li, Yuanzhi, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Optimization and Control (math.OC), Computer Science - Artificial Intelligence, Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
Mixup, a simple data augmentation method that randomly mixes two data points via linear interpolation, has been extensively applied in various deep learning applications to gain better generalization. However, the theoretical underpinnings of its efficacy are not yet fully understood. In this paper, we aim to seek a fundamental understanding of the benefits of Mixup. We first show that Mixup using different linear interpolation parameters for features and labels can still achieve similar performance to the standard Mixup. This indicates that the intuitive linearity explanation in Zhang et al., (2018) may not fully explain the success of Mixup. Then we perform a theoretical study of Mixup from the feature learning perspective. We consider a feature-noise data model and show that Mixup training can effectively learn the rare features (appearing in a small fraction of data) from its mixture with the common features (appearing in a large fraction of data). In contrast, standard training can only learn the common features but fails to learn the rare features, thus suffering from bad generalization performance. Moreover, our theoretical analysis also shows that the benefits of Mixup for feature learning are mostly gained in the early training phase, based on which we propose to apply early stopping in Mixup. Experimental results verify our theoretical findings and demonstrate the effectiveness of the early-stopped Mixup training., 72 pages, 4 figures
- Published
- 2023
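The interpolation step this abstract analyzes is simple to write down. A minimal sketch (NumPy, invented toy inputs; `alpha` is the usual Beta-distribution parameter, not a value taken from the paper):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=np.random.default_rng(0)):
    """Mix two (input, one-hot label) pairs by linear interpolation.

    Standard Mixup uses the same lam for features and labels; the
    paper studies, among other things, decoupling the two.
    """
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam

# Toy usage: mix two one-hot-labeled points.
x, y, lam = mixup(np.array([1.0, 0.0]), np.array([1.0, 0.0]),
                  np.array([0.0, 1.0]), np.array([0.0, 1.0]))
```

The early-stopping recommendation in the abstract would apply on top of this: train with mixed pairs only for the early phase, then switch to (or stop at) standard training.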
21. Benign Overfitting for Two-layer ReLU Networks
- Author
Kou, Yiwen, Chen, Zixiang, Chen, Yuanzhou, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
Modern deep learning models with great expressive power can be trained to overfit the training data but still generalize well. This phenomenon is referred to as benign overfitting. Recently, a few studies have attempted to theoretically understand benign overfitting in neural networks. However, these works are either limited to neural networks with smooth activation functions or to the neural tangent kernel regime. How and when benign overfitting can occur in ReLU neural networks remains an open problem. In this work, we seek to answer this question by establishing algorithm-dependent risk bounds for learning two-layer ReLU convolutional neural networks with label-flipping noise. We show that, under mild conditions, the neural network trained by gradient descent can achieve near-zero training loss and Bayes optimal test risk. Our result also reveals a sharp transition between benign and harmful overfitting under different conditions on data distribution in terms of test risk. Experiments on synthetic data back up our theory., 54 pages, 2 figures, 2 tables
- Published
- 2023
22. Finite-Sample Analysis of Learning High-Dimensional Single ReLU Neuron
- Author
Wu, Jingfeng, Zou, Difan, Chen, Zixiang, Braverman, Vladimir, Gu, Quanquan, and Kakade, Sham M.
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
This paper considers the problem of learning a single ReLU neuron with squared loss (a.k.a., ReLU regression) in the overparameterized regime, where the input dimension can exceed the number of samples. We analyze a Perceptron-type algorithm called GLM-tron (Kakade et al., 2011) and provide its dimension-free risk upper bounds for high-dimensional ReLU regression in both well-specified and misspecified settings. Our risk bounds recover several existing results as special cases. Moreover, in the well-specified setting, we provide an instance-wise matching risk lower bound for GLM-tron. Our upper and lower risk bounds provide a sharp characterization of the high-dimensional ReLU regression problems that can be learned via GLM-tron. On the other hand, we provide some negative results for stochastic gradient descent (SGD) for ReLU regression with symmetric Bernoulli data: if the model is well-specified, the excess risk of SGD is provably no better than that of GLM-tron ignoring constant factors, for each problem instance; and in the noiseless case, GLM-tron can achieve a small risk while SGD unavoidably suffers from a constant risk in expectation. These results together suggest that GLM-tron might be preferable to SGD for high-dimensional ReLU regression., ICML 2023 camera ready
- Published
- 2023
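The GLM-tron iteration analyzed in this abstract is a plain Perceptron-style full-batch update, w ← w + (1/n) Σᵢ (yᵢ − relu(w·xᵢ)) xᵢ. A self-contained toy run (synthetic, well-specified, noiseless data; the sample size, dimension, and iteration count are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 200, 5
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs
w_star = rng.standard_normal(d)
y = np.maximum(X @ w_star, 0.0)                 # realizable ReLU labels

w = np.zeros(d)
for _ in range(500):
    preds = np.maximum(X @ w, 0.0)
    w = w + (X.T @ (y - preds)) / n             # GLM-tron step (no step size)

mse = np.mean((np.maximum(X @ w, 0.0) - y) ** 2)
```

Note the absence of a learning rate: GLM-tron uses the raw averaged residual update, which is one reason its analysis differs from that of SGD in the abstract's comparison.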
23. Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency
- Author
Zhao, Heyang, He, Jiafan, Zhou, Dongruo, Zhang, Tong, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
Recently, several studies (Zhou et al., 2021a; Zhang et al., 2021b; Kim et al., 2021; Zhou and Gu, 2022) have provided variance-dependent regret bounds for linear contextual bandits, which interpolates the regret for the worst-case regime and the deterministic reward regime. However, these algorithms are either computationally intractable or unable to handle unknown variance of the noise. In this paper, we present a novel solution to this open problem by proposing the first computationally efficient algorithm for linear bandits with heteroscedastic noise. Our algorithm is adaptive to the unknown variance of noise and achieves an $\tilde{O}(d \sqrt{\sum_{k = 1}^K \sigma_k^2} + d)$ regret, where $\sigma_k^2$ is the variance of the noise at the round $k$, $d$ is the dimension of the contexts and $K$ is the total number of rounds. Our results are based on an adaptive variance-aware confidence set enabled by a new Freedman-type concentration inequality for self-normalized martingales and a multi-layer structure to stratify the context vectors into different layers with different uniform upper bounds on the uncertainty. Furthermore, our approach can be extended to linear mixture Markov decision processes (MDPs) in reinforcement learning. We propose a variance-adaptive algorithm for linear mixture MDPs, which achieves a problem-dependent horizon-free regret bound that can gracefully reduce to a nearly constant regret for deterministic MDPs. Unlike existing nearly minimax optimal algorithms for linear mixture MDPs, our algorithm does not require explicit variance estimation of the transitional probabilities or the use of high-order moment estimators to attain horizon-free regret. We believe the techniques developed in this paper can have independent value for general online decision making problems., Comment: 43 pages, 2 tables
- Published
- 2023
24. Genetic impacts on nigral iron deposition in Parkinson's disease: A preliminary quantitative susceptibility mapping study.
- Author
Wu, Jingjing, Guo, Tao, Zhou, Cheng, Bai, Xueqin, Liu, Xiaocao, Gu, Luyan, Xuan, Min, Gu, Quanquan, Huang, Peiyu, Song, Zhe, Zhang, Baorong, Xu, Xiaojun, Zhang, Minming, and Guan, Xiaojun
- Subjects
IRON, PARKINSON'S disease, GENOTYPE-environment interaction, SINGLE nucleotide polymorphisms, SUBSTANTIA nigra, IRON metabolism
- Abstract
Background: Dysfunction of iron metabolism, especially in substantia nigra (SN), is widely acknowledged in Parkinson's disease (PD), but the genetic influence on iron deposition remains largely unknown. Thus, in this study, we aimed to investigate potential genetic impacts on iron deposition in PD. Methods: Seventy‐four subjects, including 38 patients with PD and 36 age‐matched normal controls, participated in this study. Imaging genetic association analysis was used to identify the specific influence of single nucleotide polymorphism (SNP) on iron‐related quantitative traits (QT). Genetic effects on iron deposition at the disease level, SNP level, and their interactive effect were highlighted. Results: Four strong SNP–QT associations were detected: rs602201‐susceptibility of bilateral SN, rs198440‐susceptibility of left SN, and rs7895403‐susceptibility of left caudate head. Detailed analyses showed that: (1) significant iron deposition was exclusively found in bilateral SN in PD; (2) altered polymorphisms of the A allele/A− genotype of rs602201 and G allele/G− genotype of rs198440 and rs7895403 were more frequently observed in PD; (3) for rs602201, among all subjects, A− genotype carriers showed significantly increased iron content than TT genotype in bilateral SN; for rs198440 and rs7895403, G− carriers showed increased iron content than AA genotype in left SN and left caudate head, respectively; and (4) rs602201 exhibited significant SNP‐by‐disease interaction in bilateral SN. Conclusions: This study shows that rs602201 and rs198440 have a stimulative impact on nigral iron deposition in PD, which provides improved understanding of iron‐related pathogenesis in PD, and specifically, that vulnerability to iron deposition in SN is genetic‐based.
- Published
- 2023
- Full Text
- View/download PDF
25. Abnormal amygdala function in Parkinson's disease patients and its relationship to depression
- Author
Huang, Peiyu, Xuan, Min, Gu, Quanquan, Yu, Xinfeng, Xu, Xiaojun, Luo, Wei, and Zhang, Minming
- Published
- 2015
- Full Text
- View/download PDF
26. Longitudinal Alterations of Local Spontaneous Brain Activity in Parkinson’s Disease
- Author
Zeng, Qiaoling, Guan, Xiaojun, Law Yan Lun, Jason C. F., Shen, Zhujing, Guo, Tao, Xuan, Min, Gu, Quanquan, Xu, Xiaojun, Chen, Min, and Zhang, Minming
- Published
- 2017
- Full Text
- View/download PDF
27. A General Framework for Sample-Efficient Function Approximation in Reinforcement Learning
- Author
Chen, Zixiang, Li, Chris Junchi, Yuan, Angela, Gu, Quanquan, and Jordan, Michael I.
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, Statistics - Machine Learning, Machine Learning (stat.ML), Machine Learning (cs.LG)
- Abstract
With the increasing need for handling large state and action spaces, general function approximation has become a key technique in reinforcement learning (RL). In this paper, we propose a general framework that unifies model-based and model-free RL, and an Admissible Bellman Characterization (ABC) class that subsumes nearly all Markov Decision Process (MDP) models in the literature for tractable RL. We propose a novel estimation function with decomposable structural properties for optimization-based exploration and the functional eluder dimension as a complexity measure of the ABC class. Under our framework, a new sample-efficient algorithm namely OPtimization-based ExploRation with Approximation (OPERA) is proposed, achieving regret bounds that match or improve over the best-known results for a variety of MDP models. In particular, for MDPs with low Witness rank, under a slightly stronger assumption, OPERA improves the state-of-the-art sample complexity results by a factor of $dH$. Our framework provides a generic interface to design and analyze new RL models and algorithms.
- Published
- 2022
28. The Power and Limitation of Pretraining-Finetuning for Linear Regression under Covariate Shift
- Author
Wu, Jingfeng, Zou, Difan, Braverman, Vladimir, Gu, Quanquan, and Kakade, Sham M.
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
We study linear regression under covariate shift, where the marginal distribution over the input covariates differs in the source and the target domains, while the conditional distribution of the output given the input covariates is similar across the two domains. We investigate a transfer learning approach with pretraining on the source data and finetuning based on the target data (both conducted by online SGD) for this problem. We establish sharp instance-dependent excess risk upper and lower bounds for this approach. Our bounds suggest that for a large class of linear regression instances, transfer learning with $O(N^2)$ source data (and scarce or no target data) is as effective as supervised learning with $N$ target data. In addition, we show that finetuning, even with only a small amount of target data, could drastically reduce the amount of source data required by pretraining. Our theory sheds light on the effectiveness and limitation of pretraining as well as the benefits of finetuning for tackling covariate shift problems., 32 pages, 1 figure, 1 table
- Published
- 2022
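The pretrain-then-finetune pipeline studied in this entry (online SGD on source data, then on target data) can be sketched in a few lines. The covariate-shift model, step sizes, and sample counts below are invented for illustration and are not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 5
w_star = rng.standard_normal(d)   # shared ground truth across domains

def sgd(w, cov_scale, n, lr):
    """One pass of online SGD on fresh (x, y) pairs from one domain.

    Covariate shift is modeled crudely by scaling the input
    distribution; the conditional y | x is the same in both domains.
    """
    for _ in range(n):
        x = cov_scale * rng.standard_normal(d)
        y = x @ w_star + 0.1 * rng.standard_normal()
        w = w - lr * (w @ x - y) * x
    return w

w = np.zeros(d)
w = sgd(w, cov_scale=1.0, n=2000, lr=0.01)    # pretrain on source covariates
w = sgd(w, cov_scale=2.0, n=200,  lr=0.002)   # finetune on shifted target
err = np.linalg.norm(w - w_star)
```

The paper's point, in these terms, is that a large source pass can substitute for target data, and that even a short finetuning pass sharply reduces how much source data the pretraining stage needs.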
29. On the Convergence of Certified Robust Training with Interval Bound Propagation
- Author
Wang, Yihan, Shi, Zhouxing, Gu, Quanquan, and Hsieh, Cho-Jui
- Subjects
FOS: Computer and information sciences, Statistics::Machine Learning, Computer Science - Machine Learning, Machine Learning (cs.LG)
- Abstract
Interval Bound Propagation (IBP) is so far the base of state-of-the-art methods for training neural networks with certifiable robustness guarantees when potential adversarial perturbations are present, while the convergence of IBP training remains unknown in the existing literature. In this paper, we present a theoretical analysis on the convergence of IBP training. With an overparameterized assumption, we analyze the convergence of IBP robust training. We show that when using IBP training to train a randomly initialized two-layer ReLU neural network with logistic loss, gradient descent can linearly converge to zero robust training error with high probability, given a sufficiently small perturbation radius and a sufficiently large network width., ICLR 2022
- Published
- 2022
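For readers unfamiliar with IBP itself, the bound propagation this abstract refers to is mechanical: an elementwise input interval passes through an affine layer via the positive and negative parts of the weight matrix, and through ReLU monotonically. A generic sketch (random toy network; this is the standard IBP forward pass, not the paper's training procedure):

```python
import numpy as np

def ibp_linear(l, u, W, b):
    """Propagate the interval [l, u] through x -> W @ x + b."""
    W_pos = np.maximum(W, 0.0)
    W_neg = np.minimum(W, 0.0)
    lower = W_pos @ l + W_neg @ u + b
    upper = W_pos @ u + W_neg @ l + b
    return lower, upper

def ibp_relu(l, u):
    # ReLU is monotone, so it maps intervals endpoint-wise.
    return np.maximum(l, 0.0), np.maximum(u, 0.0)

# Bound a tiny two-layer net over the L_inf ball of radius eps around x.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), np.zeros(8)
W2, b2 = rng.standard_normal((2, 8)), np.zeros(2)
x, eps = rng.standard_normal(4), 0.1

l, u = ibp_linear(x - eps, x + eps, W1, b1)
l, u = ibp_relu(l, u)
l, u = ibp_linear(l, u, W2, b2)
```

IBP training, the subject of the paper, minimizes a loss computed from these output bounds rather than from the clean forward pass.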
30. Optimal Online Generalized Linear Regression with Stochastic Noise and Its Application to Heteroscedastic Bandits
- Author
Zhao, Heyang, Zhou, Dongruo, He, Jiafan, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences, Computer Science::Machine Learning, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
We study the problem of online generalized linear regression in the stochastic setting, where the label is generated from a generalized linear model with possibly unbounded additive noise. We provide a sharp analysis of the classical follow-the-regularized-leader (FTRL) algorithm to cope with the label noise. More specifically, for $\sigma$-sub-Gaussian label noise, our analysis provides a regret upper bound of $O(\sigma^2 d \log T) + o(\log T)$, where $d$ is the dimension of the input vector, $T$ is the total number of rounds. We also prove a $\Omega(\sigma^2d\log(T/d))$ lower bound for stochastic online linear regression, which indicates that our upper bound is nearly optimal. In addition, we extend our analysis to a more refined Bernstein noise condition. As an application, we study generalized linear bandits with heteroscedastic noise and propose an algorithm based on FTRL to achieve the first variance-aware regret bound., Comment: 27 pages, 3 figures. In this updated version, we have changed the paper title, added new theoretical results on the FTRL algorithm and mainly focused on stochastic online regression. Refer to arXiv:2202.13603v1 for the previous version, which contains more results on heteroscedastic nonlinear bandits
- Published
- 2022
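For squared loss with L2 regularization, the FTRL algorithm analyzed in this abstract has a closed ridge-regression form at every round: w_t = argmin_w Σ_{s&lt;t} (w·x_s − y_s)² + λ‖w‖². A toy online run (synthetic data; λ, σ, and the dimensions are illustrative, and this is the linear-regression estimator only, not the paper's bandit extension):

```python
import numpy as np

rng = np.random.default_rng(2)
d, T, lam, sigma = 3, 300, 1.0, 0.1
w_star = rng.standard_normal(d)

A = lam * np.eye(d)      # running Gram matrix: lam * I + sum of x x^T
bvec = np.zeros(d)       # running sum of y * x
regret = 0.0
for t in range(T):
    w_t = np.linalg.solve(A, bvec)    # FTRL / ridge prediction this round
    x = rng.standard_normal(d)
    y = x @ w_star + sigma * rng.standard_normal()
    regret += (w_t @ x - y) ** 2 - (w_star @ x - y) ** 2
    A += np.outer(x, x)
    bvec += y * x
avg_regret = regret / T
```

The paper's $O(\sigma^2 d \log T)$ upper bound concerns exactly this cumulative excess squared loss relative to the best fixed predictor.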
31. Benign Overfitting in Two-layer Convolutional Neural Networks
- Author
Cao, Yuan, Chen, Zixiang, Belkin, Mikhail, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Optimization and Control (math.OC), Statistics - Machine Learning, FOS: Mathematics, Machine Learning (stat.ML), Mathematics - Optimization and Control, Machine Learning (cs.LG)
- Abstract
Modern neural networks often have great expressive power and can be trained to overfit the training data, while still achieving a good test performance. This phenomenon is referred to as "benign overfitting". Recently, there emerges a line of works studying "benign overfitting" from the theoretical perspective. However, they are limited to linear models or kernel/random feature models, and there is still a lack of theoretical understanding about when and how benign overfitting occurs in neural networks. In this paper, we study the benign overfitting phenomenon in training a two-layer convolutional neural network (CNN). We show that when the signal-to-noise ratio satisfies a certain condition, a two-layer CNN trained by gradient descent can achieve arbitrarily small training and test loss. On the other hand, when this condition does not hold, overfitting becomes harmful and the obtained CNN can only achieve a constant level test loss. These together demonstrate a sharp phase transition between benign overfitting and harmful overfitting, driven by the signal-to-noise ratio. To the best of our knowledge, this is the first work that precisely characterizes the conditions under which benign overfitting can occur in training convolutional neural networks., 42 pages, 1 figure. Version 3 improves the presentation and adds a comparison with a concurrent work
- Published
- 2022
32. The United States COVID-19 Forecast Hub dataset
- Author
US COVID-19 Forecast Hub Consortium, Cramer, Estee Y., Huang, Yuxin, Wang, Yijin, Ray, Evan L., Cornell, Matthew, Bracher, Johannes, Brennen, Andrea, Rivadeneira, Alvaro J. Castro, Gerding, Aaron, House, Katie, Jayawardena, Dasuni, Kanji, Abdul Hannan, Khandelwal, Ayush, Le, Khoa, Mody, Vidhi, Mody, Vrushti, Niemi, Jarad, Stark, Ariane, Shah, Apurv, Wattanchit, Nutcha, Zorn, Martha W., Reich, Nicholas G., Gneiting, Tilmann, Mühlemann, Anja, Gu, Youyang, Chen, Yixian, Chintanippu, Krishna, Jivane, Viresh, Khurana, Ankita, Kumar, Ajay, Lakhani, Anshul, Mehrotra, Prakhar, Pasumarty, Sujitha, Shrivastav, Monika, You, Jialu, Bannur, Nayana, Deva, Ayush, Jain, Sansiddh, Kulkarni, Mihir, Merugu, Srujana, Raval, Alpan, Shingi, Siddhant, Tiwari, Avtansh, White, Jerome, Adiga, Aniruddha, Hurt, Benjamin, Lewis, Bryan, Marathe, Madhav, Peddireddy, Akhil Sai, Porebski, Przemyslaw, Venkatramanan, Srinivasan, Wang, Lijing, Dahan, Maytal, Fox, Spencer, Gaither, Kelly, Lachmann, Michael, Meyers, Lauren Ancel, Scott, James G., Tec, Mauricio, Woody, Spencer, Srivastava, Ajitesh, Xu, Tianjian, Cegan, Jeffrey C., Dettwiller, Ian D., England, William P., Farthing, Matthew W., George, Glover E., Hunter, Robert H., Lafferty, Brandon, Linkov, Igor, Mayo, Michael L., Parno, Matthew D., Rowland, Michael A., Trump, Benjamin D., Chen, Samuel, Faraone, Stephen V., Hess, Jonathan, Morley, Christopher P., Salekin, Asif, Wang, Dongliang, Zhang-James, Yanli, Baer, Thomas M., Corsetti, Sabrina M., Eisenberg, Marisa C., Falb, Karl, Huang, Yitao, Martin, Emily T., McCauley, Ella, Myers, Robert L., Schwarz, Tom, Gibson, Graham Casey, Sheldon, Daniel, Gao, Liyao, Ma, Yian, Wu, Dongxia, Yu, Rose, Jin, Xiaoyong, Wang, Yu-Xiang, Yan, Xifeng, Chen, YangQuan, Guo, Lihong, Zhao, Yanting, Chen, Jinghui, Gu, Quanquan, Wang, Lingxiao, Xu, Pan, Zhang, Weitong, Zou, Difan, Chattopadhyay, Ishanu, Huang, Yi, Lu, Guoqing, Pfeiffer, Ruth, Sumner, Timothy, Wang, Dongdong, Wang, Liqiang, Zhang, Shunpu, Zou, Zihang, 
Biegel, Hannah, Lega, Joceline, Hussain, Fazle, Khan, Zeina, Van Bussel, Frank, McConnell, Steve, Guertin, Stephanie L., Hulme-Lowe, Christopher, Nagraj, V. P., Turner, Stephen D., Bejar, Benjamín, Choirat, Christine, Flahault, Antoine, Krymova, Ekaterina, Lee, Gavin, Manetti, Elisa, Namigai, Kristen, Obozinski, Guillaume, Sun, Tao, Thanou, Dorina, Ban, Xuegang, Shi, Yunfeng, Walraven, Robert, Hong, Qi-Jun, Van De Walle, Axel, Ben-Nun, Michal, Riley, Steven, Riley, Pete, Turtle, James, Cao, Duy, Galasso, Joseph, Cho, Jae H., Jo, Areum, DesRoches, David, Forli, Pedro, Hamory, Bruce, Koyluoglu, Ugur, Kyriakides, Christina, Leis, Helen, Milliken, John, Moloney, Michael, Morgan, James, Nirgudkar, Ninad, Ozcan, Gokce, Piwonka, Noah, Ravi, Matt, Schrader, Chris, Shakhnovich, Elizabeth, Siegel, Daniel, Spatz, Ryan, Stiefeling, Chris, Wilkinson, Barrie, Wong, Alexander, Cavany, Sean, España, Guido, Moore, Sean, Oidtman, Rachel, Perkins, Alex, Ivy, Julie S., Mayorga, Maria E., Mele, Jessica, Rosenstrom, Erik T., Swann, Julie L., Kraus, Andrea, Kraus, David, Bian, Jiang, Cao, Wei, Gao, Zhifeng, Ferres, Juan Lavista, Li, Chaozhuo, Liu, Tie-Yan, Xie, Xing, Zhang, Shun, Zheng, Shun, Chinazzi, Matteo, Vespignani, Alessandro, Xiong, Xinyue, Davis, Jessica T., Mu, Kunpeng, Piontti, Ana Pastore Y, Baek, Jackie, Farias, Vivek, Georgescu, Andreea, Levi, Retsef, Sinha, Deeksha, Wilde, Joshua, Zheng, Andrew, Lami, Omar Skali, Bennouna, Amine, Ndong, David Nze, Perakis, Georgia, Singhvi, Divya, Spantidakis, Ioannis, Thayaparan, Leann, Tsiourvas, Asterios, Weisberg, Shane, Jadbabaie, Ali, Sarker, Arnab, Shah, Devavrat, Celi, Leo A., Penna, Nicolas D., Sundar, Saketh, Berlin, Abraham, Gandhi, Parth D., McAndrew, Thomas, Piriya, Matthew, Chen, Ye, Hlavacek, William, Lin, Yen Ting, Mallela, Abhishek, Miller, Ely, Neumann, Jacob, Posner, Richard, Wolfinger, Russ, Castro, Lauren, Fairchild, Geoffrey, Michaud, Isaac, Osthus, Dave, Wolffram, Daniel, Karlen, Dean, Panaggio, Mark J., Kinsey, 
Matt, Mullany, Luke C., Rainwater-Lovett, Kaitlin, Shin, Lauren, Tallaksen, Katharine, Wilson, Shelby, Brenner, Michael, Coram, Marc, Edwards, Jessie K., Joshi, Keya, Klein, Ellen, Hulse, Juan Dent, Grantz, Kyra H., Hill, Alison L., Kaminsky, Kathryn, Kaminsky, Joshua, Keegan, Lindsay T., Lauer, Stephen A., Lee, Elizabeth C., Lemaitre, Joseph C., Lessler, Justin, Meredith, Hannah R., Perez-Saez, Javier, Shah, Sam, Smith, Claire P., Truelove, Shaun A., Wills, Josh, Gardner, Lauren, Marshall, Maximilian, Nixon, Kristen, Burant, John C., Budzinski, Jozef, Chiang, Wen-Hao, Mohler, George, Gao, Junyi, Glass, Lucas, Qian, Cheng, Romberg, Justin, Sharma, Rakshith, Spaeder, Jeffrey, Sun, Jimeng, Xiao, Cao, Gao, Lei, Gu, Zhiling, Kim, Myungjin, Li, Xinyi, Wang, Yueying, Wang, Guannan, Wang, Lily, Yu, Shan, Jain, Chaman, Bhatia, Sangeeta, Nouvellet, Pierre, Barber, Ryan, Gaikedu, Emmanuela, Hay, Simon, Lim, Steve, Murray, Chris, Pigott, David, Reiner, Robert C., Baccam, Prasith, Gurung, Heidi L., Stage, Steven A., Suchoski, Bradley T., Fong, Chung-Yan, Yeung, Dit-Yan, Adhikari, Bijaya, Cui, Jiaming, Prakash, B. 
Aditya, Rodríguez, Alexander, Tabassum, Anika, Xie, Jiajia, Asplund, John, Baxter, Arden, Keskinocak, Pinar, Oruc, Buse Eylul, Serban, Nicoleta, Arik, Sercan O., Dusenberry, Mike, Epshteyn, Arkady, Kanal, Elli, Le, Long T., Li, Chun-Liang, Pfister, Tomas, Sinha, Rajarishi, Tsai, Thomas, Yoder, Nate, Yoon, Jinsung, Zhang, Leyou, Wilson, Daniel, Belov, Artur A., Chow, Carson C., Gerkin, Richard C., Yogurtcu, Osman N., Ibrahim, Mark, Lacroix, Timothee, Le, Matthew, Liao, Jason, Nickel, Maximilian, Sagun, Levent, Abbott, Sam, Bosse, Nikos I., Funk, Sebastian, Hellewell, Joel, Meakin, Sophie R., Sherratt, Katharine, Kalantari, Rahi, Zhou, Mingyuan, Karimzadeh, Morteza, Lucas, Benjamin, Ngo, Thoai, Zoraghein, Hamidreza, Vahedi, Behzad, Wang, Zhongying, Pei, Sen, Shaman, Jeffrey, Yamana, Teresa K., Bertsimas, Dimitris, Li, Michael L., Soni, Saksham, Bouardi, Hamza Tazi, Adee, Madeline, Ayer, Turgay, Chhatwal, Jagpreet, Dalgic, Ozden O., Ladd, Mary A., Linas, Benjamin P., Mueller, Peter, Xiao, Jade, Bosch, Jurgen, Wilson, Austin, Zimmerman, Peter, Wang, Qinxia, Wang, Yuanjia, Xie, Shanghong, Zeng, Donglin, Bien, Jacob, Brooks, Logan, Green, Alden, Hu, Addison J., Jahja, Maria, McDonald, Daniel, Narasimhan, Balasubramanian, Politsch, Collin, Rajanala, Samyak, Rumack, Aaron, Simon, Noah, Tibshirani, Ryan J., Tibshirani, Rob, Ventura, Valerie, Wasserman, Larry, Drake, John M., O’Dea, Eamon B., Abu-Mostafa, Yaser, Bathwal, Rahil, Chang, Nicholas A., Chitta, Pavan, Erickson, Anne, Goel, Sumit, Gowda, Jethin, Jin, Qixuan, Jo, HyeongChan, Kim, Juhyun, Kulkarni, Pranav, Lushtak, Samuel M., Mann, Ethan, Popken, Max, Soohoo, Connor, Tirumala, Kushal, Tseng, Albert, Varadarajan, Vignesh, Vytheeswaran, Jagath, Wang, Christopher, Yeluri, Akshay, Yurk, Dominic, Zhang, Michael, Zlokapa, Alexander, Pagano, Robert, Jain, Chandini, Tomar, Vishal, Ho, Lam, Huynh, Huong, Tran, Quoc, Lopez, Velma K., Walker, Jo W., Slayton, Rachel B., Johansson, Michael A., and Biggerstaff, Matthew
- Subjects
ddc:510 ,Mathematics - Abstract
Academic researchers, government agencies, industry groups, and individuals have produced forecasts at an unprecedented scale during the COVID-19 pandemic. To leverage these forecasts, the United States Centers for Disease Control and Prevention (CDC) partnered with an academic research lab at the University of Massachusetts Amherst to create the US COVID-19 Forecast Hub. Launched in April 2020, the Forecast Hub is a dataset with point and probabilistic forecasts of incident cases, incident hospitalizations, incident deaths, and cumulative deaths due to COVID-19 at the county, state, and national levels in the United States. Included forecasts represent a variety of modeling approaches, data sources, and assumptions regarding the spread of COVID-19. The goal of this dataset is to establish a standardized and comparable set of short-term forecasts from modeling teams. These data can be used to develop ensemble models, communicate forecasts to the public, create visualizations, compare models, and inform policies regarding COVID-19 mitigation. These open-source data are available via download from GitHub, through an online API, and through R packages.
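As a reading aid for the "standardized and comparable" forecast format mentioned above, the sketch below parses a few rows shaped like the Hub's quantile-format CSV. The column names follow the Hub's published schema, but the rows themselves are made up for illustration, not real forecasts.

```python
import csv
import io

# Illustrative rows in the Forecast Hub's quantile CSV format
# (column names per the Hub's documented schema; values are invented).
RAW = """forecast_date,target,location,type,quantile,value
2021-01-04,1 wk ahead inc death,US,point,NA,15000
2021-01-04,1 wk ahead inc death,US,quantile,0.025,12000
2021-01-04,1 wk ahead inc death,US,quantile,0.5,15000
2021-01-04,1 wk ahead inc death,US,quantile,0.975,19000
"""

def read_forecasts(text):
    """Parse quantile-format rows into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def prediction_interval(rows, lo="0.025", hi="0.975"):
    """Extract a central prediction interval from the quantile rows."""
    q = {r["quantile"]: float(r["value"]) for r in rows if r["type"] == "quantile"}
    return q[lo], q[hi]

rows = read_forecasts(RAW)
low, high = prediction_interval(rows)
print(low, high)  # 12000.0 19000.0
```

Because every team submits the same row shape, downstream ensembling and model comparison reduce to grouping on these columns.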
- Published
- 2022
33. Abnormal baseline brain activity in Parkinson's disease with and without REM sleep behavior disorder: A resting‐state functional MRI study
- Author
-
Li, Dan, Huang, Peiyu, Zang, Yufeng, Lou, Yuting, Cen, Zhidong, Gu, Quanquan, Xuan, Min, Xie, Fei, Ouyang, Zhiyuan, Wang, Bo, Zhang, Minming, and Luo, Wei
- Published
- 2017
- Full Text
- View/download PDF
34. Learning Stochastic Shortest Path with Linear Function Approximation
- Author
-
Min, Yifei, He, Jiafan, Wang, Tianhao, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science::Machine Learning ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
We study the stochastic shortest path (SSP) problem in reinforcement learning with linear function approximation, where the transition kernel is represented as a linear mixture of unknown models. We call this class of SSP problems linear mixture SSPs. We propose a novel algorithm with Hoeffding-type confidence sets for learning the linear mixture SSP, which can attain an $\tilde{\mathcal{O}}(d B_{\star}^{1.5}\sqrt{K/c_{\min}})$ regret. Here $K$ is the number of episodes, $d$ is the dimension of the feature mapping in the mixture model, $B_{\star}$ bounds the expected cumulative cost of the optimal policy, and $c_{\min}>0$ is the lower bound of the cost function. Our algorithm also applies to the case when $c_{\min} = 0$, and an $\tilde{\mathcal{O}}(K^{2/3})$ regret is guaranteed. To the best of our knowledge, this is the first algorithm with a sublinear regret guarantee for learning linear mixture SSP. Moreover, we design a refined Bernstein-type confidence set and propose an improved algorithm, which provably achieves an $\tilde{\mathcal{O}}(d B_{\star}\sqrt{K/c_{\min}})$ regret. Complementing the regret upper bounds, we also prove a lower bound of $\Omega(dB_{\star} \sqrt{K})$. Hence, our improved algorithm matches the lower bound up to a $1/\sqrt{c_{\min}}$ factor and poly-logarithmic factors, achieving a near-optimal regret guarantee., Comment: 46 pages, 1 figure. In ICML 2022
- Published
- 2021
35. Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes
- Author
-
Liao, Chonghua, He, Jiafan, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Computer Science - Cryptography and Security ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Cryptography and Security (cs.CR) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
Reinforcement learning (RL) algorithms can be used to provide personalized services, which rely on users' private and sensitive data. To protect the users' privacy, privacy-preserving RL algorithms are in demand. In this paper, we study RL with linear function approximation and local differential privacy (LDP) guarantees. We propose a novel $(\varepsilon, \delta)$-LDP algorithm for learning a class of Markov decision processes (MDPs) dubbed linear mixture MDPs, which obtains an $\tilde{\mathcal{O}}( d^{5/4}H^{7/4}T^{3/4}\left(\log(1/\delta)\right)^{1/4}\sqrt{1/\varepsilon})$ regret, where $d$ is the dimension of the feature mapping, $H$ is the length of the planning horizon, and $T$ is the number of interactions with the environment. We also prove a lower bound $\Omega(dH\sqrt{T}/\left(e^{\varepsilon}(e^{\varepsilon}-1)\right))$ for learning linear mixture MDPs under the $\varepsilon$-LDP constraint. Experiments on synthetic datasets verify the effectiveness of our algorithm. To the best of our knowledge, this is the first provable privacy-preserving RL algorithm with linear function approximation., Comment: 25 pages, 2 figures
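To make the LDP guarantee concrete, here is a minimal local-privacy sketch using the standard Gaussian mechanism (this is only an illustration of $(\varepsilon, \delta)$-LDP noise calibration for a sensitivity-1 statistic, not the paper's RL algorithm): each user perturbs their own value before reporting it, yet the aggregate remains accurate.

```python
import math
import random

random.seed(4)
eps, delta, sensitivity = 1.0, 1e-5, 1.0
# Standard Gaussian-mechanism noise scale for (eps, delta)-DP, eps <= 1
sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / eps

def privatize(value):
    """Each user adds Gaussian noise locally before reporting."""
    return value + random.gauss(0.0, sigma)

true_values = [0.7] * 100_000          # every user holds the same statistic
reports = [privatize(v) for v in true_values]
estimate = sum(reports) / len(reports)
print(f"aggregate estimate: {estimate:.3f}")  # close to 0.7
```

Per-user noise is large (sigma is about 4.8 here), but averaging over many users recovers the population statistic, which is the trade-off LDP-constrained RL algorithms must manage.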
- Published
- 2021
36. Last Iterate Risk Bounds of SGD with Decaying Stepsize for Overparameterized Linear Regression
- Author
-
Wu, Jingfeng, Zou, Difan, Braverman, Vladimir, Gu, Quanquan, and Kakade, Sham M.
- Subjects
FOS: Computer and information sciences ,Computer Science::Machine Learning ,Statistics::Machine Learning ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
Stochastic gradient descent (SGD) has been shown to generalize well in many deep learning applications. In practice, one often runs SGD with a geometrically decaying stepsize, i.e., a constant initial stepsize followed by multiple geometric stepsize decays, and uses the last iterate as the output. This kind of SGD is known to be nearly minimax optimal for classical finite-dimensional linear regression problems (Ge et al., 2019). However, a sharp analysis for the last iterate of SGD in the overparameterized setting is still open. In this paper, we provide a problem-dependent analysis on the last iterate risk bounds of SGD with decaying stepsize, for (overparameterized) linear regression problems. In particular, for last iterate SGD with (tail) geometrically decaying stepsize, we prove nearly matching upper and lower bounds on the excess risk. Moreover, we provide an excess risk lower bound for last iterate SGD with polynomially decaying stepsize and demonstrate the advantage of geometrically decaying stepsize in an instance-wise manner, which complements the minimax rate comparison made in prior works., 35 pages, 2 figures, 1 table. In ICML 2022
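The stepsize schedule in question can be sketched in a few lines (a toy illustration on 1-D least squares, not the paper's analysis): a constant stepsize for the first half of the run, halved repeatedly in the tail, with the last iterate returned as the output.

```python
import random

random.seed(0)
# Toy data: y = 2x + noise
n = 2000
data = []
for _ in range(n):
    x = random.uniform(0.5, 1.5)
    data.append((x, 2.0 * x + random.gauss(0.0, 0.1)))

w, eta = 0.0, 0.2
for t, (x, y) in enumerate(data):
    w -= eta * 2.0 * (w * x - y) * x          # single-sample gradient step
    if t >= n // 2 and (t - n // 2) % 200 == 0:
        eta *= 0.5                            # tail geometric stepsize decay
print(f"last iterate: {w:.3f}")               # close to the true weight 2
```

The constant-stepsize phase moves quickly toward the solution while the geometric tail decay shrinks the noise in the final iterate, which is why the last iterate (rather than an average) can be used as the output.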
- Published
- 2021
37. Reward-Free Model-Based Reinforcement Learning with Linear Function Approximation
- Author
-
Zhang, Weitong, Zhou, Dongruo, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Artificial Intelligence (cs.AI) ,Optimization and Control (math.OC) ,Computer Science - Artificial Intelligence ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
We study the model-based reward-free reinforcement learning with linear function approximation for episodic Markov decision processes (MDPs). In this setting, the agent works in two phases. In the exploration phase, the agent interacts with the environment and collects samples without the reward. In the planning phase, the agent is given a specific reward function and uses samples collected from the exploration phase to learn a good policy. We propose a new provably efficient algorithm, UCRL-RFE, under the linear mixture MDP assumption, where the transition probability kernel of the MDP can be parameterized by a linear function over certain feature mappings defined on the triplet of state, action, and next state. We show that to obtain an $\epsilon$-optimal policy for an arbitrary reward function, UCRL-RFE needs to sample at most $\tilde{\mathcal{O}}(H^5d^2\epsilon^{-2})$ episodes during the exploration phase. Here, $H$ is the length of the episode and $d$ is the dimension of the feature mapping. We also propose a variant of UCRL-RFE using a Bernstein-type bonus and show that it needs to sample at most $\tilde{\mathcal{O}}(H^4d(H + d)\epsilon^{-2})$ episodes to achieve an $\epsilon$-optimal policy. By constructing a special class of linear mixture MDPs, we also prove that any reward-free algorithm needs to sample at least $\tilde \Omega(H^2d\epsilon^{-2})$ episodes to obtain an $\epsilon$-optimal policy. Our upper bound matches the lower bound in terms of the dependence on $\epsilon$ and the dependence on $d$ if $H \ge d$., Comment: 29 pages, 1 figure, 1 table. In NeurIPS 2021
- Published
- 2021
38. Adaptive Sampling for Heterogeneous Rank Aggregation from Noisy Pairwise Comparisons
- Author
-
Wu, Yue, Jin, Tao, Lou, Hao, Xu, Pan, Farnoud, Farzad, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Machine Learning (cs.LG) - Abstract
In heterogeneous rank aggregation problems, users often exhibit various accuracy levels when comparing pairs of items. Thus a uniform querying strategy over users may not be optimal. To address this issue, we propose an elimination-based active sampling strategy, which estimates the ranking of items via noisy pairwise comparisons from users and improves the users' average accuracy by maintaining an active set of users. We prove that our algorithm can return the true ranking of items with high probability. We also provide a sample complexity bound for the proposed algorithm which is better than that of non-active strategies in the literature. Experiments are provided to show the empirical advantage of the proposed methods over the state-of-the-art baselines.
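The elimination idea above can be sketched as follows. This is a simplified illustration, not the paper's algorithm: simulated users with heterogeneous accuracies answer pairwise comparisons, and in each round the user who agrees least often with the majority answer is removed from the active set.

```python
import random

random.seed(1)

def true_order(i, j):
    """Ground truth: item i beats item j iff i < j."""
    return i < j

accuracies = [0.95, 0.9, 0.85, 0.6, 0.55]  # heterogeneous user accuracy

def ask(user, i, j):
    """User answers whether i beats j, correctly with their accuracy."""
    correct = true_order(i, j)
    return correct if random.random() < accuracies[user] else not correct

active = list(range(len(accuracies)))
for _ in range(3):                         # elimination rounds
    score = {u: 0 for u in active}
    for _ in range(200):                   # comparisons per round
        i, j = random.sample(range(10), 2)
        votes = {u: ask(u, i, j) for u in active}
        majority = sum(votes.values()) > len(active) / 2
        for u, v in votes.items():
            score[u] += (v == majority)    # credit agreement with majority
    active.remove(min(active, key=lambda u: score[u]))  # drop worst user

print(active)  # low-accuracy users tend to be eliminated first
```

After elimination, queries are concentrated on the surviving high-accuracy users, which is the sense in which the active strategy beats a uniform querying strategy.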
- Published
- 2021
39. Iterative Teacher-Aware Learning
- Author
-
Yuan, Luyao, Zhou, Dongruo, Shen, Junhong, Gao, Jingdong, Chen, Jeffrey L., Gu, Quanquan, Wu, Ying Nian, and Zhu, Song-Chun
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Artificial Intelligence (cs.AI) ,Computer Science - Artificial Intelligence ,ComputingMilieux_COMPUTERSANDEDUCATION ,Computer Science - Human-Computer Interaction ,Machine Learning (cs.LG) ,Human-Computer Interaction (cs.HC) - Abstract
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency. The teacher adjusts her teaching method for different students, and the student, after getting familiar with the teacher's instruction mechanism, can infer the teacher's intention to learn faster. Recently, the benefits of integrating this cooperative pedagogy into machine concept learning in discrete spaces have been established by multiple works. However, how cooperative pedagogy can facilitate machine parameter learning has not been thoroughly studied. In this paper, we propose a gradient-optimization-based teacher-aware learner that incorporates the teacher's cooperative intention into the likelihood function and learns provably faster than the naive learning algorithms used in previous machine teaching works. We give theoretical proof that the iterative teacher-aware learning (ITAL) process leads to local and global improvements. We then validate our algorithms with extensive experiments on various tasks including regression, classification, and inverse reinforcement learning using synthetic and real data. We also show the advantage of modeling teacher-awareness when agents are learning from human teachers.
- Published
- 2021
40. Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
- Author
-
Zou, Difan, Cao, Yuan, Li, Yuanzhi, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
Adaptive gradient methods such as Adam have gained increasing popularity in deep learning optimization. However, it has been observed that compared with (stochastic) gradient descent, Adam can converge to a different solution with a significantly worse test error in many deep learning applications such as image classification, even with a fine-tuned regularization. In this paper, we provide a theoretical explanation for this phenomenon: we show that in the nonconvex setting of learning over-parameterized two-layer convolutional neural networks starting from the same random initialization, for a class of data distributions (inspired by image data), Adam and gradient descent (GD) can converge to different global solutions of the training objective with provably different generalization errors, even with weight decay regularization. In contrast, we show that if the training objective is convex, and the weight decay regularization is employed, any optimization algorithms including Adam and GD will converge to the same solution if the training is successful. This suggests that the inferior generalization performance of Adam is fundamentally tied to the nonconvex landscape of deep learning optimization., 42 pages, 2 figures and 1 table
- Published
- 2021
41. The Benefits of Implicit Regularization from SGD in Least Squares Problems
- Author
-
Zou, Difan, Wu, Jingfeng, Braverman, Vladimir, Gu, Quanquan, Foster, Dean P., and Kakade, Sham M.
- Subjects
FOS: Computer and information sciences ,Statistics::Machine Learning ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice, which has been hypothesized to play an important role in the generalization of modern machine learning approaches. In this work, we seek to understand these issues in the simpler setting of linear regression (including both underparameterized and overparameterized regimes), where our goal is to make sharp instance-based comparisons of the implicit regularization afforded by (unregularized) average SGD with the explicit regularization of ridge regression. For a broad class of least squares problem instances (that are natural in high-dimensional settings), we show: (1) for every problem instance and for every ridge parameter, (unregularized) SGD, when provided with logarithmically more samples than that provided to the ridge algorithm, generalizes no worse than the ridge solution (provided SGD uses a tuned constant stepsize); (2) conversely, there exist instances (in this wide problem class) where optimally-tuned ridge regression requires quadratically more samples than SGD in order to have the same generalization performance. Taken together, our results show that, up to the logarithmic factors, the generalization performance of SGD is always no worse than that of ridge regression in a wide range of overparameterized problems, and, in fact, could be much better for some problem instances. More generally, our results show how algorithmic regularization has important consequences even in simpler (overparameterized) convex settings., 33 pages, 1 figure. In NeurIPS 2021
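The comparison described above can be run on a toy instance (illustrative only, not the paper's problem class): closed-form ridge regression versus tail-averaged constant-stepsize SGD on 1-D least squares, with both recovering the true weight.

```python
import random

random.seed(2)
# Toy data: y = 3x + noise
n = 4000
xs = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [3.0 * x + random.gauss(0.0, 0.5) for x in xs]

# Explicit regularization: 1-D ridge, w = sum(x*y) / (sum(x^2) + lam)
lam = 1.0
w_ridge = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

# Implicit regularization: constant-stepsize SGD with tail iterate averaging
w, eta, tail = 0.0, 0.05, []
for t, (x, y) in enumerate(zip(xs, ys)):
    w -= eta * 2.0 * (w * x - y) * x
    if t >= n // 2:
        tail.append(w)
w_sgd = sum(tail) / len(tail)

print(f"ridge: {w_ridge:.3f}, averaged SGD: {w_sgd:.3f}")  # both near 3
```

Averaged SGD here uses each sample once (one pass), while ridge solves the regularized problem in closed form; the paper's instance-based comparison quantifies how many extra samples SGD needs to match ridge, and when it needs far fewer.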
- Published
- 2021
42. Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent
- Author
-
Frei, Spencer and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,MathematicsofComputing_NUMERICALANALYSIS ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
Although the optimization objectives for learning neural networks are highly non-convex, gradient-based methods have been wildly successful at learning neural networks in practice. This juxtaposition has led to a number of recent studies on provable guarantees for neural networks trained by gradient descent. Unfortunately, the techniques in these works are often highly specific to the particular setup in each problem, making it difficult to generalize across different settings. To address this drawback in the literature, we propose a unified non-convex optimization framework for the analysis of neural network training. We introduce the notions of proxy convexity and proxy Polyak-Lojasiewicz (PL) inequalities, which are satisfied if the original objective function induces a proxy objective function that is implicitly minimized when using gradient methods. We show that gradient descent on objectives satisfying proxy convexity or the proxy PL inequality leads to efficient guarantees for proxy objective functions. We further show that many existing guarantees for neural networks trained by gradient descent can be unified through proxy convexity and proxy PL inequalities., 16 pages. Updated presentation, changed results from online SGD to batch GD
- Published
- 2021
43. Pure Exploration in Kernel and Neural Bandits
- Author
-
Zhu, Yinglun, Zhou, Dongruo, Jiang, Ruoxi, Gu, Quanquan, Willett, Rebecca, and Nowak, Robert
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Statistics - Machine Learning ,Machine Learning (stat.ML) ,Machine Learning (cs.LG) - Abstract
We study pure exploration in bandits, where the dimension of the feature representation can be much larger than the number of arms. To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space and carefully deal with the induced model misspecification. Our approach is conceptually very different from existing works that can either only handle low-dimensional linear bandits or passively deal with model misspecification. We showcase the application of our approach to two pure exploration settings that were previously under-studied: (1) the reward function belongs to a possibly infinite-dimensional Reproducing Kernel Hilbert Space, and (2) the reward function is nonlinear and can be approximated by neural networks. Our main results provide sample complexity guarantees that only depend on the effective dimension of the feature spaces in the kernel or neural representations. Extensive experiments conducted on both synthetic and real-world datasets demonstrate the efficacy of our methods.
- Published
- 2021
44. Variance-Aware Off-Policy Evaluation with Linear Function Approximation
- Author
-
Min, Yifei, Wang, Tianhao, Zhou, Dongruo, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
We study the off-policy evaluation (OPE) problem in reinforcement learning with linear function approximation, which aims to estimate the value function of a target policy based on the offline data collected by a behavior policy. We propose to incorporate the variance information of the value function to improve the sample efficiency of OPE. More specifically, for time-inhomogeneous episodic linear Markov decision processes (MDPs), we propose an algorithm, VA-OPE, which uses the estimated variance of the value function to reweight the Bellman residual in Fitted Q-Iteration. We show that our algorithm achieves a tighter error bound than the best-known result. We also provide a fine-grained characterization of the distribution shift between the behavior policy and the target policy. Extensive numerical experiments corroborate our theory., 59 pages, 4 figures. In NeurIPS 2021
- Published
- 2021
45. Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures
- Author
-
Cao, Yuan, Gu, Quanquan, and Belkin, Mikhail
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Statistics Theory ,Statistics Theory (math.ST) ,Machine Learning (cs.LG) - Abstract
Modern machine learning systems such as deep neural networks are often highly over-parameterized so that they can fit the noisy training data exactly, yet they can still achieve small test errors in practice. In this paper, we study this "benign overfitting" phenomenon of the maximum margin classifier for linear classification problems. Specifically, we consider data generated from sub-Gaussian mixtures, and provide a tight risk bound for the maximum margin linear classifier in the over-parameterized setting. Our results precisely characterize the condition under which benign overfitting can occur in linear classification problems, and improve on previous work. They also have direct implications for over-parameterized logistic regression., 27 pages, 3 figures. In NeurIPS 2021
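A toy version of this setting can be simulated (an illustration under simplifying assumptions, not the paper's proof): Gaussian-mixture data $x = y\mu + \text{noise}$ with dimension $d$ much larger than sample size $n$, trained by full-batch gradient descent on logistic loss, whose direction is known to approach the max-margin classifier on separable data. The over-parameterized model fits the training set while test accuracy stays high.

```python
import math
import random

random.seed(3)
d, n, n_test = 100, 30, 200
mu = [3.0] + [0.0] * (d - 1)          # cluster mean (signal strength 3)

def sample(m):
    pts, labels = [], []
    for _ in range(m):
        y = random.choice([-1, 1])
        pts.append([y * mk + random.gauss(0.0, 1.0) for mk in mu])
        labels.append(y)
    return pts, labels

X, Y = sample(n)
Xt, Yt = sample(n_test)

w = [0.0] * d
for _ in range(200):                  # full-batch GD on logistic loss
    grad = [0.0] * d
    for x, y in zip(X, Y):
        margin = y * sum(wk * xk for wk, xk in zip(w, x))
        coef = -y / (1.0 + math.exp(margin))   # d/dmargin of log(1+e^{-margin})
        for k in range(d):
            grad[k] += coef * x[k] / n
    for k in range(d):
        w[k] -= 0.5 * grad[k]

def accuracy(pts, labels):
    correct = sum(1 for x, y in zip(pts, labels)
                  if y * sum(wk * xk for wk, xk in zip(w, x)) > 0)
    return correct / len(labels)

print(accuracy(X, Y), accuracy(Xt, Yt))  # near-perfect train, high test
```

With $d \gg n$ the training data are linearly separable with high probability, so the classifier can interpolate noisy features, yet the signal direction $\mu$ dominates and test error stays small, which is the "benign overfitting" regime the paper characterizes.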
- Published
- 2021
46. Near-optimal Policy Optimization Algorithms for Learning Adversarial Linear Mixture MDPs
- Author
-
He, Jiafan, Zhou, Dongruo, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science::Machine Learning ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
Learning Markov decision processes (MDPs) in the presence of an adversary is a challenging problem in reinforcement learning (RL). In this paper, we study RL in episodic MDPs with adversarial reward and full information feedback, where the unknown transition probability function is a linear function of a given feature mapping, and the reward function can change arbitrarily episode by episode. We propose an optimistic policy optimization algorithm POWERS and show that it can achieve $\tilde{O}(dH\sqrt{T})$ regret, where $H$ is the length of the episode, $T$ is the number of interactions with the MDP, and $d$ is the dimension of the feature mapping. Furthermore, we also prove a matching lower bound of $\tilde{\Omega}(dH\sqrt{T})$ up to logarithmic factors. Our key technical contributions are two-fold: (1) a new value function estimator based on importance weighting; and (2) a tighter confidence set for the transition kernel. They together lead to the nearly minimax optimal regret., Comment: 22 pages, 1 figure. In AISTATS 2022
- Published
- 2021
47. Almost Optimal Algorithms for Two-player Zero-Sum Linear Mixture Markov Games
- Author
-
Chen, Zixiang, Zhou, Dongruo, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,TheoryofComputation_MISCELLANEOUS ,Computer Science::Computer Science and Game Theory ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,TheoryofComputation_GENERAL ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
We study reinforcement learning for two-player zero-sum Markov games with simultaneous moves in the finite-horizon setting, where the transition kernel of the underlying Markov games can be parameterized by a linear function over the current state, both players' actions and the next state. In particular, we assume that we can control both players and aim to find the Nash Equilibrium by minimizing the duality gap. We propose an algorithm Nash-UCRL based on the principle "Optimism-in-Face-of-Uncertainty". Our algorithm only needs to find a Coarse Correlated Equilibrium (CCE), which is computationally efficient. Specifically, we show that Nash-UCRL can provably achieve an $\tilde{O}(dH\sqrt{T})$ regret, where $d$ is the linear function dimension, $H$ is the length of the game and $T$ is the total number of steps in the game. To assess the optimality of our algorithm, we also prove an $\tilde{\Omega}( dH\sqrt{T})$ lower bound on the regret. Our upper bound matches the lower bound up to logarithmic factors, which suggests the optimality of our algorithm., Comment: 35 pages. In ALT 2022
- Published
- 2021
48. Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise
- Author
-
Frei, Spencer, Cao, Yuan, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science::Machine Learning ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
We consider a one-hidden-layer leaky ReLU network of arbitrary width trained by stochastic gradient descent (SGD) following an arbitrary initialization. We prove that SGD produces neural networks that have classification accuracy competitive with that of the best halfspace over the distribution for a broad class of distributions that includes log-concave isotropic and hard margin distributions. Equivalently, such networks can generalize when the data distribution is linearly separable but corrupted with adversarial label noise, despite the capacity to overfit. To the best of our knowledge, this is the first work to show that overparameterized neural networks trained by SGD can generalize when the data is corrupted with adversarial label noise., 30 pages, 10 figures
- Published
- 2021
49. Nearly Minimax Optimal Reinforcement Learning for Linear Mixture Markov Decision Processes
- Author
-
Zhou, Dongruo, Gu, Quanquan, and Szepesvari, Csaba
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
We study reinforcement learning (RL) with linear function approximation where the underlying transition probability kernel of the Markov decision process (MDP) is a linear mixture model (Jia et al., 2020; Ayoub et al., 2020; Zhou et al., 2020) and the learning agent has access to either an integration or a sampling oracle of the individual basis kernels. We propose a new Bernstein-type concentration inequality for self-normalized martingales for linear bandit problems with bounded noise. Based on the new inequality, we propose a new, computationally efficient algorithm with linear function approximation named $\text{UCRL-VTR}^{+}$ for the aforementioned linear mixture MDPs in the episodic undiscounted setting. We show that $\text{UCRL-VTR}^{+}$ attains an $\tilde O(dH\sqrt{T})$ regret where $d$ is the dimension of feature mapping, $H$ is the length of the episode and $T$ is the number of interactions with the MDP. We also prove a matching lower bound $\Omega(dH\sqrt{T})$ for this setting, which shows that $\text{UCRL-VTR}^{+}$ is minimax optimal up to logarithmic factors. In addition, we propose the $\text{UCLK}^{+}$ algorithm for the same family of MDPs under discounting and show that it attains an $\tilde O(d\sqrt{T}/(1-\gamma)^{1.5})$ regret, where $\gamma\in [0,1)$ is the discount factor. Our upper bound matches the lower bound $\Omega(d\sqrt{T}/(1-\gamma)^{1.5})$ proved by Zhou et al. (2020) up to logarithmic factors, suggesting that $\text{UCLK}^{+}$ is nearly minimax optimal. To the best of our knowledge, these are the first computationally efficient, nearly minimax optimal algorithms for RL with linear function approximation., Comment: 59 pages, 1 figure
- Published
- 2020
50. Logarithmic Regret for Reinforcement Learning with Linear Function Approximation
- Author
-
He, Jiafan, Zhou, Dongruo, and Gu, Quanquan
- Subjects
FOS: Computer and information sciences ,Computer Science::Machine Learning ,Computer Science - Machine Learning ,Optimization and Control (math.OC) ,Statistics - Machine Learning ,FOS: Mathematics ,Machine Learning (stat.ML) ,Mathematics - Optimization and Control ,Machine Learning (cs.LG) - Abstract
Reinforcement learning (RL) with linear function approximation has received increasing attention recently. However, existing work has focused on obtaining $\sqrt{T}$-type regret bounds, where $T$ is the number of interactions with the MDP. In this paper, we show that logarithmic regret is attainable under two recently proposed linear MDP assumptions provided that there exists a positive sub-optimality gap for the optimal action-value function. More specifically, under the linear MDP assumption (Jin et al. 2019), the LSVI-UCB algorithm can achieve $\tilde{O}(d^{3}H^5/\text{gap}_{\text{min}}\cdot \log(T))$ regret; and under the linear mixture MDP assumption (Ayoub et al. 2020), the UCRL-VTR algorithm can achieve $\tilde{O}(d^{2}H^5/\text{gap}_{\text{min}}\cdot \log^3(T))$ regret, where $d$ is the dimension of the feature mapping, $H$ is the length of the episode, $\text{gap}_{\text{min}}$ is the minimal sub-optimality gap, and $\tilde O$ hides all logarithmic terms except $\log(T)$. To the best of our knowledge, these are the first logarithmic regret bounds for RL with linear function approximation. We also establish gap-dependent lower bounds for the two linear MDP models., 26 pages
- Published
- 2020