1. The distribution of the Lasso: Uniform control over sparse balls and adaptive parameter tuning
- Author
-
Léo Miolane, Andrea Montanari, Dynamics of Geometric Networks (DYOGENE), Département d'informatique de l'École normale supérieure (DI-ENS), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria), Stanford University, Département d'informatique - ENS Paris (DI-ENS), École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS-PSL), Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Département d'informatique - ENS Paris (DI-ENS), Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)-École normale supérieure - Paris (ENS Paris), and Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)
- Subjects
Statistics and Probability ,Pointwise ,Gaussian ,Estimator ,020206 networking & telecommunications ,Mathematics - Statistics Theory ,02 engineering and technology ,01 natural sciences ,Regularization (mathematics) ,Stability (probability) ,Empirical distribution function ,010104 statistics & probability ,symbols.namesake ,Lasso (statistics) ,Gaussian noise ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,0202 electrical engineering, electronic engineering, information engineering ,symbols ,Applied mathematics ,0101 mathematics ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
The Lasso is a popular regression method for high-dimensional problems in which the number of parameters $\theta_1,\dots,\theta_N$, is larger than the number $n$ of samples: $N>n$. A useful heuristics relates the statistical properties of the Lasso estimator to that of a simple soft-thresholding denoiser,in a denoising problem in which the parameters $(\theta_i)_{i\le N}$ are observed in Gaussian noise, with a carefully tuned variance. Earlier work confirmed this picture in the limit $n,N\to\infty$, pointwise in the parameters $\theta$, and in the value of the regularization parameter. Here, we consider a standard random design model and prove exponential concentration of its empirical distribution around the prediction provided by the Gaussian denoising model. Crucially, our results are uniform with respect to $\theta$ belonging to $\ell_q$ balls, $q\in [0,1]$, and with respect to the regularization parameter. This allows to derive sharp results for the performances of various data-driven procedures to tune the regularization. Our proofs make use of Gaussian comparison inequalities, and in particular of a version of Gordon's minimax theorem developed by Thrampoulidis, Oymak, and Hassibi, which controls the optimum value of the Lasso optimization problem. Crucially, we prove a stability property of the minimizer in Wasserstein distance, that allows to characterize properties of the minimizer itself., Comment: 68 pages, 2 figures
- Published
- 2018