On false discovery rate thresholding for classification under sparsity

Authors :
Etienne Roquain
Pierre Neuvial
Laboratoire Statistique et Génome (SG)
Institut National de la Recherche Agronomique (INRA)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS)
Laboratoire de Probabilités et Modèles Aléatoires (LPMA)
Université Pierre et Marie Curie - Paris 6 (UPMC)-Université Paris Diderot - Paris 7 (UPD7)-Centre National de la Recherche Scientifique (CNRS)
French Agence Nationale de la Recherche (ANR) [ANR-09-JCJC-0027-01, ANR-PARCIMONIE, ANR-09-JCJC-0101-01]
French ministry of foreign and European affairs (EGIDE-PROCOPE) [21887]
Source :
Annals of Statistics, Institute of Mathematical Statistics, 2012, 40 (5), pp. 2572-2600. ⟨10.1214/12-AOS1042⟩
Publication Year :
2012
Publisher :
HAL CCSD, 2012.

Abstract

We study the properties of false discovery rate (FDR) thresholding, viewed as a classification procedure. The "0"-class (null) is assumed to have a known density, while the "1"-class (alternative) is obtained from the "0"-class either by translation or by scaling. Furthermore, the "1"-class is assumed to contain a small number of elements relative to the "0"-class (sparsity). We focus on densities of the Subbotin family, which includes the Gaussian and Laplace models. Nonasymptotic oracle inequalities are derived for the excess risk of FDR thresholding. These inequalities lead to explicit rates of convergence of the excess risk to zero as the number m of items to be classified tends to infinity, in a regime where the power of the Bayes rule is bounded away from 0 and 1. Moreover, these theoretical investigations suggest an explicit choice for the target level $\alpha_m$ of FDR thresholding, as a function of m. Our oracle inequalities show theoretically that the resulting FDR thresholding adapts to the unknown sparsity regime contained in the data. This property is illustrated with numerical experiments.
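
As an illustration of FDR thresholding used as a classifier, the following is a minimal sketch, not the authors' code. It assumes the Benjamini-Hochberg step-up rule as the FDR thresholding procedure, a standard Gaussian null, and two-sided p-values; the function names and the level passed as `alpha` are hypothetical choices for the example and do not reproduce the level $\alpha_m$ proposed in the article.

```python
import math
import random


def gaussian_two_sided_pvalue(x):
    """Two-sided p-value P(|Z| >= |x|) under an assumed N(0, 1) null."""
    return math.erfc(abs(x) / math.sqrt(2.0))


def bh_classify(p_values, alpha):
    """Label items 0 (null) or 1 (alternative) with the Benjamini-Hochberg
    step-up rule at target level alpha (assumed form of FDR thresholding)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_hat = 0
    # Step-up: keep the largest rank k whose ordered p-value is <= alpha * k / m.
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            k_hat = rank
    labels = [0] * m
    for i in order[:k_hat]:
        labels[i] = 1
    return labels


def simulate_location_model(m, n_alt, shift, seed):
    """Hypothetical sparse location model: n_alt of the m items are shifted."""
    rng = random.Random(seed)
    truth = [1] * n_alt + [0] * (m - n_alt)
    observations = [rng.gauss(shift if t else 0.0, 1.0) for t in truth]
    return truth, observations


if __name__ == "__main__":
    truth, observations = simulate_location_model(m=1000, n_alt=20, shift=4.0, seed=0)
    p_values = [gaussian_two_sided_pvalue(x) for x in observations]
    labels = bh_classify(p_values, alpha=0.1)
    errors = sum(1 for t, l in zip(truth, labels) if t != l)
    print("misclassified items:", errors, "out of", len(truth))
```

The misclassification count printed here is the empirical counterpart of the classification risk; comparing it with the risk of a Bayes rule that knows the sparsity level would give an empirical excess risk in the sense of the abstract.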

Details

Language :
English
ISSN :
0090-5364 and 2168-8966
Database :
OpenAIRE
Journal :
Annals of Statistics
Accession number :
edsair.doi.dedup.....b9036423b8014c95fd837daa6c30c098