On false discovery rate thresholding for classification under sparsity
- Source :
- Annals of Statistics, Institute of Mathematical Statistics, 2012, 40 (5), pp. 2572-2600. ⟨10.1214/12-AOS1042⟩
- Publication Year :
- 2012
- Publisher :
- HAL CCSD, 2012.
Abstract
- We study the properties of false discovery rate (FDR) thresholding, viewed as a classification procedure. The "0"-class (null) is assumed to have a known density, while the "1"-class (alternative) is obtained from the "0"-class either by translation or by scaling. Furthermore, the "1"-class is assumed to have a small number of elements relative to the "0"-class (sparsity). We focus on densities of the Subbotin family, including Gaussian and Laplace models. Nonasymptotic oracle inequalities are derived for the excess risk of FDR thresholding. These inequalities lead to explicit rates of convergence of the excess risk to zero, as the number m of items to be classified tends to infinity and in a regime where the power of the Bayes rule is bounded away from 0 and 1. Moreover, these theoretical investigations suggest an explicit choice for the target level $\alpha_m$ of FDR thresholding, as a function of m. Our oracle inequalities show theoretically that the resulting FDR thresholding adapts to the unknown sparsity regime contained in the data. This property is illustrated with numerical experiments.
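As a reading aid, the idea of FDR thresholding used as a classifier can be sketched with the standard Benjamini-Hochberg step-up rule in a Gaussian translation model. This is only an illustrative sketch, not the authors' code: the function names (`normal_sf`, `bh_classify`, `simulate`), the sample sizes, the shift `mu`, and the level `alpha` are all assumptions made here, and `alpha` in particular is an arbitrary placeholder rather than the target level $\alpha_m$ proposed in the paper.

```python
import math
import random


def normal_sf(x):
    """Upper-tail probability P(Z >= x) for a standard Gaussian."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def bh_classify(p_values, alpha):
    """Benjamini-Hochberg step-up rule used as a classifier.

    Finds the largest rank k with p_(k) <= alpha * k / m and labels an
    item 1 when its p-value is at most alpha * k / m. If no rank
    qualifies, every item is labeled 0.
    """
    m = len(p_values)
    k = 0
    for i, p in enumerate(sorted(p_values), start=1):
        if p <= alpha * i / m:
            k = i  # keep the largest qualifying rank (step-up)
    threshold = alpha * k / m
    return [1 if (k > 0 and p <= threshold) else 0 for p in p_values]


def simulate(m=2000, m1=50, mu=3.5, alpha=0.1, seed=0):
    """Hypothetical sparse setting: m1 of the m items are shifted by mu.

    Returns the empirical misclassification rate of the BH labels.
    All default values are illustrative assumptions.
    """
    rng = random.Random(seed)
    truth = [1] * m1 + [0] * (m - m1)
    scores = [rng.gauss(mu if t == 1 else 0.0, 1.0) for t in truth]
    # One-sided p-values under the known null density N(0, 1).
    p_values = [normal_sf(x) for x in scores]
    labels = bh_classify(p_values, alpha)
    errors = sum(1 for t, y in zip(truth, labels) if t != y)
    return errors / m


if __name__ == "__main__":
    print("empirical misclassification rate:", simulate())
```

The step-up search deliberately keeps the largest qualifying rank, so an item can be labeled 1 even when a smaller p-value missed its own rank-specific bound; this is what makes the threshold data-driven rather than fixed in advance.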
- Subjects :
- Statistics and Probability
False discovery rate
FOS: Computer and information sciences
Gaussian
02 engineering and technology
oracle inequality
01 natural sciences
Methodology (stat.ME)
010104 statistics & probability
Bayes' theorem
adaptive procedure
Convergence (routing)
0202 electrical engineering, electronic engineering, information engineering
Applied mathematics
0101 mathematics
Statistics - Methodology
62H30 (Primary) 62H15 (Secondary)
Mathematics
multiple testing
Null (mathematics)
sparsity
020206 networking & telecommunications
Function (mathematics)
Thresholding
Bayes’ rule
classification
Multiple comparisons problem
Statistics, Probability and Uncertainty
[STAT.ME]Statistics [stat]/Methodology [stat.ME]
Details
- Language :
- English
- ISSN :
- 0090-5364 and 2168-8966
- Database :
- OpenAIRE
- Journal :
- Annals of Statistics, Institute of Mathematical Statistics, 2012, 40 (5), pp. 2572-2600. ⟨10.1214/12-AOS1042⟩
- Accession number :
- edsair.doi.dedup.....b9036423b8014c95fd837daa6c30c098