14 results on '"Collier, Olivier"'
Search Results
2. MINIMAX ESTIMATION OF LINEAR AND QUADRATIC FUNCTIONALS ON SPARSITY CLASSES
- Author
-
Collier, Olivier, Comminges, Laëtitia, and Tsybakov, Alexandre B.
- Published
- 2017
3. A Network of 17 Microtubule-Related Genes Highlights Functional Deregulations in Breast Cancer.
- Author
-
Rodrigues-Ferreira, Sylvie, Morin, Morgane, Guichaoua, Gwenn, Moindjie, Hadia, Haykal, Maria M., Collier, Olivier, Stoven, Véronique, and Nahmias, Clara
- Subjects
SURVIVAL, PROTEIN kinases, KINESIN, CELL physiology, CELL survival, GENES, RESEARCH funding, TUMOR markers, BREAST tumors, CYTOPLASM
- Abstract
Simple Summary: The microtubule cytoskeleton is a key component of the cell and an important target for breast cancer therapy. Microtubule organization and function are tightly regulated by a panel of microtubule-related proteins (MT-Rel) to ensure cellular homeostasis. Deregulation of MT-Rel genes is likely to impact microtubule dynamics and subsequent cell functions. In this study, we evaluate the prognostic value of a panel of 17 MT-Rel genes in breast tumors and the functional consequence of their deregulation using a Systems Biology approach. This study highlights MT-Rel as potential prognostic biomarkers and interesting therapeutical targets to evaluate in breast cancer. A wide panel of microtubule-associated proteins and kinases is involved in coordinated regulation of the microtubule cytoskeleton and may thus represent valuable molecular markers contributing to major cellular pathways deregulated in cancer. We previously identified a panel of 17 microtubule-related (MT-Rel) genes that are differentially expressed in breast tumors showing resistance to taxane-based chemotherapy. In the present study, we evaluated the expression, prognostic value and functional impact of these genes in breast cancer. We show that 14 MT-Rel genes (KIF4A, ASPM, KIF20A, KIF14, TPX2, KIF18B, KIFC1, AURKB, KIF2C, GTSE1, KIF15, KIF11, RACGAP1, STMN1) are up-regulated in breast tumors compared with adjacent normal tissue. Six of them (KIF4A, ASPM, KIF20A, KIF14, TPX2, KIF18B) are overexpressed by more than 10-fold in tumor samples and four of them (KIF11, AURKB, TPX2 and KIFC1) are essential for cell survival. Overexpression of all 14 genes, and underexpression of 3 other MT-Rel genes (MAST4, MAPT and MTUS1) are associated with poor breast cancer patient survival. A Systems Biology approach highlighted three major functional networks connecting the 17 MT-Rel genes and their partners, which are centered on spindle assembly, chromosome segregation and cytokinesis. 
Our studies identified mitotic Aurora kinases and their substrates as major targets for therapeutic approaches against breast cancer. [ABSTRACT FROM AUTHOR]
- Published
- 2023
4. Estimating linear functionals of a sparse family of Poisson means
- Author
-
Collier, Olivier and Dalalyan, Arnak S.
- Published
- 2018
5. Curve registration by nonparametric goodness-of-fit testing
- Author
-
Collier, Olivier and Dalalyan, Arnak S.
- Published
- 2015
6. LOTUS: a single-and multi-task machine-learning algorithm for the prediction of cancer driver genes
- Author
-
Collier, Olivier, Stoven, Véronique, Vert, Jean-Philippe, Modélisation aléatoire de Paris X (MODAL'X), Université Paris Nanterre (UPN), Centre de Bioinformatique (CBIO), Mines Paris - PSL (École nationale supérieure des mines de Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Institut Curie [Paris], Cancer et génome: Bioinformatique, biostatistiques et épidémiologie d'un système complexe, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut Curie [Paris]-Institut National de la Santé et de la Recherche Médicale (INSERM), ANR-10-LABX-0023,UnivEarthS,Earth - Planets - Universe: observation, modeling, transfer(2010), MINES ParisTech - École nationale supérieure des mines de Paris, Institut Curie [Paris]-MINES ParisTech - École nationale supérieure des mines de Paris, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de la Santé et de la Recherche Médicale (INSERM), Collier, Olivier, and Earth - Planets - Universe: observation, modeling, transfer - - UnivEarthS2010 - ANR-10-LABX-0023 - LABX - VALID
- Subjects
[STAT.AP]Statistics [stat]/Applications [stat.AP], [SDV.GEN.GH]Life Sciences [q-bio]/Genetics/Human genetics
- Abstract
Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help find new therapeutic targets. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly for driver genes specific to some cancer types. In this paper we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning based approach that integrates various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms three other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types. Author summary: Cancer development is thought to be driven by some important genes that should be targeted by new treatments. Unfortunately, there is a small number of such genes, so that it is of crucial importance to design algorithms capable of finding genes with the highest oncogenic potential. Our new method analyses mutation data in particular, but also other sources of information, to establish a list of genes that should be investigated as a priority. Moreover, our algorithm can differentiate between several types of cancer and share information between them to improve the prediction for every disease. We showed that in several contexts our algorithm beats its competitors.
- Published
- 2018
7. Estimating linear functionals of a sparse family of Poisson means
- Author
-
Collier, Olivier, Dalalyan, Arnak, Université Paris Nanterre (UPN), Modélisation aléatoire de Paris X (MODAL'X), ANR-11-LABX-0023,MME-DII,Modèles Mathématiques et Economiques de la Dynamique, de l'Incertitude et des Interactions(2011), Collier, Olivier, and Modèles Mathématiques et Economiques de la Dynamique, de l'Incertitude et des Interactions - - MME-DII2011 - ANR-11-LABX-0023 - LABX - VALID
- Subjects
[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST], Poisson processes, Nonasymptotic minimax estimation, FOS: Mathematics, thresholding, Mathematics - Statistics Theory, Statistics Theory (math.ST), linear functional, group-sparsity
- Abstract
Assume that we observe a sample of size $n$ composed of $p$-dimensional signals, each signal having independent entries drawn from a scaled Poisson distribution with an unknown intensity. We are interested in estimating the sum of the $n$ unknown intensity vectors, under the assumption that most of them coincide with a given "background" signal. The number $s$ of $p$-dimensional signals different from the background signal plays the role of sparsity and the goal is to leverage this sparsity assumption in order to improve the quality of estimation as compared to the naive estimator that computes the sum of the observed signals. We first introduce the group hard thresholding estimator and analyze its mean squared error measured by the squared Euclidean norm. We establish a nonasymptotic upper bound showing that the risk is at most of the order of $\sigma^2 (sp + s^2\sqrt{p}) \log^{3/2}(np)$. We then establish lower bounds on the minimax risk over a properly defined class of collections of $s$-sparse signals. These lower bounds match with the upper bound, up to logarithmic terms, when the dimension $p$ is fixed or of larger order than $s^2$. In the case where the dimension $p$ increases but remains of smaller order than $s^2$, our results show a gap between the lower and the upper bounds, which can be up to order $\sqrt{p}$. MSC 2010 subject classifications: Primary 62J05; secondary 62G05.
- Published
- 2017
8. Minimax optimal estimators for general additive functional estimation
- Author
-
Collier, Olivier, Comminges, Laëtitia, Université Paris Nanterre (UPN), Modélisation aléatoire de Paris X (MODAL'X), Centre de Recherche en Économie et Statistique (CREST), Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] (ENSAI)-École polytechnique (X)-École Nationale de la Statistique et de l'Administration Économique (ENSAE Paris)-Centre National de la Recherche Scientifique (CNRS), CEntre de REcherches en MAthématiques de la DEcision (CEREMADE), Centre National de la Recherche Scientifique (CNRS)-Université Paris Dauphine-PSL, Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Université Paris Dauphine-PSL, and Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
sparsity, FOS: Mathematics, polynomial approximation, Minimax estimation, Mathematics - Statistics Theory, Statistics Theory (math.ST), [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH], additive functional
- Abstract
In this paper, we observe a sparse mean vector through Gaussian noise and we aim at estimating some additive functional of the mean in the minimax sense. More precisely, we generalize the results of (Collier et al., 2017, 2019) to a very large class of functionals. The optimal minimax rate is shown to depend on the polynomial approximation rate of the marginal functional, and optimal estimators achieving this rate are built.
- Published
- 2019
9. Permutation estimation and minimax rates of identifiability
- Author
-
Collier, Olivier, Dalalyan, Arnak S., imagine [Marne-la-Vallée], Laboratoire d'Informatique Gaspard-Monge (LIGM), Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS)-Centre Scientifique et Technique du Bâtiment (CSTB), Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS), Centre de Recherche en Économie et Statistique (CREST), Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] (ENSAI)-École polytechnique (X)-École Nationale de la Statistique et de l'Administration Économique (ENSAE Paris)-Centre National de la Recherche Scientifique (CNRS), Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre Scientifique et Technique du Bâtiment (CSTB), Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM), and Collier, Olivier
- Subjects
minimax, [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH], Gaussian sequence model, [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST], permutation estimation
- Abstract
The problem of matching two sets of features appears in various tasks of computer vision and can often be formalized as a problem of permutation estimation. We address this problem from a statistical point of view and provide a theoretical analysis of the accuracy of several natural estimators. To this end, the notion of the minimax rate of identifiability is introduced and its expression is obtained as a function of the sample size, noise level and dimensionality. We consider the cases of homoscedastic and heteroscedastic noise and derive, in each case, upper bounds on the identifiability threshold of several estimators. These upper bounds are shown to be unimprovable in the homoscedastic setting. We also discuss the computational aspects of the estimators and provide empirical evidence of their consistency on synthetic data.
- Published
- 2013
10. LOTUS: A single- and multitask machine learning algorithm for the prediction of cancer driver genes.
- Author
-
Collier, Olivier, Stoven, Véronique, and Vert, Jean-Philippe
- Subjects
*CANCER genes, *MACHINE learning, *LEARNING strategies, *P53 antioncogene, *PROTEIN-protein interactions, *COMPUTATIONAL biology, *TUMOR suppressor genes
- Abstract
Cancer driver genes, i.e., oncogenes and tumor suppressor genes, are involved in the acquisition of important functions in tumors, providing a selective growth advantage, allowing uncontrolled proliferation and avoiding apoptosis. It is therefore important to identify these driver genes, both for the fundamental understanding of cancer and to help finding new therapeutic targets or biomarkers. Although the most frequently mutated driver genes have been identified, it is believed that many more remain to be discovered, particularly for driver genes specific to some cancer types. In this paper, we propose a new computational method called LOTUS to predict new driver genes. LOTUS is a machine-learning based approach which allows to integrate various types of data in a versatile manner, including information about gene mutations and protein-protein interactions. In addition, LOTUS can predict cancer driver genes in a pan-cancer setting as well as for specific cancer types, using a multitask learning strategy to share information across cancer types. We empirically show that LOTUS outperforms five other state-of-the-art driver gene prediction methods, both in terms of intrinsic consistency and prediction accuracy, and provide predictions of new cancer genes across many cancer types. [ABSTRACT FROM AUTHOR]
- Published
- 2019
11. Minimax rates in permutation estimation for feature matching
- Author
-
Collier, Olivier, Dalalyan, Arnak S., imagine [Marne-la-Vallée], Laboratoire d'Informatique Gaspard-Monge (LIGM), Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS)-Centre Scientifique et Technique du Bâtiment (CSTB), Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS), Centre de Recherche en Économie et Statistique (CREST), Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] (ENSAI)-École polytechnique (X)-École Nationale de la Statistique et de l'Administration Économique (ENSAE Paris)-Centre National de la Recherche Scientifique (CNRS), LABEX ECODEC, Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre Scientifique et Technique du Bâtiment (CSTB), Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM), and Dalalyan, Arnak
- Subjects
FOS: Computer and information sciences, Computer Science - Learning, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], [STAT.TH]Statistics [stat]/Statistics Theory [stat.TH], FOS: Mathematics, Mathematics - Statistics Theory, Statistics Theory (math.ST), Machine Learning (cs.LG)
- Abstract
The problem of matching two sets of features appears in various tasks of computer vision and can be often formalized as a problem of permutation estimation. We address this problem from a statistical point of view and provide a theoretical analysis of the accuracy of several natural estimators. To this end, the minimax rate of separation is investigated and its expression is obtained as a function of the sample size, noise level and dimension. We consider the cases of homoscedastic and heteroscedastic noise and establish, in each case, tight upper bounds on the separation distance of several estimators. These upper bounds are shown to be unimprovable both in the homoscedastic and heteroscedastic settings. Interestingly, these bounds demonstrate that a phase transition occurs when the dimension $d$ of the features is of the order of the logarithm of the number of features $n$. For $d=O(\log n)$, the rate is dimension free and equals $\sigma (\log n)^{1/2}$, where $\sigma$ is the noise level. In contrast, when $d$ is larger than $c\log n$ for some constant $c>0$, the minimax rate increases with $d$ and is of the order $\sigma(d\log n)^{1/4}$. We also discuss the computational aspects of the estimators and provide empirical evidence of their consistency on synthetic data. Finally, we show that our results extend to more general matching criteria.
- Published
- 2013
12. Statistical methods for descriptor matching
- Author
-
Collier, Olivier, Laboratoire d'Informatique Gaspard-Monge (LIGM), Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM), imagine [Marne-la-Vallée], Centre Scientifique et Technique du Bâtiment (CSTB)-École des Ponts ParisTech (ENPC)-Laboratoire d'Informatique Gaspard-Monge (LIGM), Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-Université Paris-Est Marne-la-Vallée (UPEM), Université Paris-Est, Arnak S. Dalalyan, Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre Scientifique et Technique du Bâtiment (CSTB), and STAR, ABES
- Subjects
Minimax testing, [MATH.MATH-GM]Mathematics [math]/General Mathematics [math.GM], Minimax estimation, Nonparametric problem
- Abstract
Many applications, as in computer vision or medicine, aim at identifying the similarities between several images or signals. Thereafter, it is possible to detect objects, to follow them, or to overlap different pictures. In every case, the algorithmic procedures that treat the images use a selection of key points that they try to match by pairs. They compute for each point a descriptor that characterizes it and discriminates it from the others. The most popular algorithm nowadays is SIFT, which performs key point selection, descriptor calculation, and provides a criterion for global descriptor matching. In the first part, we aim at improving this procedure by changing the original descriptor, which requires finding the argument of the maximum of a histogram: its computation is indeed statistically unstable. So we also have to change the criterion to match two descriptors. This yields a nonparametric hypothesis testing problem, in which both the null and the alternative hypotheses are composite, even nonparametric. We use the generalized likelihood ratio test to get consistent testing procedures, and carry out a minimax study. In the second part, we are interested in the optimality of the procedure of global matching. We give a statistical model in which some descriptors are present in a given order in a first image, and in another order in a second image. Descriptor matching is equivalent in this case to the estimation of a permutation. We give an optimality criterion for the estimators in the minimax sense. In particular, we use the likelihood to find several consistent estimators, which are even optimal under some conditions. Finally, we tackled some practical aspects and showed that our estimators are computable in reasonable time, so that we could then illustrate the hierarchy of our estimators by some simulations.
- Published
- 2013
13. Wilks' phenomenon and penalized likelihood-ratio test for nonparametric curve registration
- Author
-
Collier, Olivier, Dalalyan, Arnak S., imagine [Marne-la-Vallée], Laboratoire d'Informatique Gaspard-Monge (LIGM), Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM)-Centre Scientifique et Technique du Bâtiment (CSTB), Centre National de la Recherche Scientifique (CNRS)-Fédération de Recherche Bézout-ESIEE Paris-École des Ponts ParisTech (ENPC)-Université Paris-Est Marne-la-Vallée (UPEM), Centre de Recherche en Économie et Statistique (CREST), Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] (ENSAI)-École polytechnique (X)-École Nationale de la Statistique et de l'Administration Économique (ENSAE Paris)-Centre National de la Recherche Scientifique (CNRS), Dalalyan, Arnak, Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS)-Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS)-Centre Scientifique et Technique du Bâtiment (CSTB), and Université Paris-Est Marne-la-Vallée (UPEM)-École des Ponts ParisTech (ENPC)-ESIEE Paris-Fédération de Recherche Bézout-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[STAT.ML]Statistics [stat]/Machine Learning [stat.ML]
- Abstract
The problem of curve registration appears in many different areas of applications ranging from neuroscience to road traffic modeling. In the present work, we propose a nonparametric testing framework in which we develop a generalized likelihood ratio test to perform curve registration. We first prove that, under the null hypothesis, the resulting test statistic is asymptotically distributed as a chi-squared random variable. This result, often referred to as Wilks' phenomenon, provides a natural threshold for the test of a prescribed asymptotic significance level and a natural measure of lack-of-fit in terms of the p-value of the $\chi^2$-test. We also prove that the proposed test is consistent, i.e., its power is asymptotically equal to 1. Finite sample properties of the proposed methodology are demonstrated by numerical simulations.
- Published
- 2012
14. Minimax Rates in Permutation Estimation for Feature Matching.
- Author
-
Collier, Olivier and Dalalyan, Arnak S.
- Subjects
*COMPUTER algorithms, *MACHINE learning, *MACHINE theory, *DATA mining, *ARTIFICIAL intelligence, *COMPUTER software
- Abstract
The problem of matching two sets of features appears in various tasks of computer vision and can be often formalized as a problem of permutation estimation. We address this problem from a statistical point of view and provide a theoretical analysis of the accuracy of several natural estimators. To this end, the minimax rate of separation is investigated and its expression is obtained as a function of the sample size, noise level and dimension of the features. We consider the cases of homoscedastic and heteroscedastic noise and establish, in each case, tight upper bounds on the separation distance of several estimators. These upper bounds are shown to be unimprovable both in the homoscedastic and heteroscedastic settings. Interestingly, these bounds demonstrate that a phase transition occurs when the dimension $d$ of the features is of the order of the logarithm of the number of features $n$. For $d = O(\log n)$, the rate is dimension free and equals $\sigma(\log n)^{1/2}$, where $\sigma$ is the noise level. In contrast, when $d$ is larger than $c \log n$ for some constant $c > 0$, the minimax rate increases with $d$ and is of the order of $\sigma(d \log n)^{1/4}$. We also discuss the computational aspects of the estimators and provide empirical evidence of their consistency on synthetic data. Finally, we show that our results extend to more general matching criteria. [ABSTRACT FROM AUTHOR]
- Published
- 2016