40 results on '"Taupin , Marie-Luce"'
Search Results
2. Risk upper bounds for RKHS ridge group sparse estimator in the regression model with non-Gaussian and non-bounded error
- Author
-
Kamari, Halaleh, Huet, Sylvie, and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory; Statistics - Other Statistics
- Abstract
We consider the problem of estimating a meta-model of an unknown regression model with non-Gaussian and non-bounded error. The meta-model belongs to a reproducing kernel Hilbert space constructed as a direct sum of Hilbert spaces leading to an additive decomposition including the variables and interactions between them. The estimator of this meta-model is calculated by minimizing an empirical least-squares criterion penalized by the sum of the Hilbert norm and the empirical $L^2$-norm. In this context, the upper bounds of the empirical $L^2$ risk and the $L^2$ risk of the estimator are established. Comment: Previously this appeared as arXiv:1905.13695v3, which was submitted as a replacement by accident. arXiv admin note: text overlap with arXiv:1701.04671
- Published
- 2020
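A hedged sketch of the penalized criterion described in the abstract of result 2. The notation is assumed for illustration and is not taken from the record: $\mathcal{P}$ is a set of index subsets $v$ (single variables and interactions), $\mathcal{H}_v$ is the Hilbert space attached to $v$, $\|\cdot\|_n$ is the empirical $L^2$-norm, and $\gamma, \mu \ge 0$ are tuning parameters. The exact weights and normalizations used by the authors may differ.

$$
\hat f \in \arg\min_{f = f_0 + \sum_{v \in \mathcal{P}} f_v,\; f_v \in \mathcal{H}_v}
\left\{ \frac{1}{n} \sum_{i=1}^{n} \bigl( Y_i - f(X_i) \bigr)^2
+ \gamma \sum_{v \in \mathcal{P}} \| f_v \|_{\mathcal{H}_v}
+ \mu \sum_{v \in \mathcal{P}} \| f_v \|_n \right\},
\qquad
\| f_v \|_n^2 = \frac{1}{n} \sum_{i=1}^{n} f_v(X_i)^2 .
$$

In this form the Hilbert-norm term acts as a ridge-type smoothness penalty, while the empirical-norm term is a group-sparse penalty that can set entire components $f_v$ to zero.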
3. RKHSMetaMod: An R package to estimate the Hoeffding decomposition of a complex model by solving RKHS ridge group sparse optimization problem
- Author
-
Kamari, Halaleh, Huet, Sylvie, and Taupin, Marie-Luce
- Subjects
Statistics - Machine Learning; Computer Science - Machine Learning; Statistics - Computation
- Abstract
In this paper, we propose an R package, called RKHSMetaMod, that implements a procedure for estimating a meta-model of a complex model. The meta-model approximates the Hoeffding decomposition of the complex model and allows us to perform sensitivity analysis on it. It belongs to a reproducing kernel Hilbert space that is constructed as a direct sum of Hilbert spaces. The estimator of the meta-model is the solution of a penalized empirical least-squares minimization with the sum of the Hilbert norm and the empirical $L^2$-norm. This procedure, called RKHS ridge group sparse, allows both to select and estimate the terms in the Hoeffding decomposition, and therefore, to select and estimate the Sobol indices that are non-zero. The RKHSMetaMod package provides an interface from R statistical computing environment to the C++ libraries Eigen and GSL. In order to speed up the execution time and optimize the storage memory, except for a function that is written in R, all of the functions of this package are written using the efficient C++ libraries through RcppEigen and RcppGSL packages. These functions are then interfaced in the R environment in order to propose a user-friendly package. Comment: arXiv admin note: text overlap with arXiv:1701.04671
- Published
- 2019
4. Accelerating metabolic models evaluation with statistical metamodels: application to Salmonella infection models
- Author
-
Frioux Clémence, Huet Sylvie, Labarthe Simon, Martinelli Julien, Malou Thibault, Sherman David, Taupin Marie-Luce, and Ugalde-Salas Pablo
- Subjects
Applied mathematics. Quantitative methods; T57-57.97; Mathematics; QA1-939
- Abstract
Mathematical and numerical models are increasingly used in microbial ecology to model the fate of microbial communities in their ecosystem. These models allow to connect in a mechanistic framework species-level informations, such as the microbial genomes, with macro-scale features, such as species spatial distributions or metabolite gradients. Numerous models are built upon species-level metabolic models that predict the metabolic behaviour of a microbe by solving an optimization problem knowing its genome and its nutritional environment. However, screening the community dynamics with these metabolic models implies to solve such an optimization problem by species at each time step, leading to a significant computational load further increased by several orders of magnitude when spatial dimensions are added. In this paper, we propose a statistical framework based on Reproducing Kernel Hilbert Space (RKHS) metamodels that are used to provide fast approximations of the original metabolic model. The metamodel can replace the optimization step in the system dynamics, providing comparable outputs at a much lower computational cost. We will first build a system dynamics model of a simplified gut microbiota composed of a unique commensal bacterial strain in interaction with the host and challenged by a Salmonella infection. Then, the machine learning method will be introduced, and particularly the ANOVA-RKHS that will be exploited to achieve variable selection and model parsimony. A training dataset will be constructed with the original system dynamics model and hyper-parameters will be carefully chosen to provide fast and accurate approximations of the original model. Finally, the accuracy of the trained metamodels will be assessed, in particular by comparing the system dynamics outputs when the original model is replaced by its metamodel. 
The metamodel allows an overall relative error of 4.71% but reducing the computational load by a speed-up factor higher than 45, while correctly reproducing the complex behaviour occurring during Salmonella infection. These results provide a proof-of-concept of the potentiality of machine learning methods to give fast approximations of metabolic model outputs and pave the way towards PDE-based spatio-temporal models of microbial communities including microbial metabolism and host-microbiota-pathogen interactions.
- Published
- 2023
5. Sensitivity analysis of spatio-temporal models describing nitrogen transfers, transformations and losses at the landscape scale
- Author
-
Savall, Jordi Ferrer, Franqueville, Damien, Barbillon, Pierre, Benhamou, Cyril, Durand, Patrick, Taupin, Marie-Luce, Monod, Hervé, and Drouet, Jean-Louis
- Subjects
Statistics - Applications
- Abstract
Modelling complex systems such as agroecosystems often requires the quantification of a large number of input factors. Sensitivity analyses are useful to determine the appropriate spatial and temporal resolution of models and to reduce the number of factors to be measured or estimated accurately. Comprehensive spatial and temporal sensitivity analyses were applied to the NitroScape model, a deterministic spatially distributed model describing nitrogen transfers and transformations in rural landscapes. Simulations were led on a theoretical landscape that represented five years of intensive farm management and covering an area of $3\, km^2$. Cluster analyses were applied to summarize the results of the sensitivity analysis on the ensemble of model outputs. The methodology we applied is useful to synthesize sensitivity analyses of models with multiple space-time input and output variables and could be ported to other models than NitroScape.
- Published
- 2017
6. Metamodel construction for sensitivity analysis
- Author
-
Huet, Sylvie and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory
- Abstract
We propose to estimate a metamodel and the sensitivity indices of a complex model m in the Gaussian regression framework. Our approach combines methods for sensitivity analysis of complex models and statistical tools for sparse non-parametric estimation in multivariate Gaussian regression model. It rests on the construction of a metamodel for approximating the Hoeffding-Sobol decomposition of m. This metamodel belongs to a reproducing kernel Hilbert space constructed as a direct sum of Hilbert spaces leading to a functional ANOVA decomposition. The estimation of the metamodel is carried out via a penalized least-squares minimization allowing to select the subsets of variables that contribute to predict the output. It allows to estimate the sensitivity indices of m. We establish an oracle-type inequality for the risk of the estimator, describe the procedure for estimating the metamodel and the sensitivity indices, and assess the performances of the procedure via a simulation study.
- Published
- 2017
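For readers of result 6, the Hoeffding-Sobol decomposition and the sensitivity (Sobol) indices mentioned in the abstract can be written as follows. This is the standard textbook form under the assumption of independent inputs $X = (X_1, \dots, X_d)$ and square-integrable $m$; the symbols $m_0$, $m_v$, and $S_v$ are notation assumed here, not taken from the record.

$$
m(X) = m_0 + \sum_{\emptyset \neq v \subseteq \{1,\dots,d\}} m_v(X_v),
\qquad
\mathbb{E}\bigl[ m_v(X_v)\, m_{v'}(X_{v'}) \bigr] = 0 \ \text{ for } v \neq v',
$$

$$
\operatorname{Var}\bigl(m(X)\bigr) = \sum_{v} \operatorname{Var}\bigl(m_v(X_v)\bigr),
\qquad
S_v = \frac{\operatorname{Var}\bigl(m_v(X_v)\bigr)}{\operatorname{Var}\bigl(m(X)\bigr)} .
$$

The indices $S_v$ sum to one, so selecting the subsets $v$ with non-zero terms $m_v$ amounts to selecting the non-zero Sobol indices.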
7. Model selection in logistic regression
- Author
-
Kwemou, Marius, Taupin, Marie-Luce, and Tocquet, Anne-Sophie
- Subjects
Mathematics - Statistics Theory
- Abstract
This paper is devoted to model selection in logistic regression. We extend the model selection principle introduced by Birgé and Massart (2001) to the logistic regression model. This selection is done by using penalized maximum likelihood criteria. We propose in this context a completely data-driven criterion based on the slope heuristics. We prove non-asymptotic oracle inequalities for selected estimators. Theoretical results are illustrated through simulation studies.
- Published
- 2015
8. Adaptive kernel estimation of the baseline function in the Cox model, with high-dimensional covariates
- Author
-
Guilloux, Agathe, Lemler, Sarah, and Taupin, Marie-Luce
- Subjects
Statistics - Applications; Statistics - Methodology
- Abstract
The aim of this article is to propose a novel kernel estimator of the baseline function in a general high-dimensional Cox model, for which we derive non-asymptotic rates of convergence. To construct our estimator, we first estimate the regression parameter in the Cox model via a Lasso procedure. We then plug this estimator into the classical kernel estimator of the baseline function, obtained by smoothing the so-called Breslow estimator of the cumulative baseline function. We propose and study an adaptive procedure for selecting the bandwidth, in the spirit of Goldenshluger and Lepski (2011). We state non-asymptotic oracle inequalities for the final estimator, which reveal the reduction of the rates of convergence when the dimension of the covariates grows.
- Published
- 2015
9. Adaptive estimation of the baseline hazard function in the Cox model by model selection, with high-dimensional covariates
- Author
-
Guilloux, Agathe, Lemler, Sarah, and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory; Statistics - Applications
- Abstract
The purpose of this article is to provide an adaptive estimator of the baseline function in the Cox model with high-dimensional covariates. We consider a two-step procedure: first, we estimate the regression parameter of the Cox model via a Lasso procedure based on the partial log-likelihood; second, we plug this Lasso estimator into a least-squares type criterion and then perform a model selection procedure to obtain an adaptive penalized contrast estimator of the baseline function. Using non-asymptotic estimation results stated for the Lasso estimator of the regression parameter, we establish a non-asymptotic oracle inequality for this penalized contrast estimator of the baseline function, which highlights the discrepancy of the rate of convergence when the dimension of the covariates increases.
- Published
- 2015
10. Sensitivity analysis of spatio-temporal models describing nitrogen transfers, transformations and losses at the landscape scale
- Author
-
Ferrer Savall, Jordi, Franqueville, Damien, Barbillon, Pierre, Benhamou, Cyril, Durand, Patrick, Taupin, Marie-Luce, Monod, Hervé, and Drouet, Jean-Louis
- Published
- 2019
11. Estimation in autoregressive model with measurement error
- Author
-
Dedecker, Jérôme, Samson, Adeline, and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory
- Abstract
Consider an autoregressive model with measurement error: we observe $Z_i=X_i+\epsilon_i$, where $X_i$ is a stationary solution of the equation $X_i=f_{\theta^0}(X_{i-1})+\xi_i$. The regression function $f_{\theta^0}$ is known up to a finite dimensional parameter $\theta^0$. The distributions of $X_0$ and $\xi_1$ are unknown whereas the distribution of $\epsilon_1$ is completely known. We want to estimate the parameter $\theta^0$ by using the observations $Z_0,\dots,Z_n$. We propose an estimation procedure based on a modified least square criterion involving a weight function $w$, to be suitably chosen. We give upper bounds for the risk of the estimator, which depend on the smoothness of the errors density $f_\epsilon$ and on the smoothness properties of $w f_\theta$.
- Published
- 2011
12. Adaptive density estimation for general ARCH models
- Author
-
Comte, Fabienne, Dedecker, Jérôme, and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory; 62G07-62M05
- Abstract
We consider a model $Y_t=\sigma_t\eta_t$ in which $(\sigma_t)$ is not independent of the noise process $(\eta_t)$, but $\sigma_t$ is independent of $\eta_t$ for each $t$. We assume that $(\sigma_t)$ is stationary and we propose an adaptive estimator of the density of $\ln(\sigma^2_t)$ based on the observations $Y_t$. Under various dependence structures, the rates of this nonparametric estimator coincide with the minimax rates obtained in the i.i.d. case when $(\sigma_t)$ and $(\eta_t)$ are independent, in all cases where these minimax rates are known. The results apply to various linear and non linear ARCH processes.
- Published
- 2006
13. Semi-parametric estimation of the hazard function in a model with covariate measurement error
- Author
-
Martin-Magniette, Marie-Laure and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory; 62G05, 62F12, 62N01, 62N02; 62J02
- Abstract
We consider a model where the failure hazard function, conditional on a covariate $Z$, is given by $R(t,\theta^0|Z)=\eta_{\gamma^0}(t)f_{\beta^0}(Z)$, with $\theta^0=(\beta^0,\gamma^0)^\top\in \mathbb{R}^{m+p}$. The baseline hazard function $\eta_{\gamma^0}$ and relative risk $f_{\beta^0}$ both belong to parametric families. The covariate $Z$ is measured through the error model $U=Z+\epsilon$ where $\epsilon$ is independent from $Z$, with known density $f_\epsilon$. We observe a $n$-sample $(X_i, D_i, U_i)$, $i=1,...,n$, where $X_i$ is the minimum between the failure time and the censoring time, and $D_i$ is the censoring indicator. We aim at estimating $\theta^0$ in presence of the unknown density $g$. Our estimation procedure, based on a least squares criterion, provides two estimators. The first one minimizes an estimation of the least squares criterion where $g$ is estimated by density deconvolution. Its rate depends on the smoothnesses of $f_\epsilon$ and $f_\beta(z)$ as a function of $z$. We derive sufficient conditions that ensure the $\sqrt{n}$-consistency. The second estimator is constructed under conditions ensuring that the least squares criterion can be directly estimated with the parametric rate. These estimators, deeply studied through examples, are in particular $\sqrt{n}$-consistent and asymptotically Gaussian in the Cox model and in the excess risk model, whatever is $f_\epsilon$.
- Published
- 2006
14. Adaptive density deconvolution with dependent inputs
- Author
-
Comte, Fabienne, Dedecker, Jérôme, and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory; 62G07-62G20
- Abstract
In the convolution model $Z_i=X_i+ \epsilon_i$, we give a model selection procedure to estimate the density of the unobserved variables $(X_i)_{1 \leq i \leq n}$, when the sequence $(X_i)_{i \geq 1}$ is strictly stationary but not necessarily independent. This procedure depends on whether the density of $\epsilon_i$ is super smooth or ordinary smooth. The rates of convergence of the penalized contrast estimators are the same as in the independent framework, and are minimax over most classes of regularity on ${\mathbb R}$. Our results apply to mixing sequences, but also to many other dependent sequences. When the errors are super smooth, the condition on the dependence coefficients is the minimal condition of that type ensuring that the sequence $(X_i)_{i \geq 1}$ is not a long-memory process.
- Published
- 2006
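As background for result 14 and the other deconvolution entries in this list, the identity behind density deconvolution can be sketched as follows. It assumes, as is standard but not stated in the record, that $X_i$ and $\epsilon_i$ are independent and that the characteristic function $\varphi_\epsilon$ of the noise never vanishes; $\varphi_Z$, $\varphi_X$, $\beta$, $\rho$, and $c$ are notation introduced here.

$$
\varphi_Z(t) = \varphi_X(t)\, \varphi_\epsilon(t)
\quad \Longrightarrow \quad
g(x) = \frac{1}{2\pi} \int_{\mathbb{R}} e^{-itx}\, \frac{\varphi_Z(t)}{\varphi_\epsilon(t)}\, dt ,
$$

whenever the quotient is integrable, where $g$ is the density of $X_i$. The two noise regimes named in the abstract are usually described through the decay of $\varphi_\epsilon$:

$$
\text{ordinary smooth: } |\varphi_\epsilon(t)| \asymp |t|^{-\beta},
\qquad
\text{super smooth: } |\varphi_\epsilon(t)| \asymp \exp\bigl(-c\,|t|^{\rho}\bigr)
\quad \text{as } |t| \to \infty .
$$

Faster decay of $\varphi_\epsilon$ makes the division less stable, which is why the procedure and its rates depend on the regime.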
15. Finite sample penalization in adaptive density deconvolution
- Author
-
Comte, Fabienne, Rozenholc, Yves, and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory; Primary 62G07. Secondary 62G20
- Abstract
We consider the problem of estimating the density $g$ of identically distributed variables $X_i$, from a sample $Z_1, ..., Z_n$ where $Z_i=X_i+\sigma\epsilon_i$, $i=1, ..., n$ and $\sigma \epsilon_i$ is a noise independent of $X_i$ with known density $\sigma^{-1}f_\epsilon(\cdot/\sigma)$. We generalize adaptive estimators, constructed by a model selection procedure, described in Comte et al. (2005). We study numerically their properties in various contexts and we test their robustness. Comparisons are made with respect to deconvolution kernel estimators, misspecification of errors, dependency, etc. It appears that our estimation algorithm, based on a fast procedure, performs very well in all contexts.
- Published
- 2006
16. Penalized contrast estimator for adaptive density deconvolution
- Author
-
Comte, Fabienne, Rozenholc, Yves, and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory; Primary 62G07. Secondary 62G20
- Abstract
The authors consider the problem of estimating the density $g$ of independent and identically distributed variables $X_i$, from a sample $Z_1, ..., Z_n$ where $Z_i=X_i+\sigma\epsilon_i$, $i=1, ..., n$, $\epsilon$ is a noise independent of $X$, with $\sigma\epsilon$ having known distribution. They present a model selection procedure allowing to construct an adaptive estimator of $g$ and to find non-asymptotic bounds for its $\mathbb{L}_2(\mathbb{R})$-risk. The estimator achieves the minimax rate of convergence, in most cases where lower bounds are available. A simulation study gives an illustration of the good practical performances of the method.
- Published
- 2006
17. Nonparametric Estimation of the Regression Function in an Errors-in-Variables Model
- Author
-
Comte, Fabienne and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory; (Primary) 62G08, 62G07; (Secondary) 62G05, 62G20
- Abstract
We consider the regression model with errors-in-variables where we observe $n$ i.i.d. copies of $(Y,Z)$ satisfying $Y=f(X)+\xi, Z=X+\sigma\epsilon$, involving independent and unobserved random variables $X,\xi,\epsilon$. The density $g$ of $X$ is unknown, whereas the density of $\sigma\epsilon$ is completely known. Using the observations $(Y_i, Z_i)$, $i=1,...,n$, we propose an estimator of the regression function $f$, built as the ratio of two penalized minimum contrast estimators of $\ell=fg$ and $g$, without any prior knowledge on their smoothness. We prove that its $\mathbb{L}_2$-risk on a compact set is bounded by the sum of the two $\mathbb{L}_2(\mathbb{R})$-risks of the estimators of $\ell$ and $g$, and give the rate of convergence of such estimators for various smoothness classes for $\ell$ and $g$, when the errors $\epsilon$ are either ordinary smooth or super smooth. The resulting rate is optimal in a minimax sense in all cases where lower bounds are available.
- Published
- 2005
18. New $M$-estimators in semi-parametric regression with errors in variables
- Author
-
Butucea, Cristina and Taupin, Marie-Luce
- Subjects
Mathematics - Statistics Theory
- Abstract
In the regression model with errors in variables, we observe $n$ i.i.d. copies of $(Y,Z)$ satisfying $Y=f_{\theta^0}(X)+\xi$ and $Z=X+\epsilon$ involving independent and unobserved random variables $X,\xi,\epsilon$ plus a regression function $f_{\theta^0}$, known up to a finite dimensional $\theta^0$. The common densities of the $X_i$'s and of the $\xi_i$'s are unknown, whereas the distribution of $\epsilon$ is completely known. We aim at estimating the parameter $\theta^0$ by using the observations $(Y_1,Z_1),...,(Y_n,Z_n)$. We propose an estimation procedure based on the least square criterion $\tilde{S}_{\theta^0,g}(\theta)=\mathbb{E}_{\theta^0,g}[(Y-f_{\theta}(X))^2w(X)]$ where $w$ is a weight function to be chosen. We propose an estimator and derive an upper bound for its risk that depends on the smoothness of the errors density $p_{\epsilon}$ and on the smoothness properties of $w(x)f_{\theta}(x)$. Furthermore, we give sufficient conditions that ensure that the parametric rate of convergence is achieved. We provide practical recipes for the choice of $w$ in the case of nonlinear regression functions which are smooth on pieces allowing to gain in the order of the rate of convergence, up to the parametric rate in some cases. We also consider extensions of the estimation procedure, in particular, when a choice of $w_{\theta}$ depending on $\theta$ would be more appropriate. Comment: Published at http://dx.doi.org/10.1214/07-AIHP107 in the Annales de l'Institut Henri Poincaré - Probabilités et Statistiques (http://www.imstat.org/aihp/) by the Institute of Mathematical Statistics (http://www.imstat.org)
- Published
- 2005
19. Semi-Parametric Estimation in the Nonlinear Structural Errors-in-Variables Model
- Author
-
Taupin, Marie-Luce
- Published
- 2001
20. Metamodel construction for sensitivity analysis
- Author
-
Huet Sylvie and Taupin Marie-Luce
- Subjects
Applied mathematics. Quantitative methods; T57-57.97; Mathematics; QA1-939
- Abstract
We propose to estimate a metamodel and the sensitivity indices of a complex model m in the Gaussian regression framework. Our approach combines methods for sensitivity analysis of complex models and statistical tools for sparse non-parametric estimation in multivariate Gaussian regression model. It rests on the construction of a metamodel for approximating the Hoeffding-Sobol decomposition of m. This metamodel belongs to a reproducing kernel Hilbert space constructed as a direct sum of Hilbert spaces leading to a functional ANOVA decomposition. The estimation of the metamodel is carried out via a penalized least-squares minimization allowing to select the subsets of variables that contribute to predict the output. It allows to estimate the sensitivity indices of m. We establish an oracle-type inequality for the risk of the estimator, describe the procedure for estimating the metamodel and the sensitivity indices, and assess the performances of the procedure via a simulation study.
- Published
- 2017
21. Adaptive kernel estimation of the baseline function in the Cox model with high-dimensional covariates
- Author
-
Guilloux, Agathe, Lemler, Sarah, and Taupin, Marie-Luce
- Published
- 2016
22. RKHSMetaMod: An R Package to Estimate the Hoeffding Decomposition of a Complex Model by Solving RKHS Ridge Group Sparse Optimization Problem
- Author
-
Kamari, Halaleh, Huet, Sylvie, and Taupin, Marie-Luce
- Subjects
FOS: Computer and information sciences; Statistics and Probability; Computer Science - Machine Learning; Numerical Analysis; Statistics - Machine Learning; Machine Learning (stat.ML); Statistics, Probability and Uncertainty; Statistics - Computation; Computation (stat.CO); Machine Learning (cs.LG)
- Abstract
In this paper, we propose an R package, called RKHSMetaMod, that implements a procedure for estimating a meta-model of a complex model. The meta-model approximates the Hoeffding decomposition of the complex model and allows us to perform sensitivity analysis on it. It belongs to a reproducing kernel Hilbert space that is constructed as a direct sum of Hilbert spaces. The estimator of the meta-model is the solution of a penalized empirical least-squares minimization with the sum of the Hilbert norm and the empirical $L^2$-norm. This procedure, called RKHS ridge group sparse, allows both to select and estimate the terms in the Hoeffding decomposition, and therefore, to select and estimate the Sobol indices that are non-zero. The RKHSMetaMod package provides an interface from R statistical computing environment to the C++ libraries Eigen and GSL. In order to speed up the execution time and optimize the storage memory, except for a function that is written in R, all of the functions of this package are written using the efficient C++ libraries through RcppEigen and RcppGSL packages. These functions are then interfaced in the R environment in order to propose a user-friendly package. Comment: arXiv admin note: text overlap with arXiv:1701.04671
- Published
- 2022
23. Accelerating metabolic models evaluation with statistical metamodels: application to Salmonella infection models
- Author
-
Frioux, Clémence, Huet, Sylvie, Labarthe, Simon, Martinelli, Julien, Malou, Thibault, Sherman, David James, Taupin, Marie-Luce, Ugalde-Salas, Pablo, from patterns to models in computational biodiversity and biotechnology (PLEIADE), Laboratoire Bordelais de Recherche en Informatique (LaBRI), Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)-Université de Bordeaux (UB)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Centre National de la Recherche Scientifique (CNRS)-Inria Bordeaux - Sud-Ouest, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Biodiversité, Gènes & Communautés (BioGeCo), Université de Bordeaux (UB)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Mathématiques et Informatique Appliquées du Génome à l'Environnement [Jouy-En-Josas] (MaIAGE), Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Biodiversité, Gènes & Communautés (BioGeCo), Université de Bordeaux (UB)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), Centre de Bioinformatique (CBIO), Mines Paris - PSL (École nationale supérieure des mines de Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL), Computational systems biology and optimization (Lifeware), Inria Saclay - Ile de France, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Institut de Mathématiques de Toulouse UMR5219 (IMT), Université Toulouse Capitole (UT Capitole), Université de Toulouse 
(UT)-Université de Toulouse (UT)-Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Institut National des Sciences Appliquées (INSA)-Université de Toulouse (UT)-Institut National des Sciences Appliquées (INSA)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), Université d'Évry-Val-d'Essonne (UEVE)-ENSIIE-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement (INRAE), MINES ParisTech - École nationale supérieure des mines de Paris, Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), and Institut National des Sciences Appliquées (INSA)
- Subjects
[INFO.INFO-MO]Computer Science [cs]/Modeling and Simulation
- Abstract
Mathematical and numerical models are increasingly used in microbial ecology to model the fate of microbial communities in their ecosystem. These models allow to connect in a mechanistic framework species-level informations, such as the microbial genomes, with macro-scale features, such as species spatial distributions or metabolite gradients. Numerous models are built upon species-level metabolic models that predict the metabolic behaviour of a microbe by solving an optimization problem knowing its genome and its nutritional environment. However, screening the community dynamics with these metabolic models implies to solve such an optimization problem by species at each time step, leading to a significant computational load further increased by several orders of magnitude when spatial dimensions are added. In this paper, we propose a statistical framework based on Reproducing Kernel Hilbert Space (RKHS) metamodels that are used to provide fast approximations of the original metabolic model. The metamodel can replace the optimization step in the system dynamics, providing comparable outputs at a much lower computational cost. We will first build a system dynamics model of a simplified gut microbiota composed of a unique commensal bacterial strain in interaction with the host and challenged by a Salmonella infection. Then, the machine learning method will be introduced, and particularly the ANOVA-RKHS that will be exploited to achieve variable selection and model parsimony. A training dataset will be constructed with the original system dynamics model and hyper-parameters will be carefully chosen to provide fast and accurate approximations of the original model. Finally, the accuracy of the trained metamodels will be assessed, in particular by comparing the system dynamics outputs when the original model is replaced by its metamodel.
The metamodel allows an overall relative error of 4.71% but reducing the computational load by a speed-up factor higher than 45, while correctly reproducing the complex behaviour occurring during Salmonella infection. These results provide a proof-of-concept of the potentiality of machine learning methods to give fast approximations of metabolic model outputs and pave the way towards PDE-based spatio-temporal models of microbial communities and host-microbiota-pathogen interactions.
- Published
- 2022
24. ADAPTIVE ESTIMATION IN A NONPARAMETRIC REGRESSION MODEL WITH ERRORS-IN-VARIABLES
- Author
-
Comte, Fabienne and Taupin, Marie-Luce
- Published
- 2007
25. RKHSMetaMod: An R package to estimate the Hoeffding decomposition of an unknown function by solving RKHS ridge group sparse optimization problem
- Author
-
Kamari, Halaleh, Huet, Sylvie, Taupin, Marie-Luce, Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), ENSIIE-Université d'Évry-Val-d'Essonne (UEVE)-Institut National de la Recherche Agronomique (INRA)-Centre National de la Recherche Scientifique (CNRS), Mathématiques et Informatique Appliquées du Génome à l'Environnement [Jouy-En-Josas] (MaIAGE), Institut National de la Recherche Agronomique (INRA), and Institut National de la Recherche Agronomique (INRA)-Université d'Évry-Val-d'Essonne (UEVE)-ENSIIE-Centre National de la Recherche Scientifique (CNRS)
- Subjects
meta model; Reproducing Kernel Hilbert Spaces; [MATH.MATH-ST]Mathematics [math]/Statistics [math.ST]; Computer Science::Mathematical Software; Hoeffding decomposition; optimization problem; Sobol indices
- Abstract
In the context of the Gaussian regression model, the package RKHSMetaMod allows one to estimate a meta model by solving the ridge group sparse optimization problem based on the Reproducing Kernel Hilbert Spaces (RKHS). The obtained estimator is an additive model that satisfies the properties of the Hoeffding decomposition, and its terms estimate the terms in the Hoeffding decomposition of the unknown regression function. The estimators of the Sobol indices are deduced from the estimated meta model. This package provides an interface from R statistical computing environment to the C++ libraries Eigen and GSL. In order to speed up the execution time, almost all of the functions of the RKHSMetaMod package are written using the efficient C++ libraries through RcppEigen and RcppGSL packages. These functions are then interfaced in the R environment in order to propose a user-friendly package.
- Published
- 2019
26. Spatial and dynamic sensitivity analysis of a biophysical model of nitrogen transfers and transformations at the landscape scale
- Author
-
FERRER SAVALL, Jordi, BARBILLON, Pierre, Benhamou, Cyril, Durand, Patrick, Taupin, Marie-Luce, Monod, Herve, Drouet, Jean-Louis, Ecologie fonctionnelle et écotoxicologie des agroécosystèmes (ECOSYS), Institut National de la Recherche Agronomique (INRA)-AgroParisTech, Mathématiques et Informatique Appliquées (MIA-Paris), AgroParisTech-Institut National de la Recherche Agronomique (INRA), Université Paris-Saclay, Sol Agro et hydrosystème Spatialisation (SAS), Institut National de la Recherche Agronomique (INRA)-AGROCAMPUS OUEST, Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro)-Institut national d'enseignement supérieur pour l'agriculture, l'alimentation et l'environnement (Institut Agro), Mathématiques et Informatique Appliquées du Génome à l'Environnement [Jouy-En-Josas] (MaIAGE), Institut National de la Recherche Agronomique (INRA), and International Environmental Modelling and Software Society (iEMSs). Toulouse, FRA.
- Subjects
sensitivity analysis ,spatially distributed model ,[SDV]Life Sciences [q-bio] ,landscape scale ,N cascade ,cluster analysis - Abstract
Oral communication, full text; Modelling complex systems such as agroecosystems often requires the quantification of a large number of input factors. Sensitivity analyses are useful to fix the appropriate spatial and temporal resolution of models and to reduce the number of input factors to be measured or estimated accurately. Comprehensive spatial and dynamic sensitivity analyses were applied to the Nitroscape model, a deterministic spatially distributed model describing nitrogen transfers and transformations in a rural landscape. Simulations were run on a virtual landscape that represented five years of farm management in an intensive rural area of 3 km². Cluster analyses were applied to summarize the results of the sensitivity analysis on the ensemble of model outcomes. The 29 studied output variables were split into five different clusters that grouped outcomes with similar responses to input factors. Among the 11 studied factors, model outcomes were mainly sensitive to inputs characterizing the lateral transmissivity of soil. The horizontal resolution of the model was a significant factor driving ammonium and nitrate mineralisation, and uptake by plants. The vertical resolution of the model had the highest impact on the cumulative emissions of nitrous oxides. The interaction between the amount of nitrogen used in fertilization and the lateral transmissivity of soil was the most important factorial effect driving the amount of nitrogen in the catchment discharge.
- Published
- 2016
27. Adaptive estimation of the baseline hazard function in the Cox model by model selection, with high-dimensional covariates
- Author
-
Guilloux, Agathe, primary, Lemler, Sarah, additional, and Taupin, Marie-Luce, additional
- Published
- 2016
- Full Text
- View/download PDF
28. Adaptive density deconvolution for dependent inputs with measurement errors
- Author
-
Comte, Fabienne, Taupin, Marie-Luce, Dedecker, Jérôme, Mathématiques Appliquées Paris 5 (MAP5 - UMR 8145), Université Paris Descartes - Paris 5 (UPD5)-Institut National des Sciences Mathématiques et de leurs Interactions (INSMI)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Statistique Théorique et Appliquée (LSTA), Université Pierre et Marie Curie - Paris 6 (UPMC), Mathématiques Appliquées à Paris 5 ( MAP5 - UMR 8145 ), Université Paris Descartes - Paris 5 ( UPD5 ) -Institut National des Sciences Mathématiques et de leurs Interactions-Centre National de la Recherche Scientifique ( CNRS ), Laboratoire de Statistique Théorique et Appliquée ( LSTA ), Université Pierre et Marie Curie - Paris 6 ( UPMC ), and Université Pierre et Marie Curie - Paris 6 (UPMC)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,ComputingMilieux_MISCELLANEOUS - Abstract
International audience
- Published
- 2008
29. New M-estimators in semiparametric regression with errors in variables
- Author
-
Butucea, Cristina, Taupin, Marie-Luce, Modélisation aléatoire de Paris X (MODAL'X), Université Paris Nanterre (UPN), Laboratoire de Probabilités et Modèles Aléatoires (LPMA), Centre National de la Recherche Scientifique (CNRS)-Université Paris Diderot - Paris 7 (UPD7)-Université Pierre et Marie Curie - Paris 6 (UPMC), Laboratoire de Mathématiques d'Orsay (LM-Orsay), Centre National de la Recherche Scientifique (CNRS)-Université Paris-Sud - Paris 11 (UP11), Mathématiques Appliquées Paris 5 (MAP5 - UMR 8145), Université Paris Descartes - Paris 5 (UPD5)-Institut National des Sciences Mathématiques et de leurs Interactions (INSMI)-Centre National de la Recherche Scientifique (CNRS), Modélisation aléatoire de Paris X ( MODAL'X ), Université Paris Nanterre ( UPN ), Laboratoire de Probabilités et Modèles Aléatoires ( LPMA ), Université Pierre et Marie Curie - Paris 6 ( UPMC ) -Université Paris Diderot - Paris 7 ( UPD7 ) -Centre National de la Recherche Scientifique ( CNRS ), Laboratoire de Mathématiques d'Orsay ( LM-Orsay ), Université Paris-Sud - Paris 11 ( UP11 ) -Centre National de la Recherche Scientifique ( CNRS ), Mathématiques Appliquées à Paris 5 ( MAP5 - UMR 8145 ), Université Paris Descartes - Paris 5 ( UPD5 ) -Institut National des Sciences Mathématiques et de leurs Interactions-Centre National de la Recherche Scientifique ( CNRS ), Université Pierre et Marie Curie - Paris 6 (UPMC)-Université Paris Diderot - Paris 7 (UPD7)-Centre National de la Recherche Scientifique (CNRS), and Université Paris-Sud - Paris 11 (UP11)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
consistency ,ordinary smooth and super-smooth functions ,rates of convergence ,semiparametric nonlinear regression ,[STAT.TH]Statistics [stat]/Statistics Theory [stat.TH] ,M-estimators ,[ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH] ,deconvolution kernel estimator ,2000 MSC, Primary 62J02, 62F12, Secondary 62G05, 62G20 ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,errors-in-variables model ,Asymptotic normality ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] - Abstract
In the regression model with errors in variables, we observe $n$ i.i.d. copies of $(Y,Z)$ satisfying $Y=f_{\theta^0}(X)+\xi$ and $Z=X+\varepsilon$ involving independent and unobserved random variables $X,\xi,\varepsilon$ plus a regression function $f_{\theta^0}$, known up to some finite dimensional $\theta^0$. The common densities of the $X_i$'s and of the $\xi_i$'s are unknown whereas the distribution of $\varepsilon$ is completely known. We aim at estimating the parameter $\theta^0$ by using the observations $(Y_1,Z_1),\cdots, (Y_n,Z_n)$. We propose two estimation procedures based on the least squares criterion $\tilde S_{\theta^0,g}(\theta)=\mathbb{E}_{\theta^0,g}[(Y-f_\theta(X))^2w(X)]$ where $w$ is some weight function, to be chosen. In the first estimation procedure, $w$ does not depend on $\theta$ and the distribution of the $\xi$'s is unknown. The second estimation procedure is based on $S_{\theta^0,g}(\theta)=\mathbb{E}_{\theta^0,g}[((Y-f_\theta(X))^2-\sigma_{\xi,2}^2)w_\theta(X)]$ where $w_\theta$ is a positive weight function, to be chosen, and requires the knowledge of $\sigma_{\xi,2}^2=\mbox{Var}(\xi)$. In both cases, we propose two estimators and derive upper bounds for the risk of those estimators, depending on the smoothness of the error density $p_\varepsilon$ and on the smoothness properties of $w(x)f_\theta(x)$ or $w_\theta(x)f_\theta(x)$ with respect to $x$. Furthermore we give sufficient conditions that ensure that the parametric rate of convergence is achieved. We provide practical recipes for the choice of $w$ or $w_\theta$ in the case of nonlinear regression functions which are piecewise smooth, allowing an improvement in the order of the rate of convergence, up to the parametric rate in some cases.
- Published
- 2008
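To make the errors-in-variables setting of the abstract above concrete, here is a minimal simulation sketch in the simplest linear case $f_\theta(x)=\theta x$. It does not implement the weighted M-estimators of the paper. It only shows that ordinary least squares of $Y$ on the noisy covariate $Z$ is biased towards zero, and that a textbook method-of-moments correction using the known variance of $\varepsilon$ removes the bias. The function names, the Gaussian distributions and all parameter values are assumptions chosen for illustration.

```python
import random


def simulate(n=50_000, theta=2.0, sigma_eps=1.0, seed=0):
    """Draw (Y, Z) with Y = theta * X + xi and Z = X + eps (X is never observed)."""
    rng = random.Random(seed)
    ys, zs = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                   # latent covariate
        ys.append(theta * x + rng.gauss(0.0, 0.5))  # response with noise xi
        zs.append(x + rng.gauss(0.0, sigma_eps))    # covariate observed with error
    return ys, zs


def slopes(ys, zs, sigma_eps):
    """Return (naive OLS slope of Y on Z, moment-corrected slope)."""
    n = len(ys)
    mean_y = sum(ys) / n
    mean_z = sum(zs) / n
    cov_zy = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys)) / n
    var_z = sum((z - mean_z) ** 2 for z in zs) / n
    naive = cov_zy / var_z
    # Var(Z) = Var(X) + sigma_eps^2, so subtracting the known error variance
    # recovers Var(X) in the denominator.
    corrected = cov_zy / (var_z - sigma_eps ** 2)
    return naive, corrected


if __name__ == "__main__":
    ys, zs = simulate()
    naive, corrected = slopes(ys, zs, sigma_eps=1.0)
    print(f"naive slope ~ {naive:.2f}, corrected slope ~ {corrected:.2f}")
```

With these assumed settings the naive slope concentrates near $\theta\,\mathrm{Var}(X)/(\mathrm{Var}(X)+\sigma_\varepsilon^2)=1$ rather than $\theta=2$, which is the attenuation that motivates estimators exploiting the known distribution of $\varepsilon$.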
30. Nonparametric Estimation of the Regression Function in an Errors-in-Variables Model
- Author
-
Comte, Fabienne, Taupin, Marie-Luce, Mathématiques Appliquées Paris 5 (MAP5 - UMR 8145), Université Paris Descartes - Paris 5 (UPD5)-Institut National des Sciences Mathématiques et de leurs Interactions (INSMI)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Mathématiques d'Orsay (LM-Orsay), Université Paris-Sud - Paris 11 (UP11)-Centre National de la Recherche Scientifique (CNRS), Mathématiques Appliquées à Paris 5 ( MAP5 - UMR 8145 ), Université Paris Descartes - Paris 5 ( UPD5 ) -Institut National des Sciences Mathématiques et de leurs Interactions-Centre National de la Recherche Scientifique ( CNRS ), Laboratoire de Mathématiques d'Orsay ( LM-Orsay ), and Université Paris-Sud - Paris 11 ( UP11 ) -Centre National de la Recherche Scientifique ( CNRS )
- Subjects
Projection estimators ,Mathematics - Statistics Theory ,Nonparametric regression ,[STAT.TH]Statistics [stat]/Statistics Theory [stat.TH] ,Statistics Theory (math.ST) ,[ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH] ,Errors-in-variables ,Density deconvolution ,(Secondary) 62G05, 62G20 ,MSC 2000 Primary 62G08, 62G07. Secondary 62G05, 62G20 ,Adaptive estimation ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,FOS: Mathematics ,Minimax estimation ,(Primary) 62G08, 62G07 ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] - Abstract
We consider the regression model with errors-in-variables where we observe $n$ i.i.d. copies of $(Y,Z)$ satisfying $Y=f(X)+\xi, \; Z=X+\sigma\varepsilon$, involving independent and unobserved random variables $X,\xi,\varepsilon$. The density $g$ of $X$ is unknown, whereas the density of $\sigma\varepsilon$ is completely known. Using the observations $(Y_i, Z_i)$, $i=1,\cdots,n$, we propose an estimator of the regression function $f$, built as the ratio of two penalized minimum contrast estimators of $\ell=fg$ and $g$, without any prior knowledge on their smoothness. We prove that its $\mathbb{L}_2$-risk on a compact set is bounded by the sum of the two $\mathbb{L}_2(\mathbb{R})$-risks of the estimators of $\ell$ and $g$, and give the rate of convergence of such estimators for various smoothness classes for $\ell$ and $g$, when the errors $\varepsilon$ are either ordinary smooth or super smooth. The resulting rate is optimal in a minimax sense in all cases where lower bounds are available.
- Published
- 2005
- Full Text
- View/download PDF
31. Estimation in autoregressive model with measurement error
- Author
-
Dedecker, Jérôme, primary, Samson, Adeline, additional, and Taupin, Marie-Luce, additional
- Published
- 2014
- Full Text
- View/download PDF
32. Estimation semi-paramétrique pour le modèle de régression non linéaire avec erreurs sur les variables
- Author
-
Taupin, Marie-Luce, Unité de biométrie et intelligence artificielle de jouy, and Institut National de la Recherche Agronomique (INRA)
- Subjects
[INFO]Computer Science [cs] ,these ,[MATH]Mathematics [math] ,Estimation semi-paramétrique ,fonction analytique ,estimation d'une densité ,risque minimax ,estimation non paramétrique ,transformées de Fourier ,régression non linéaire avec erreurs sur les variables - Abstract
In the nonlinear errors-in-variables model, we suppose that the explanatory variables are independent real random variables with a common unknown density, observed with an additive Gaussian error that is independent of the explanatory variables. The regression function is known up to a finite-dimensional parameter. We aim to estimate the unknown parameter in this semiparametric model. We proceed in two steps. Chapter 2 is devoted to the estimation of linear integral functionals of a density in the convolution model. In particular we establish a lower bound of the minimax quadratic risk for estimating the value of a density at a point over the class of densities obtained by convolution with the standard Gaussian density. In chapter 3, using the preceding results, we propose a modified least squares criterion based on the estimation of a conditional expectation depending on the unknown density of the explanatory variables. We show that the estimator of the parameter obtained by minimizing this criterion is consistent and that its rate of convergence is strongly related to the regularity of the regression function. The more regular the regression function (with respect to the explanatory variables), the faster the rate. This rate is generally slower than the parametric rate. Nevertheless it is very close to the parametric rate for some regression functions admitting an analytic continuation on the complex plane.
- Published
- 1998
33. Comment on identification and estimation of nonlinear models using two samples with nonclassical measurement errors
- Author
-
Taupin, Marie-Luce, primary
- Published
- 2010
- Full Text
- View/download PDF
34. Estimation of the hazard function in a semiparametric model with covariate measurement error
- Author
-
Martin-Magniette, Marie-Laure, primary and Taupin, Marie-Luce, additional
- Published
- 2009
- Full Text
- View/download PDF
35. New M-estimators in semi-parametric regression with errors in variables
- Author
-
Butucea, Cristina, primary and Taupin, Marie-Luce, additional
- Published
- 2008
- Full Text
- View/download PDF
36. Penalized contrast estimator for adaptive density deconvolution
- Author
-
Comte, Fabienne, primary, Rozenholc, Yves, additional, and Taupin, Marie-Luce, additional
- Published
- 2006
- Full Text
- View/download PDF
37. Estimation in the nonlinear errors-in-variables model
- Author
-
Taupin, Marie-Luce, primary
- Published
- 1998
- Full Text
- View/download PDF
38. Metamodel construction for sensitivity analysis
- Author
-
Marie-Luce Taupin, Sylvie Huet, Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), Institut National de la Recherche Agronomique (INRA)-Université d'Évry-Val-d'Essonne (UEVE)-ENSIIE-Centre National de la Recherche Scientifique (CNRS), Mathématiques et Informatique Appliquées du Génome à l'Environnement [Jouy-En-Josas] (MaIAGE), Institut National de la Recherche Agronomique (INRA), Taupin, Marie-Luce, Laboratoire de Mathématiques et Modélisation d'Evry, and Institut National de la Recherche Agronomique (INRA)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
T57-57.97 ,Statistics::Theory ,Applied mathematics. Quantitative methods ,Computer science ,Gaussian ,Hilbert space ,Estimator ,Mathematics - Statistics Theory ,Regression analysis ,Multivariate normal distribution ,Statistics Theory (math.ST) ,Metamodeling ,Statistics::Computation ,symbols.namesake ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,QA1-939 ,symbols ,FOS: Mathematics ,Statistiques (Mathématiques) ,Sensitivity (control systems) ,[MATH.MATH-ST] Mathematics [math]/Statistics [math.ST] ,Algorithm ,Mathematics ,Reproducing kernel Hilbert space - Abstract
We propose to estimate a metamodel and the sensitivity indices of a complex model m in the Gaussian regression framework. Our approach combines methods for sensitivity analysis of complex models and statistical tools for sparse non-parametric estimation in the multivariate Gaussian regression model. It rests on the construction of a metamodel for approximating the Hoeffding-Sobol decomposition of m. This metamodel belongs to a reproducing kernel Hilbert space constructed as a direct sum of Hilbert spaces leading to a functional ANOVA decomposition. The estimation of the metamodel is carried out via a penalized least-squares minimization that selects the subsets of variables that contribute to predicting the output. It also yields estimates of the sensitivity indices of m. We establish an oracle-type inequality for the risk of the estimator, describe the procedure for estimating the metamodel and the sensitivity indices, and assess the performance of the procedure via a simulation study.
- Published
- 2017
- Full Text
- View/download PDF
39. Estimation in autoregressive model with measurement error
- Author
-
Marie-Luce Taupin, Adeline Samson, Jérôme Dedecker, Mathématiques Appliquées Paris 5 (MAP5 - UMR 8145), Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Mathématiques et de leurs Interactions (INSMI)-Université Paris Descartes - Paris 5 (UPD5), Laboratoire Statistique et Génome (SG), Institut National de la Recherche Agronomique (INRA)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS), Université Paris Descartes - Paris 5 (UPD5)-Institut National des Sciences Mathématiques et de leurs Interactions (INSMI)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Mathématiques et Modélisation d'Evry, Mathématiques Appliquées à Paris 5 ( MAP5 - UMR 8145 ), Université Paris Descartes - Paris 5 ( UPD5 ) -Institut National des Sciences Mathématiques et de leurs Interactions-Centre National de la Recherche Scientifique ( CNRS ), Laboratoire Statistique et Génome ( SG ), Institut National de la Recherche Agronomique ( INRA ) -Université d'Évry-Val-d'Essonne ( UEVE ) -Centre National de la Recherche Scientifique ( CNRS ), Taupin, Marie-Luce, and Laboratoire de Mathématiques et Modélisation d'Evry (LaMME)
- Subjects
Statistics and Probability ,Statistics::Theory ,Weight function ,Mathematics - Statistics Theory ,Statistics Theory (math.ST) ,deconvolution ,semi-parametric model ,01 natural sciences ,Combinatorics ,010104 statistics & probability ,Mixing (mathematics) ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,FOS: Mathematics ,mixing ,[INFO]Computer Science [cs] ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,[MATH]Mathematics [math] ,0101 mathematics ,Mathematics ,Observational error ,Smoothness (probability theory) ,Primary 62J02, 62F12, Secondary 62G05, 62G20 ,010102 general mathematics ,Estimator ,semi-parametric nonlinear autoregressive model ,autoregressive model ,[STAT.TH]Statistics [stat]/Statistics Theory [stat.TH] ,[ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH] ,Semiparametric model ,Distribution (mathematics) ,Autoregressive model ,weak dependent ,markov chain ,Markov chain - Abstract
International audience; Consider an autoregressive model with measurement error: we observe $Z_i=X_i+\varepsilon_i$, where $X_i$ is a stationary solution of the equation $X_i=f_{\theta^0}(X_{i-1})+\xi_i$. The regression function $f_{\theta^0}$ is known up to a finite dimensional parameter $\theta^0$. The distributions of $X_0$ and $\xi_1$ are unknown whereas the distribution of $\varepsilon_1$ is completely known. We want to estimate the parameter $\theta^0$ by using the observations $Z_0,\ldots,Z_n$. We propose an estimation procedure based on a modified least squares criterion involving a weight function $w$, to be suitably chosen. We give upper bounds for the risk of the estimator, which depend on the smoothness of the error density $f_\varepsilon$ and on the smoothness properties of $w f_\theta$.
- Published
- 2014
- Full Text
- View/download PDF
40. Estimation of the hazard function in a semiparametric model with covariate measurement error
- Author
-
Marie-Laure Martin-Magniette, Marie-Luce Taupin, Mathématiques et Informatique Appliquées (MIA-Paris), AgroParisTech-Institut National de la Recherche Agronomique (INRA), Unité de recherche en génomique végétale (URGV), Centre National de la Recherche Scientifique (CNRS)-Université d'Évry-Val-d'Essonne (UEVE)-Institut National de la Recherche Agronomique (INRA), Mathématiques Appliquées Paris 5 (MAP5 - UMR 8145), Centre National de la Recherche Scientifique (CNRS)-Institut National des Sciences Mathématiques et de leurs Interactions (INSMI)-Université Paris Descartes - Paris 5 (UPD5), Institut National de la Recherche Agronomique (INRA)-Université d'Évry-Val-d'Essonne (UEVE)-Centre National de la Recherche Scientifique (CNRS), Université Paris Descartes - Paris 5 (UPD5)-Institut National des Sciences Mathématiques et de leurs Interactions (INSMI)-Centre National de la Recherche Scientifique (CNRS), Mathématiques et Informatique Appliquées ( MIA-Paris ), Institut National de la Recherche Agronomique ( INRA ) -AgroParisTech, URGV UMR INRA 1165/CNRS 8114/UEVE, Université d'Évry-Val-d'Essonne ( UEVE ), Mathématiques Appliquées à Paris 5 ( MAP5 - UMR 8145 ), Université Paris Descartes - Paris 5 ( UPD5 ) -Institut National des Sciences Mathématiques et de leurs Interactions-Centre National de la Recherche Scientifique ( CNRS ), Institut National de la Recherche Agronomique (INRA)-AgroParisTech, and Taupin, Marie-Luce
- Subjects
Statistics and Probability ,Smoothness (probability theory) ,05 social sciences ,Estimator ,Semiparametric estimation ,errors-in-variables model ,measurement error ,nonparametric estimation ,excess risk model ,Cox model ,censoring ,survival analysis ,density deconvolution ,least square criterion ,semiparametric estimation ,cox model ,01 natural sciences ,Censoring (statistics) ,Upper and lower bounds ,Semiparametric model ,Combinatorics ,010104 statistics & probability ,0502 economics and business ,Consistent estimator ,Covariate ,Statistics ,[INFO]Computer Science [cs] ,0101 mathematics ,[MATH]Mathematics [math] ,Random variable ,Mathematics ,050205 econometrics - Abstract
We consider a failure hazard function, conditional on a time-independent covariate $Z$, given by $\eta_{\gamma^0}(t) f_{\beta^0}(Z)$. The baseline hazard function $\eta_{\gamma^0}$ and the relative risk $f_{\beta^0}$ both belong to parametric families with $\theta^0 = (\beta^0, \gamma^0)^\top \in \mathbb{R}^{m+p}$. The covariate $Z$ has an unknown density and is measured with an error through an additive error model $U = Z + \varepsilon$ where $\varepsilon$ is a random variable, independent from $Z$, with known density $f_\varepsilon$. We observe an $n$-sample $(X_i, D_i, U_i)$, $i = 1, \ldots, n$, where $X_i$ is the minimum between the failure time and the censoring time, and $D_i$ is the censoring indicator. Using a least squares criterion and deconvolution methods, we propose a consistent estimator of $\theta^0$ using the observations $(X_i, D_i, U_i)$, $i = 1, \ldots, n$. We give an upper bound for its risk which depends on the smoothness properties of $f_\varepsilon$ and $f_\beta(z)$ as a function of $z$, and we derive sufficient conditions for the $\sqrt{n}$-consistency. We give detailed examples considering various types of relative risks $f_\beta$ and various types of error density $f_\varepsilon$. In particular, in the Cox model and in the excess risk model, the estimator of $\theta^0$ is $\sqrt{n}$-consistent and asymptotically Gaussian regardless of the form of $f_\varepsilon$.
- Published
- 2009
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library