20 results on '"Peggy Cénac"'
Search Results
2. Variable Length Memory Chains: Characterization of stationary probability measures
- Author
-
Brigitte Chauvin, Peggy Cénac, Camille Noûs, Nicolas Pouyanne, Frédéric Paccaut, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Laboratoire de Mathématiques de Versailles (LMV), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Cogitamus Laboratory, Laboratoire Amiénois de Mathématique Fondamentale et Appliquée - UMR CNRS 7352 (LAMFA), and Université de Picardie Jules Verne (UPJV)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Statistics and Probability ,Pure mathematics ,Longest Internal Suffix ,Stationary distribution ,Markov chain ,60J05, 60C05, 60G10 ,Probability (math.PR) ,010102 general mathematics ,01 natural sciences ,Measure (mathematics) ,Variable Length Memory Chains ,010104 statistics & probability ,Probability theory ,Convergence of random variables ,FOS: Mathematics ,Countable set ,State space ,Renewal theory ,[MATH]Mathematics [math] ,0101 mathematics ,stable context trees ,semi-Markov chains ,Mathematics - Probability ,stationary probability measure ,Mathematics - Abstract
Variable Length Memory Chains (VLMC), which are generalizations of finite order Markov chains, turn out to be an essential tool to modelize random sequences in many domains, as well as an interesting object in contemporary probability theory. The question of the existence of stationary probability measures leads us to introduce a key combinatorial structure for words produced by a VLMC: the Longest Internal Suffix. This notion allows us to state a necessary and sufficient condition for a general VLMC to admit a unique invariant probability measure. This condition turns out to get a much simpler form for a subclass of VLMC: the stable VLMC. This natural subclass, unlike the general case, enjoys a renewal property. Namely, a stable VLMC induces a semi-Markov chain on an at most countable state space. Unfortunately, this discrete time renewal process does not contain the whole information of the VLMC, preventing the study of a stable VLMC to be reduced to the study of its induced semi-Markov chain. For a subclass of stable VLMC, the convergence in distribution of a VLMC towards its stationary probability measure is established. Finally, finite state space semi-Markov chains turn out to be very special stable VLMC, shedding some new light on their limit distributions., Comment: arXiv admin note: text overlap with arXiv:1807.01075
- Published
- 2021
3. Persistent Random Walks. I. Recurrence Versus Transience
- Author
-
Yoann Offret, Peggy Cénac, Arnaud Le Ny, Basile de Loynes, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Laboratoire d'Analyse et de Mathématiques Appliquées (LAMA), Centre National de la Recherche Scientifique (CNRS)-Université Paris-Est Créteil Val-de-Marne - Paris 12 (UPEC UP12)-Fédération de Recherche Bézout-Université Paris-Est Marne-la-Vallée (UPEM), Institut de Recherche Mathématique Avancée (IRMA), Université de Strasbourg (UNISTRA)-Centre National de la Recherche Scientifique (CNRS), Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] (ENSAI), PRES Université Paris-Est, Université de Bourgogne (UB)-Centre National de la Recherche Scientifique (CNRS), Université Paris-Est Marne-la-Vallée (UPEM)-Fédération de Recherche Bézout-Université Paris-Est Créteil Val-de-Marne - Paris 12 (UPEC UP12)-Centre National de la Recherche Scientifique (CNRS), Université de Bourgogne (UB)-Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC), Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Laboratoire d'Analyse et de Mathématiques Appliquées ( LAMA ), Université Paris-Est Marne-la-Vallée ( UPEM ) -Fédération de Recherche Bézout-Université Paris-Est Créteil Val-de-Marne - Paris 12 ( UPEC UP12 ) -Centre National de la Recherche Scientifique ( CNRS ), Institut de Recherche Mathématique Avancée ( IRMA ), Université de Strasbourg ( UNISTRA ) -Centre National de la Recherche Scientifique ( CNRS ), and Université de Bourgogne (UB)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Statistics and Probability ,Class (set theory) ,Variable length memory ,General Mathematics ,Characterization (mathematics) ,Skeleton (category theory) ,01 natural sciences ,Combinatorics ,010104 statistics & probability ,Law of large numbers ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,0101 mathematics ,Mathematics ,Discrete mathematics ,Markov chain ,010102 general mathematics ,Random walk ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,MSC: 60G50, 60J15, 60G17, 60J05, 37B20, 60K35 ,Distribution (mathematics) ,Line (geometry) ,Random walk with undefined mean ,Recurrence and transience ,Statistics, Probability and Uncertainty ,[ MATH.MATH-PR ] Mathematics [math]/Probability [math.PR] ,Persistent and directionally reinforced random walks - Abstract
International audience; We consider a walker on the line that at each step keeps the same direction with a probability which depends on the time already spent in the direction the walker is currently moving. These walks with memories of variable length can be seen as generalizations of directionally reinforced random walks introduced in Mauldin et al. (Adv Math 117(2):239–252, 1996). We give a complete and usable characterization of the recurrence or transience in terms of the probabilities to switch the direction and we formulate some laws of large numbers. The most fruitful situation emerges when the running times both have an infinite mean. In that case, these properties are related to the behaviour of some embedded random walk with an undefined drift so that these features depend on the asymptotics of the distribution tails related to the persistence times. In the other case, the criterion reduces to a null-drift condition. Finally, we deduce some criteria for a wider class of persistent random walks whose increments are encoded by a variable length Markov chain having—in full generality—no renewal pattern in such a way that their study does not reduce to a skeleton RW as for the original model.
- Published
- 2018
4. Characterization of stationary probability measures for Variable Length Markov Chains
- Author
-
Peggy Cénac, Brigitte Chauvin, Frédéric Paccaut, Nicolas Pouyanne, Paccaut, Frédéric, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Laboratoire de Mathématiques de Versailles (LMV), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Laboratoire Amiénois de Mathématique Fondamentale et Appliquée (LAMFA), Université de Picardie Jules Verne (UPJV)-Centre National de la Recherche Scientifique (CNRS), and Laboratoire Amiénois de Mathématique Fondamentale et Appliquée - UMR CNRS 7352 (LAMFA)
- Subjects
[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,[MATH.MATH-PR] Mathematics [math]/Probability [math.PR] ,60J05, 60C05, 60G10 ,Probability (math.PR) ,FOS: Mathematics ,Mathematics - Probability - Abstract
By introducing a key combinatorial structure for words produced by a Variable Length Markov Chain (VLMC), the longest internal suffix, precise characterizations of existence and uniqueness of a stationary probability measure for a VLMC chain are given. These characterizations turn into necessary and sufficient conditions for VLMC associated to a subclass of probabilised context trees: the shift-stable context trees. As a by-product, we prove that a VLMC chain whose stabilized context tree is again a context tree has at most one stationary probability measure., 32 pages
- Published
- 2018
5. Online estimation of the geometric median in Hilbert spaces: Nonasymptotic confidence balls
- Author
-
Hervé Cardot, Peggy Cénac, Antoine Godichon-Baggioni, Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), and Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB)
- Subjects
Statistics and Probability ,Clustering high-dimensional data ,[ MATH ] Mathematics [math] ,0209 industrial biotechnology ,Martingales in Hilbert spaces ,Robust statistics ,02 engineering and technology ,Stochastic approximation ,Stochastic-approximation ,01 natural sciences ,Separable space ,010104 statistics & probability ,symbols.namesake ,020901 industrial engineering & automation ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,Applied mathematics ,62G05 ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,[MATH]Mathematics [math] ,0101 mathematics ,62L20 ,MSC : Primary 62G05 ,secondary 62L20 ,Mathematics ,Spatial median ,Recursive estimation ,Hilbert space ,Estimator ,Stochastic gradient algorithms ,Geometric median ,Functional data ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,Functional data analysis ,symbols ,Statistics, Probability and Uncertainty ,Martingale (probability theory) ,[ MATH.MATH-PR ] Mathematics [math]/Probability [math.PR] - Abstract
International audience; Estimation procedures based on recursive algorithms are interesting and powerful techniques that are able to deal rapidly with very large samples of high dimensional data. The collected data may be contaminated by noise so that robust location indicators, such as the geometric median, may be prefered to the mean. In this context, an estimator of the geometric median based on a fast and efficient averaged nonlinear stochastic gradient algorithm has been developed by [Bernoulli 19 (2013) 18-43]. This work aims at studying more precisely the nonasymptotic behavior of this nonlinear algorithm by giving nonasymptotic confidence balls in general separable Hilbert spaces. This new result is based on the derivation of improved $L^2$ rates of convergence as well as an exponential inequality for the nearly martingale terms of the recursive nonlinear Robbins-Monro algorithm.
- Published
- 2017
6. Probability and algorithmics: a focus on some recent developments
- Author
-
Christelle Rovetta, Peggy Cénac, Mathieu Sablik, Rémi Varloot, Irène Marcovici, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Institut Élie Cartan de Lorraine (IECL), Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS), Département d'informatique de l'École normale supérieure (DI-ENS), École normale supérieure - Paris (ENS Paris), Université Paris sciences et lettres (PSL)-Université Paris sciences et lettres (PSL)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS), Institut de Mathématiques de Toulouse UMR5219 (IMT), Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS), Alcatel-Lucent Bell Labs France [Nozay], Alcatel-Lucent Bell Labs France, Université de Bourgogne (UB)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Centre National de la Recherche Scientifique (CNRS), Département d'informatique - ENS Paris (DI-ENS), École normale supérieure - Paris (ENS-PSL), Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Institut National des Sciences Appliquées (INSA)-Université de Toulouse (UT)-Institut National des Sciences Appliquées (INSA)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), and Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
[INFO.INFO-CC]Computer Science [cs]/Computational Complexity [cs.CC] ,T57-57.97 ,Focus (computing) ,Applied mathematics. Quantitative methods ,Theoretical computer science ,Markov chain ,Computer science ,[MATH.MATH-DS]Mathematics [math]/Dynamical Systems [math.DS] ,[INFO.INFO-DM]Computer Science [cs]/Discrete Mathematics [cs.DM] ,Variable length ,Random walk ,Cellular automaton ,[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] ,Perfect sampling ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,Coupling from the past ,[INFO.INFO-IT]Computer Science [cs]/Information Theory [cs.IT] ,[INFO.INFO-MA]Computer Science [cs]/Multiagent Systems [cs.MA] ,Algorithmics ,[MATH.MATH-CO]Mathematics [math]/Combinatorics [math.CO] ,QA1-939 ,Mathematics - Abstract
Jean-François Coeurjolly, Adeline Leclercq-Samson Eds.; International audience; This article presents different recent theoretical results illustrating the interactions between probability and algorithmics. These contributions deal with various topics: cellular automata and calculability, variable length Markov chains and persistent random walks, perfect sampling via coupling from the past. All of them involve discrete dynamics on complex random structures.; Cet article présente différents résultats récents de nature théorique illustrant les interactions entre probabilités et algorithmique. Ces contributions traitent de sujets variés : automates cellulaires et calculabilité, chaînes de Markov à mémoire variable et marches aléatoires persistantes, échantillonnage parfait par la méthode de couplage par le passé. Leur point commun est de faire intervenir des dynamiques discrètes sur des structures aléatoires complexes.
- Published
- 2017
7. On the convergence of moments in the almost sure central limit theorem for stochastic approximation algorithms
- Author
-
Peggy Cénac
- Subjects
Statistics and Probability ,Quadratic equation ,Real-valued function ,Law of large numbers ,Convergence (routing) ,Mathematical analysis ,Zero (complex analysis) ,Stochastic approximation ,Algorithm ,Central limit theorem ,Mathematics ,Quantile - Abstract
We study the almost sure asymptotic behaviour of stochastic approximation algorithms for the search of zero of a real function. The quadratic strong law of large numbers is extended to the powers greater than one. In other words, the convergence of moments in the almost sure central limit theorem (ASCLT) is established. As a by-product of this convergence, one gets another proof of ASCLT for stochastic approximation algorithms. The convergence result is applied to several examples as estimation of quantiles and recursive estimation of the mean.
- Published
- 2013
8. Persistent random walks. II. Functional Scaling Limits
- Author
-
Basile de Loynes, Yoann Offret, Arnaud Le Ny, Peggy Cénac, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Université de Bourgogne (UB)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Analyse et de Mathématiques Appliquées (LAMA), Université Paris-Est Marne-la-Vallée (UPEM)-Fédération de Recherche Bézout-Université Paris-Est Créteil Val-de-Marne - Paris 12 (UPEC UP12)-Centre National de la Recherche Scientifique (CNRS), Ecole Nationale de la Statistique et de l'Analyse de l'Information (ENSAI), Ensai, Ecole Nationale de la Statistique et de l'Analyse de l'Information, Université de Bourgogne (UB)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Centre National de la Recherche Scientifique (CNRS), Ecole Nationale de la Statistique et de l'Analyse de l'Information [Bruz] (ENSAI), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Centre National de la Recherche Scientifique (CNRS)-Université Paris-Est Créteil Val-de-Marne - Paris 12 (UPEC UP12)-Fédération de Recherche Bézout-Université Paris-Est Marne-la-Vallée (UPEM), Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Laboratoire d'Analyse et de Mathématiques Appliquées ( LAMA ), Université Paris-Est Marne-la-Vallée ( UPEM ) -Fédération de Recherche Bézout-Université Paris-Est Créteil Val-de-Marne - Paris 12 ( UPEC UP12 ) -Centre National de la Recherche Scientifique ( CNRS ), and Ecole Nationale de la Statistique et de l'Analyse de l'Information ( ENSAI )
- Subjects
Statistics and Probability ,Class (set theory) ,Arcsine Lamperti laws ,General Mathematics ,Directionally reinforced random walks ,Anomalous diffusions ,Anomalous diffusion ,Markov process ,Functional scaling limits ,Functional scaling limit ,01 natural sciences ,Lévy process ,010104 statistics & probability ,symbols.namesake ,Persistent random walks ,Mathematics::Probability ,Lévy walks ,FOS: Mathematics ,60F17 . 60G50 . 60J15 . 60G17 . 60J05 . 60G22 . 60K20 ,Statistical physics ,0101 mathematics ,Scaling ,Mathematics ,MSC: 60F17, 60G50, 60J15, 60G17, 60J05, 60G22, 60K20 ,Arcsine Lamperti marginal distributions ,Directionally reinforced random walk ,Stochastic process ,010102 general mathematics ,Mathematical analysis ,Probability (math.PR) ,Cauchy distribution ,Lévy walk ,Random walk ,Connection (mathematics) ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,symbols ,Statistics, Probability and Uncertainty ,[ MATH.MATH-PR ] Mathematics [math]/Probability [math.PR] ,Mathematics - Probability ,Persistent random walk - Abstract
We give a complete and unified description -- under some stability assumptions -- of the functional scaling limits associated with some persistent random walks for which the recurrent or transient type is studied in [1]. As a result, we highlight a phase transition phenomenon with respect to the memory. It turns out that the limit process is either Markovian or not according to -- to put it in a nutshell -- the rate of decrease of the distribution tails corresponding to the persistent times. In the memoryless situation, the limits are classical strictly stable L{\'e}vy processes of infinite variations. However, we point out that the description of the critical Cauchy case fills some lacuna even in the closely related context of Directionally Reinforced Random Walks (DRRWs) for which it has not been considered yet. Besides, we need to introduced some relevant generalized drift -- extended the classical one -- in order to study the critical case but also the situation when the limit is no longer Markovian. It appears to be in full generality a drift in mean for the Persistent Random Walk (PRW). The limit processes keeping some memory -- given by some variable length Markov chain -- of the underlying PRW are called arcsine Lamperti anomalous diffusions due to their marginal distribution which are computed explicitly here. To this end, we make the connection with the governing equations for L{\'e}vy walks, the occupation times of skew Bessel processes and a more general class modelled on Lamperti processes. We also stress that we clarify some misunderstanding regarding this marginal distribution in the framework of DRRWs. Finally, we stress that the latter situation is more flexible -- as in the first paper -- in the sense that the results can be easily generalized to a wider class of PRWs without renewal pattern.
- Published
- 2016
- Full Text
- View/download PDF
9. Some multivariate risk indicators: Minimization by using a Kiefer–Wolfowitz approach to the mirror stochastic algorithm
- Author
-
Véronique Maume-Deschamps, Peggy Cénac, Clémentine Prieur, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Université de Bourgogne (UB)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Sciences Actuarielle et Financière (SAF), Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-Université de Lyon, Modelling, Observations, Identification for Environmental Sciences (MOISE), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Kuntzmann (LJK), Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS), ANR-08-BLAN-0314,AST&Risk(2008), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Laboratoire de Sciences Actuarielle et Financière ( SAF ), Université Claude Bernard Lyon 1 ( UCBL ), Modelling, Observations, Identification for Environmental Sciences ( MOISE ), Institut National de Recherche en Informatique et en Automatique ( Inria ) -Institut National de Recherche en Informatique et en Automatique ( Inria ) -Laboratoire Jean Kuntzmann ( LJK ), Université Pierre Mendès France - Grenoble 2 ( UPMF ) -Université Joseph Fourier - Grenoble 1 ( UJF ) -Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique ( CNRS ) -Université Grenoble Alpes ( UGA ) -Université Pierre Mendès France - Grenoble 2 ( UPMF ) -Université Joseph Fourier - Grenoble 1 ( UJF ) -Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique ( CNRS ) -Université Grenoble Alpes ( UGA ) -Institut National Polytechnique de Grenoble ( INPG ), ANR : 17252,17252, Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Laboratoire Jean Kuntzmann (LJK), Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Centre National de la Recherche Scientifique (CNRS)-Inria Grenoble - Rhône-Alpes, and Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)
- Subjects
Statistics and Probability ,0209 industrial biotechnology ,Multivariate statistics ,Mathematical optimization ,02 engineering and technology ,01 natural sciences ,010104 statistics & probability ,020901 industrial engineering & automation ,Stochastic algorithms ,Risk indicators ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,Econometrics ,Total capital ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,0101 mathematics ,Mathematics ,Optimal allocation ,MSC: 62H12, 62L20, 90C15 ,[STAT.TH]Statistics [stat]/Statistics Theory [stat.TH] ,[ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH] ,Constraint (information theory) ,Modeling and Simulation ,Multivariate risk processes ,Minification ,Statistics, Probability and Uncertainty - Abstract
We consider some risk indicators of vectorial risk processes. These indicators take into account the dependencies between business lines as well as some temporal dependencies. By using stochastic algorithms, we may estimate the minimum of these risk indicators, under a fixed total capital constraint. This minimization may apply to capital reserve allocation.
- Published
- 2012
10. On the Almost Sure Central Limit Theorem for Vector Martingales: Convergence of Moments and Statistical Applications
- Author
-
Bernard Bercu, Peggy Cénac, and Guy Fayolle
- Subjects
Statistics and Probability ,Conjecture ,General Mathematics ,Probability (math.PR) ,010102 general mathematics ,01 natural sciences ,010104 statistics & probability ,Probability theory ,Convergence of random variables ,Autoregressive model ,Mathematics::Probability ,Calculus ,FOS: Mathematics ,Increasing process ,Applied mathematics ,0101 mathematics ,Statistics, Probability and Uncertainty ,Martingale (probability theory) ,Mathematics - Probability ,Mathematics ,Branching process ,Central limit theorem - Abstract
We investigate the almost sure asymptotic properties of vector martingale transforms. Assuming some appropriate regularity conditions both on the increasing process and on the moments of the martingale, we prove that normalized moments of any even order converge in the almost sure central limit theorem for martingales. A conjecture about almost sure upper bounds under wider hypotheses is formulated. The theoretical results are supported by examples borrowed from statistical applications, including linear autoregressive models and branching processes with immigration, for which new asymptotic properties are established on estimation and prediction errors.
- Published
- 2009
11. Risk indicators with several lines of business: comparison, asymptotic behavior and applications to optimal reserve allocation
- Author
-
Peggy Cénac, Stéphane Loisel, Véronique Maume-Deschamps, Clémentine Prieur, Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Laboratoire de Sciences Actuarielle et Financière ( SAF ), Université Claude Bernard Lyon 1 ( UCBL ), Université de Lyon-Université de Lyon, Institut Camille Jordan [Villeurbanne] ( ICJ ), École Centrale de Lyon ( ECL ), Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 ( UCBL ), Université de Lyon-Institut National des Sciences Appliquées de Lyon ( INSA Lyon ), Université de Lyon-Institut National des Sciences Appliquées ( INSA ) -Institut National des Sciences Appliquées ( INSA ) -Université Jean Monnet [Saint-Étienne] ( UJM ) -Centre National de la Recherche Scientifique ( CNRS ), Modelling, Observations, Identification for Environmental Sciences ( MOISE ), Inria Grenoble - Rhône-Alpes, Institut National de Recherche en Informatique et en Automatique ( Inria ) -Institut National de Recherche en Informatique et en Automatique ( Inria ) -Laboratoire Jean Kuntzmann ( LJK ), Université Pierre Mendès France - Grenoble 2 ( UPMF ) -Université Joseph Fourier - Grenoble 1 ( UJF ) -Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique ( CNRS ) -Université Grenoble Alpes ( UGA ) -Université Pierre Mendès France - Grenoble 2 ( UPMF ) -Université Joseph Fourier - Grenoble 1 ( UJF ) -Institut Polytechnique de Grenoble - Grenoble Institute of Technology-Centre National de la Recherche Scientifique ( CNRS ) -Université Grenoble Alpes ( UGA ) -Institut National Polytechnique de Grenoble ( INPG ), Méthodes d'Analyse Stochastique des Codes et Traitements Numériques ( GdR MASCOT-NUM ), Centre National de la Recherche Scientifique ( CNRS ), Chaire Management de la modélisation, BNP Paribas Cardif, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Université de Bourgogne (UB)-Centre National de la Recherche Scientifique (CNRS), Laboratoire de Sciences Actuarielle et Financière (SAF), Université Claude Bernard Lyon 1 (UCBL), Institut Camille Jordan (ICJ), École Centrale de Lyon (ECL), Université de Lyon-Université de Lyon-Université Claude Bernard Lyon 1 (UCBL), Université de Lyon-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université Jean Monnet - Saint-Étienne (UJM)-Centre National de la Recherche Scientifique (CNRS), Probabilités, statistique, physique mathématique (PSPM), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université Jean Monnet - Saint-Étienne (UJM)-Centre National de la Recherche Scientifique (CNRS)-École Centrale de Lyon (ECL), Modelling, Observations, Identification for Environmental Sciences (MOISE), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire Jean Kuntzmann (LJK), Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS), Méthodes d'Analyse Stochastique des Codes et Traitements Numériques (GdR MASCOT-NUM), Centre National de la Recherche Scientifique (CNRS), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Institut Camille Jordan [Villeurbanne] (ICJ), Université de Lyon-Université Jean Monnet [Saint-Étienne] (UJM)-Institut National des Sciences Appliquées de Lyon (INSA Lyon), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS), Institut National des Sciences Appliquées (INSA)-Université de Lyon-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-École Centrale de Lyon (ECL), Maume-Deschamps, Véronique, Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS), Université de Lyon-Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Centre National de la Recherche Scientifique (CNRS)-École Centrale de Lyon (ECL), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Laboratoire Jean Kuntzmann (LJK), Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Centre National de la Recherche Scientifique (CNRS)-Inria Grenoble - Rhône-Alpes, and Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)
- Subjects
[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,capital allocation ,[ QFIN.RM ] Quantitative Finance [q-fin]/Risk Management [q-fin.RM] ,[MATH.MATH-PR] Mathematics [math]/Probability [math.PR] ,risk indicators,dependent lines of business,capital allocation ,dependent lines of business ,risk indicators ,[QFIN.RM] Quantitative Finance [q-fin]/Risk Management [q-fin.RM] ,[QFIN.RM]Quantitative Finance [q-fin]/Risk Management [q-fin.RM] ,[ MATH.MATH-PR ] Mathematics [math]/Probability [math.PR] - Abstract
International audience; In a multi-dimensional risk model with dependent lines of business, we propose to allocate capital with respect to the minimization of some risk indicators. These indicators are sums of expected penalties due to the insolvency of a branch while the global reserve is either positive or negative. Explicit formulas in the case of two branches are obtained for several models independent exponential, correlated Pareto). The asymptotic behavior (as the initial capital goes to infinity) is studied. For higher dimension and several periods, no explicit expression is available. Using a stochastic algorithm, we get estimations of the allocation, compare the different allocations and study the impact of dependence.
- Published
- 2014
12. Efficient and fast estimation of the geometric median in Hilbert spaces with an averaged stochastic gradient algorithm
- Author
-
Pierre-André Zitt, Hervé Cardot, Peggy Cénac, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), and Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS )
- Subjects
Statistics and Probability ,Clustering high-dimensional data ,Robbins–Monro algorithm ,online algorithms ,high dimension ,CLT ,Asymptotic distribution ,Mathematics - Statistics Theory ,Statistics Theory (math.ST) ,geometric quantiles ,01 natural sciences ,MSC 2010 : 62L20 (62G05, 62G35, 60B12,68W27) ,010104 statistics & probability ,symbols.namesake ,$L^{1}$-median ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,Robustness (computer science) ,FOS: Mathematics ,Robbins-Monro algorithm ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,Differentiable function ,0101 mathematics ,Online algorithm ,L1-median ,functional data ,Mathematics ,Probability (math.PR) ,spatial median ,010102 general mathematics ,Hilbert space ,Estimator ,[STAT.TH]Statistics [stat]/Statistics Theory [stat.TH] ,Geometric median ,[ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH] ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,symbols ,[ MATH.MATH-PR ] Mathematics [math]/Probability [math.PR] ,Algorithm ,Mathematics - Probability ,recursive estimation - Abstract
International audience; With the progress of measurement apparatus and the development of automatic sensors it is not unusual anymore to get thousands of samples of observations taking values in high dimension spaces such as functional spaces. In such large samples of high dimensional data, outlying curves may not be uncommon and even a few individuals may corrupt simple statistical indicators such as the mean trajectory. We focus here on the estimation of the geometric median which is a direct generalization of the real median and has nice robustness properties. The geometric median being defined as the minimizer of a simple convex functional that is differentiable everywhere when the distribution has no atoms, it is possible to estimate it with online gradient algorithms. Such algorithms are very fast and can deal with large samples. Furthermore they also can be simply updated when the data arrive sequentially. We state the almost sure consistency and the L2 rates of convergence of the stochastic gradient estimator as well as the asymptotic normality of its averaged version. We get that the asymptotic distribution of the averaged version of the algorithm is the same as the classic estimators which are based on the minimization of the empirical loss function. The performances of our averaged sequential estimator, both in terms of computation speed and accuracy of the estimations, are evaluated with a small simulation study. Our approach is also illustrated on a sample of more 5000 individual television audiences measured every second over a period of 24 hours.
- Published
- 2013
13. A fast and recursive algorithm for clustering large datasets with k-medians
- Author
-
Hervé Cardot, Jean-Marie Monnez, Peggy Cénac, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Université de Bourgogne (UB)-Centre National de la Recherche Scientifique (CNRS), Institut Élie Cartan de Nancy (IECN), Institut National de Recherche en Informatique et en Automatique (Inria)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS), Biology, genetics and statistics (BIGS), Institut National de Recherche en Informatique et en Automatique (Inria)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université Henri Poincaré - Nancy 1 (UHP)-Université Nancy 2-Institut National Polytechnique de Lorraine (INPL)-Centre National de la Recherche Scientifique (CNRS)-INRIA Lorraine, Institut National de Recherche en Informatique et en Automatique (Inria), BIGS, Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Probabilités et statistiques, Institut Élie Cartan de Lorraine ( IECL ), Université de Lorraine ( UL ) -Centre National de la Recherche Scientifique ( CNRS ) -Université de Lorraine ( UL ) -Centre National de la Recherche Scientifique ( CNRS ), Institut Élie Cartan de Nancy ( IECN ), Institut National de Recherche en Informatique et en Automatique ( Inria ) -Université Henri Poincaré - Nancy 1 ( UHP ) -Université Nancy 2-Institut National Polytechnique de Lorraine ( INPL ) -Centre National de la Recherche Scientifique ( CNRS ), Biology, genetics and statistics ( BIGS ), Institut National de Recherche en Informatique et en Automatique ( Inria ) -Université Henri Poincaré - Nancy 1 ( UHP ) -Université Nancy 2-Institut National Polytechnique de Lorraine ( INPL ) -Centre National de la Recherche Scientifique ( CNRS ) -Institut National de Recherche en Informatique et en Automatique ( Inria ) -Université Henri Poincaré - Nancy 1 ( UHP ) -Université Nancy 2-Institut National Polytechnique de Lorraine ( INPL ) -Centre National de la Recherche Scientifique ( CNRS ) -INRIA Lorraine, Institut National de Recherche en Informatique et en Automatique ( Inria ), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique de Lorraine (INPL)-Université Nancy 2-Université Henri Poincaré - Nancy 1 (UHP)-Institut National de Recherche en Informatique et en Automatique (Inria), and Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique de Lorraine (INPL)-Université Nancy 2-Université Henri Poincaré - Nancy 1 (UHP)-Institut National de Recherche en Informatique et en Automatique (Inria)-Centre National de la Recherche Scientifique (CNRS)-Institut National Polytechnique de Lorraine (INPL)-Université Nancy 2-Université Henri Poincaré - Nancy 1 (UHP)-Institut National de Recherche en Informatique et en Automatique (Inria)-INRIA Lorraine
- Subjects
Statistics and Probability ,Clustering high-dimensional data ,FOS: Computer and information sciences ,Mathematical optimization ,high dimensional data ,Machine Learning (stat.ML) ,02 engineering and technology ,Stochastic approximation ,01 natural sciences ,Statistics - Computation ,010104 statistics & probability ,k-medoids ,Statistics - Machine Learning ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,stochastic approximation ,0202 electrical engineering, electronic engineering, information engineering ,Computational statistics ,recursive estimators ,Almost surely ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,0101 mathematics ,Cluster analysis ,Computation (stat.CO) ,Mathematics ,averaging ,Robbins Monro ,Applied Mathematics ,Estimator ,[STAT.TH]Statistics [stat]/Statistics Theory [stat.TH] ,stochastic gradient ,[ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH] ,Medoid ,Computational Mathematics ,Computational Theory and Mathematics ,online clustering ,020201 artificial intelligence & image processing ,partitioning around medoids ,Algorithm - Abstract
Clustering with fast algorithms large samples of high dimensional data is an important challenge in computational statistics. Borrowing ideas from MacQueen (1967) who introduced a sequential version of the $k$-means algorithm, a new class of recursive stochastic gradient algorithms designed for the $k$-medians loss criterion is proposed. By their recursive nature, these algorithms are very fast and are well adapted to deal with large samples of data that are allowed to arrive sequentially. It is proved that the stochastic gradient algorithm converges almost surely to the set of stationary points of the underlying loss criterion. A particular attention is paid to the averaged versions, which are known to have better performances, and a data-driven procedure that allows automatic selection of the value of the descent step is proposed. The performance of the averaged sequential estimator is compared on a simulation study, both in terms of computation speed and accuracy of the estimations, with more classical partitioning techniques such as $k$-means, trimmed $k$-means and PAM (partitioning around medoids). Finally, this new online clustering technique is illustrated on determining television audience profiles with a sample of more than 5000 individual television audiences measured every minute over a period of 24 hours., Comment: Under revision for Computational Statistics and Data Analysis
- Published
- 2012
14. Context Trees, Variable Length Markov Chains and Dynamical Sources
- Author
-
Nicolas Pouyanne, Peggy Cénac, Frédéric Paccaut, Brigitte Chauvin, Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Laboratoire de Mathématiques de Versailles ( LMV ), Université Paris-Saclay-Centre National de la Recherche Scientifique ( CNRS ) -Université de Versailles Saint-Quentin-en-Yvelines ( UVSQ ), Algorithms ( ALGORITHMS ), Inria Paris-Rocquencourt, Institut National de Recherche en Informatique et en Automatique ( Inria ) -Institut National de Recherche en Informatique et en Automatique ( Inria ), Laboratoire Amiénois de Mathématique Fondamentale et Appliquée ( LAMFA ), Université de Picardie Jules Verne ( UPJV ) -Centre National de la Recherche Scientifique ( CNRS ), Catherine Donati-Martin and Antoine Lejay and Alain Rouault, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Laboratoire de Mathématiques de Versailles (LMV), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Algorithms (ALGORITHMS), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Laboratoire Amiénois de Mathématique Fondamentale et Appliquée - UMR CNRS 7352 (LAMFA), Université de Picardie Jules Verne (UPJV)-Centre National de la Recherche Scientifique (CNRS), and Université de Bourgogne (UB)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Discrete mathematics ,Pure mathematics ,Stationary distribution ,Markov chain ,010102 general mathematics ,Probabilistic dynamical sources ,Probabilistic logic ,Context (language use) ,Information theory ,Variable length Markov chains ,01 natural sciences ,Measure (mathematics) ,Occurrences of words ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,010104 statistics & probability ,symbols.namesake ,symbols ,Uniqueness ,Dynamical systems of the interval ,Dirichlet series ,0101 mathematics ,[ MATH.MATH-PR ] Mathematics [math]/Probability [math.PR] ,Mathematics - Abstract
Infinite random sequences of letters can be viewed as stochastic chains or as strings produced by a source, in the sense of information theory. The relationship between Variable Length Markov Chains (VLMC) and probabilistic dynamical sources is studied. We establish a probabilistic frame for context trees and VLMC and we prove that any VLMC is a dynamical source for which we explicitly build the mapping. On two examples, the "comb" and the "bamboo blossom", we find a necessary and sufficient condition for the existence and the uniqueness of a stationary probability measure for the VLMC. These two examples are detailed in order to provide the associated Dirichlet series as well as the generating functions of word occurrences.
- Published
- 2012
15. Recursive estimation of the conditional geometric median in Hilbert spaces
- Author
-
Peggy Cénac, Hervé Cardot, Pierre-André Zitt, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), and Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS )
- Subjects
Statistics and Probability ,Mallows-Wasserstein distance ,Robbins-Monro ,asymptotic normality ,CLT ,central limit theorem ,Asymptotic distribution ,Mathematics - Statistics Theory ,Statistics Theory (math.ST) ,01 natural sciences ,Mallows–Wasserstein distance ,online data ,010104 statistics & probability ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,60F05 ,FOS: Mathematics ,Applied mathematics ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,0101 mathematics ,62L20 ,Mathematics ,averaging ,Sequential estimation ,010102 general mathematics ,Estimator ,Robbins–Monro ,Conditional probability distribution ,[STAT.TH]Statistics [stat]/Statistics Theory [stat.TH] ,Geometric median ,stochastic gradient ,[ STAT.TH ] Statistics [stat]/Statistics Theory [stat.TH] ,robust estimator ,Rate of convergence ,Convergence of random variables ,Stochastic gradient ,kernel regression ,sequential estimation ,Kernel regression ,Statistics, Probability and Uncertainty - Abstract
International audience; A recursive estimator of the conditional geometric median in Hilbert spaces is studied. It is based on a stochastic gradient algorithm whose aim is to minimize a weighted L1 criterion and is consequently well adapted for robust online estimation. The weights are controlled by a kernel function and an associated bandwidth. Almost sure convergence and L2 rates of convergence are proved under general conditions on the conditional distribution as well as the sequence of descent steps of the algorithm and the sequence of bandwidths. Asymptotic normality is also proved for the averaged version of the algorithm with an optimal rate of convergence. A simulation study confirms the interest of this new and fast algorithm when the sample sizes are large. Finally, the ability of these recursive algorithms to deal with very high-dimensional data is illustrated on the robust estimation of television audience profiles conditional on the total time spent watching television over a period of 24 hours.
- Published
- 2012
16. Uncommon Suffix Tries
- Author
-
Nicolas Pouyanne, Peggy Cénac, Brigitte Chauvin, Frédéric Paccaut, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Laboratoire de Mathématiques de Versailles (LMV), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Laboratoire Amiénois de Mathématique Fondamentale et Appliquée (LAMFA), Université de Picardie Jules Verne (UPJV)-Centre National de la Recherche Scientifique (CNRS), Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Laboratoire de Mathématiques de Versailles ( LMV ), Université Paris-Saclay-Centre National de la Recherche Scientifique ( CNRS ) -Université de Versailles Saint-Quentin-en-Yvelines ( UVSQ ), Laboratoire Amiénois de Mathématique Fondamentale et Appliquée ( LAMFA ), and Université de Picardie Jules Verne ( UPJV ) -Centre National de la Recherche Scientifique ( CNRS )
- Subjects
FOS: Computer and information sciences ,Compressed suffix array ,Polynomial ,Logarithm ,General Mathematics ,Suffix tree ,variable length Markov chain ,[INFO.INFO-DS]Computer Science [cs]/Data Structures and Algorithms [cs.DS] ,Generalized suffix tree ,probabilistic source ,0102 computer and information sciences ,02 engineering and technology ,suffix trie ,01 natural sciences ,law.invention ,Combinatorics ,law ,Computer Science - Data Structures and Algorithms ,Trie ,FOS: Mathematics ,0202 electrical engineering, electronic engineering, information engineering ,Data Structures and Algorithms (cs.DS) ,Mixing (physics) ,[ INFO.INFO-DS ] Computer Science [cs]/Data Structures and Algorithms [cs.DS] ,Mathematics ,Discrete mathematics ,Applied Mathematics ,Probability (math.PR) ,020206 networking & telecommunications ,Computer Graphics and Computer-Aided Design ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,010201 computation theory & mathematics ,mixing properties ,60J05, 37E05 ,Suffix ,[ MATH.MATH-PR ] Mathematics [math]/Probability [math.PR] ,Mathematics - Probability ,Software - Abstract
Common assumptions on the source producing the words inserted in a suffix trie with $n$ leaves lead to a $\log n$ height and saturation level. We provide an example of a suffix trie whose height increases faster than a power of $n$ and another one whose saturation level is negligible with respect to $\log n$. Both are built from VLMC (Variable Length Markov Chain) probabilistic sources; they are easily extended to families of sources having the same properties. The first example corresponds to a ''logarithmic infinite comb'' and enjoys a non uniform polynomial mixing. The second one corresponds to a ''factorial infinite comb'' for which mixing is uniform and exponential.
- Published
- 2011
17. Digital search trees and chaos game representation
- Author
-
Nicolas Pouyanne, Brigitte Chauvin, Peggy Cénac, Stéphane Ginouillac, Institut de Mathématiques de Toulouse UMR5219 (IMT), Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université Toulouse 1 Capitole (UT1), Université Fédérale Toulouse Midi-Pyrénées-Université Fédérale Toulouse Midi-Pyrénées-Université Toulouse - Jean Jaurès (UT2J)-Université Toulouse III - Paul Sabatier (UT3), Université Fédérale Toulouse Midi-Pyrénées-Centre National de la Recherche Scientifique (CNRS), Probability, modelling and evaluation of information processing systems (PREVAL), Inria Paris-Rocquencourt, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Laboratoire de Mathématiques de Versailles (LMV), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Université Toulouse Capitole (UT Capitole), Université de Toulouse (UT)-Université de Toulouse (UT)-Institut National des Sciences Appliquées - Toulouse (INSA Toulouse), Institut National des Sciences Appliquées (INSA)-Université de Toulouse (UT)-Institut National des Sciences Appliquées (INSA)-Université Toulouse - Jean Jaurès (UT2J), Université de Toulouse (UT)-Université Toulouse III - Paul Sabatier (UT3), and Université de Toulouse (UT)-Centre National de la Recherche Scientifique (CNRS)
- Subjects
Statistics and Probability ,Markov process ,digital search trees ,MSC primary 60C05, 68R15 ,secondary 92D20, 05D40 ,92D20, 05D40 ,Quantitative Biology - Quantitative Methods ,01 natural sciences ,Combinatorics ,insertion depth ,010104 statistics & probability ,03 medical and health sciences ,symbols.namesake ,Tree (descriptive set theory) ,Random tree ,60C05, 68R15 ,FOS: Mathematics ,lengths of paths ,0101 mathematics ,Representation (mathematics) ,Quantitative Methods (q-bio.QM) ,030304 developmental biology ,Mathematics ,Discrete mathematics ,0303 health sciences ,Sequence ,CGR ,Digital search trees ,Probability (math.PR) ,Random trees ,Search tree ,strong convergence ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,asymptotic growth ,Distribution (mathematics) ,FOS: Biological sciences ,symbols ,Mathematics - Probability ,Computer Science::Formal Languages and Automata Theory ,height - Abstract
Version préliminaire (2006) d'un travail publié sous forme définitive (2009).; International audience; In this paper, we consider a possible representation of a DNA sequence in a quaternary tree, in which on can visualize repetitions of subwords. The CGR-tree turns a sequence of letters into a digital search tree (DST), obtained from the suffixes of the reversed sequence. Several results are known concerning the height and the insertion depth for DST built from i.i.d. successive sequences. Here, the successive inserted wors are strongly dependent. We give the asymptotic behaviour of the insertion depth and of the length of branches for the CGR-tree obtained from the suffixes of reversed i.i.d. or Markovian sequence. This behaviour turns out to be at first order the same one as in the case of independent words. As a by-product, asymptotic results on the length of longest runs in a Markovian sequence are obtained.
- Published
- 2009
18. Test on the Structure of Biological Sequences via Chaos Game Representation
- Author
-
Peggy Cénac
- Subjects
Statistics and Probability ,Discrete mathematics ,Markov chain ,Chaos game ,Computational Mathematics ,Chaos game representation ,Iterated function system ,Genetics ,Ergodic theory ,Martingale (probability theory) ,Molecular Biology ,Algorithm ,Randomness ,Mathematics ,Statistical hypothesis testing - Abstract
In this paper biological sequences are modelled by stationary ergodic sequences. A new family of statistical tests to characterize the randomness of the inputs is proposed and analyzed. Tests for independence and for the determination of the appropriate order of a Markov chain are constructed with the Chaos Game Representation (CGR), and applied to several genomes.
- Published
- 2005
19. Variable Length Markov Chains, Persistent Random Walks: a close encounter
- Author
-
Peggy Cénac, Frédéric Paccaut, Nicolas Pouyanne, Brigitte Chauvin, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB), Laboratoire de Mathématiques de Versailles (LMV), Université de Versailles Saint-Quentin-en-Yvelines (UVSQ)-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), Laboratoire Amiénois de Mathématique Fondamentale et Appliquée - UMR CNRS 7352 (LAMFA), Université de Picardie Jules Verne (UPJV)-Centre National de la Recherche Scientifique (CNRS), Paccaut, Frédéric, and université de Bourgogne, IMB
- Subjects
[MATH.MATH-PR] Mathematics [math]/Probability [math.PR] ,Property (philosophy) ,Markov chain ,010102 general mathematics ,Probability (math.PR) ,Close encounter ,Variable length ,Random walk ,01 natural sciences ,[MATH.MATH-PR]Mathematics [math]/Probability [math.PR] ,010104 statistics & probability ,FOS: Mathematics ,Point (geometry) ,Statistical physics ,0101 mathematics ,Mathematics - Probability ,Mathematics - Abstract
This is the story of the encounter between two worlds: the world of random walks and the world of Variable Length Markov Chains (VLMC). The meeting point turns around the semi-Markov property of underlying processes., Comment: Book chapter
20. Algorithmes stochastiques pour la statistique robuste en grande dimension
- Author
-
Godichon-Baggioni, Antoine, Institut de Mathématiques de Bourgogne [Dijon] ( IMB ), Université de Bourgogne ( UB ) -Centre National de la Recherche Scientifique ( CNRS ), Université de Bourgogne, Hervé Cardot, Peggy Cénac, Institut de Mathématiques de Bourgogne [Dijon] (IMB), Centre National de la Recherche Scientifique (CNRS)-Université de Franche-Comté (UFC), and Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université de Bourgogne (UB)
- Subjects
Stochastic Algorithms ,Algorithmes Stochastiques ,Algorithmes Récursifs ,Recursive Algorithms ,Statistique Robuste ,Algorithmes de Gradient Stochastiques ,Stochastic Gradient Algorithms ,Averaging ,[MATH.MATH-ST]Mathematics [math]/Statistics [math.ST] ,Moyennisation ,Grande Dimension ,Robust Statistics ,[ MATH.MATH-ST ] Mathematics [math]/Statistics [math.ST] ,Functional Data ,Données Fonctionnelles ,Geometric Median ,High Dimension ,Médiane Géométrique - Abstract
This thesis focus on stochastic algorithms in high dimension as well as their application in robust statistics. In what follows, the expression high dimension may be used when the the size of the studied sample is large or when the variables we consider take values in high dimensional spaces (not necessarily finite). In order to analyze these kind of data, it can be interesting to consider algorithms which are fast, which do not need to store all the data, and which allow to update easily the estimates. In large sample of high dimensional data, outliers detection is often complicated. Nevertheless, these outliers, even if they are not many, can strongly disturb simple indicators like the mean and the covariance. We will focus on robust estimates, which are not too much sensitive to outliers.In a first part, we are interested in the recursive estimation of the geometric median, which is a robust indicator of location which can so be preferred to the mean when a part of the studied data is contaminated. For this purpose, we introduce a Robbins-Monro algorithm as well as its averaged version, before building non asymptotic confidence balls for these estimates, and exhibiting their $L^{p}$ and almost sure rates of convergence.In a second part, we focus on the estimation of the Median Covariation Matrix (MCM), which is a robust dispersion indicator linked to the geometric median. Furthermore, if the studied variable has a symmetric law, this indicator has the same eigenvectors as the covariance matrix. This last property represent a real interest to study the MCM, especially for Robust Principal Component Analysis. We so introduce a recursive algorithm which enables us to estimate simultaneously the geometric median, the MCM, and its $q$ main eigenvectors. We give, in a first time, the strong consistency of the estimators of the MCM, before exhibiting their rates of convergence in quadratic mean.In a third part, in the light of the work on the estimates of the median and of the Median Covariation Matrix, we exhibit the almost sure and $L^{p}$ rates of convergence of averaged stochastic gradient algorithms in Hilbert spaces, with less restrictive assumptions than in the literature. Then, two applications in robust statistics are given: estimation of the geometric quantiles and application in robust logistic regression.In the last part, we aim to fit a sphere on a noisy points cloud spread around a complete or truncated sphere. More precisely, we consider a random variable with a truncated spherical distribution, and we want to estimate its center as well as its radius. In this aim, we introduce a projected stochastic gradient algorithm and its averaged version. We establish the strong consistency of these estimators as well as their rates of convergence in quadratic mean. Finally, the asymptotic normality of the averaged algorithm is given.; Cette thèse porte sur l'étude d'algorithmes stochastiques en grande dimension ainsi qu'à leur application en statistique robuste. Dans la suite, l'expression grande dimension pourra aussi bien signifier que la taille des échantillons étudiés est grande ou encore que les variables considérées sont à valeurs dans des espaces de grande dimension (pas nécessairement finie). Afin d'analyser ce type de données, il peut être avantageux de considérer des algorithmes qui soient rapides, qui ne nécessitent pas de stocker toutes les données, et qui permettent de mettre à jour facilement les estimations. Dans de grandes masses de données en grande dimension, la détection automatique de points atypiques est souvent délicate. Cependant, ces points, même s'ils sont peu nombreux, peuvent fortement perturber des indicateurs simples tels que la moyenne ou la covariance. On va se concentrer sur des estimateurs robustes, qui ne sont pas trop sensibles aux données atypiques. Dans une première partie, on s'intéresse à l'estimation récursive de la médiane géométrique, un indicateur de position robuste, et qui peut donc être préférée à la moyenne lorsqu'une partie des données étudiées est contaminée. Pour cela, on introduit un algorithme de Robbins-Monro ainsi que sa version moyennée, avant de construire des boules de confiance non asymptotiques et d'exhiber leurs vitesses de convergence $L^{p}$ et presque sûre.La deuxième partie traite de l'estimation de la "Median Covariation Matrix" (MCM), qui est un indicateur de dispersion robuste lié à la médiane, et qui, si la variable étudiée suit une loi symétrique, a les mêmes sous-espaces propres que la matrice de variance-covariance. Ces dernières propriétés rendent l'étude de la MCM particulièrement intéressante pour l'Analyse en Composantes Principales Robuste. On va donc introduire un algorithme itératif qui permet d'estimer simultanément la médiane géométrique et la MCM ainsi que les $q$ principaux vecteurs propres de cette dernière. On donne, dans un premier temps, la forte consistance des estimateurs de la MCM avant d'exhiber les vitesses de convergence en moyenne quadratique.Dans une troisième partie, en s'inspirant du travail effectué sur les estimateurs de la médiane et de la "Median Covariation Matrix", on exhibe les vitesses de convergence presque sûre et $L^{p}$ des algorithmes de gradient stochastiques et de leur version moyennée dans des espaces de Hilbert, avec des hypothèses moins restrictives que celles présentes dans la littérature. On présente alors deux applications en statistique robuste: estimation de quantiles géométriques et régression logistique robuste.Dans la dernière partie, on cherche à ajuster une sphère sur un nuage de points répartis autour d'une sphère complète où tronquée. Plus précisément, on considère une variable aléatoire ayant une distribution sphérique tronquée, et on cherche à estimer son centre ainsi que son rayon. Pour ce faire, on introduit un algorithme de gradient stochastique projeté et son moyenné. Sous des hypothèses raisonnables, on établit leurs vitesses de convergence en moyenne quadratique ainsi que la normalité asymptotique de l'algorithme moyenné.
- Published
- 2016
Catalog
Discovery Service for Jio Institute Digital Library
For full access to our library's resources, please sign in.