14 results on '"Geenens, Gery"'
Search Results
2. Towards a universal representation of statistical dependence
- Author
- Geenens, Gery
- Subjects
Mathematics - Statistics Theory, Statistics - Methodology
- Abstract
Dependence is undoubtedly a central concept in statistics. Yet it proves difficult to locate in the literature a formal definition which goes beyond the self-evident 'dependence = non-independence'. This absence has allowed the term 'dependence' and its declination to be used vaguely and indiscriminately for qualifying a variety of disparate notions, leading to numerous incongruities. For example, the classical Pearson's, Spearman's or Kendall's correlations are widely regarded as 'dependence measures' of major interest, in spite of returning 0 in some cases of deterministic relationships between the variables at play, evidently not measuring dependence at all. Arguing that research on such a fundamental topic would benefit from a slightly more rigid framework, this paper suggests a general definition of the dependence between two random variables defined on the same probability space. Natural enough for aligning with intuition, that definition is still sufficiently precise for allowing unequivocal identification of a 'universal' representation of the dependence structure of any bivariate distribution. Links between this representation and familiar concepts are highlighted, and ultimately, the idea of a dependence measure based on that universal representation is explored and shown to satisfy Rényi's postulates.
- Published
- 2023
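The abstract's remark that a correlation can return 0 under a deterministic relationship can be illustrated with a minimal sketch. The toy data and the helper name `pearson` are assumptions made for illustration; they do not come from the paper.

```python
import math

def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical toy data: y is a deterministic function of x (y = x^2),
# but x is symmetric about 0, so the linear association cancels out.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [x ** 2 for x in xs]
r = pearson(xs, ys)  # numerically zero although y is fully determined by x
```

Here y is known exactly once x is known, yet the coefficient vanishes, which is the kind of incongruity the abstract refers to.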
3. Hellinger-Bhattacharyya cross-validation for shape-preserving multivariate wavelet thresholding
- Author
- Aya-Moreno, Carlos, Geenens, Gery, and Penev, Spiridon
- Subjects
Statistics - Methodology, Mathematics - Statistics Theory
- Abstract
The benefits of the wavelet approach for density estimation are well established in the literature, especially when the density to estimate is irregular or heterogeneous in smoothness. However, wavelet density estimates are typically not bona fide densities. In Aya-Moreno et al. (2018), a 'shape-preserving' wavelet density estimator was introduced, whose main step is the estimation of the square root of the density. A natural concept involving square roots of densities is the Hellinger distance or, equivalently, the Bhattacharyya affinity coefficient. In this paper, we deliver a fully data-driven version of the above 'shape-preserving' wavelet density estimator, where all user-defined parameters, such as resolution level or thresholding specifications, are selected by optimising an original leave-one-out version of the Hellinger-Bhattacharyya criterion. The theoretical optimality of the proposed procedure is established, while simulations show the strong practical performance of the estimator. Within that framework, we also propose a novel but natural 'jackknife thresholding' scheme, which proves superior to other, more classical thresholding options.
- Published
- 2022
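The equivalence between the Hellinger distance and the Bhattacharyya affinity mentioned in the abstract can be sketched for discrete distributions. This is an illustration only: the paper works with densities, the function names are hypothetical, and the convention assumed here is the normalised one, in which the squared Hellinger distance equals one minus the affinity.

```python
import math

def bhattacharyya_affinity(p, q):
    """Sum of sqrt(p_i * q_i) over a common discrete support."""
    return sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))

def hellinger_distance(p, q):
    """Hellinger distance under the convention H^2 = 1 - affinity."""
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 1.0 - bhattacharyya_affinity(p, q)))

# Hypothetical probability vectors on a three-point support.
p = [0.2, 0.3, 0.5]
q = [0.5, 0.3, 0.2]
h = hellinger_distance(p, q)
```

Under this convention the distance lies between 0 (identical distributions) and 1 (disjoint supports), so either quantity determines the other.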
4. Statistical depth in abstract metric spaces
- Author
- Geenens, Gery, Nieto-Reyes, Alicia, and Francisci, Giacomo
- Subjects
Statistics - Methodology
- Abstract
The concept of depth has proved very important for multivariate and functional data analysis, as it essentially acts as a surrogate for the notion of a ranking of observations, which is absent in more than one dimension. Motivated by the rapid development of technology, in particular the advent of 'Big Data', we extend here that concept to general metric spaces, propose a natural depth measure and explore its properties as a statistical depth function. Working in a general metric space allows the depth to be tailored to the data at hand and to the ultimate goal of the analysis, a very desirable property given the polymorphic nature of modern data sets. This flexibility is thoroughly illustrated by several real data analyses.
- Published
- 2021
5. An essay on copula modelling for discrete random vectors; or how to pour new wine into old bottles
- Author
- Geenens, Gery
- Subjects
Statistics - Methodology
- Abstract
Copulas have now become ubiquitous statistical tools for describing, analysing and modelling dependence between random variables. Sklar's theorem, "the fundamental theorem of copulas", makes a clear distinction between the continuous case and the discrete case, though. In particular, the copula of a discrete random vector is not identifiable, which causes serious inconsistencies. In spite of this, downplaying statements are widespread in the related literature, and copula methods are used for modelling dependence between discrete variables. This paper calls for reconsidering the soundness of copula modelling for discrete data. It suggests a more fundamental construction which allows copula ideas to smoothly carry over to the discrete case. Actually, it is an attempt at rejuvenating some century-old ideas of Udny Yule, who mentioned a similar construction a long time before copulas came into fashion.
- Published
- 2019
6. The Hellinger Correlation
- Author
- Geenens, Gery and de Micheaux, Pierre Lafaye
- Subjects
Statistics - Methodology, Mathematics - Statistics Theory
- Abstract
In this paper, the defining properties of a valid measure of the dependence between two random variables are reviewed and complemented with two original ones, shown to be more fundamental than other usual postulates. While other popular choices are proved to violate some of these requirements, a class of dependence measures satisfying all of them is identified. One particular measure, that we call the Hellinger correlation, appears as a natural choice within that class due to both its theoretical and intuitive appeal. A simple and efficient nonparametric estimator for that quantity is proposed. Synthetic and real-data examples finally illustrate the descriptive ability of the measure, which can also be used as a test statistic for exact independence testing.
- Published
- 2018
7. A nonparametric copula approach to conditional Value-at-Risk
- Author
- Geenens, Gery and Dunn, Richard
- Subjects
Statistics - Methodology, Quantitative Finance - Statistical Finance, Statistics - Applications
- Abstract
Value-at-Risk and its conditional allegory, which takes into account the available information about the economic environment, form the centrepiece of the Basel framework for the evaluation of market risk in the banking sector. In this paper, a new nonparametric framework for estimating this conditional Value-at-Risk is presented. A nonparametric approach is particularly pertinent as the traditionally used parametric distributions have been shown to be insufficiently robust and flexible in most of the equity-return data sets observed in practice. The method extracts the quantile of the conditional distribution of interest, whose estimation is based on a novel estimator of the density of the copula describing the dynamic dependence observed in the series of returns. Real-world back-testing analyses demonstrate the potential of the approach, whose performance may be superior to its industry counterparts.
- Published
- 2017
8. Shape-preserving wavelet-based multivariate density estimation
- Author
- Moreno, Carlos Aya, Geenens, Gery, and Penev, Spiridon
- Subjects
Statistics - Methodology, Mathematics - Statistics Theory
- Abstract
Wavelet estimators for a probability density f enjoy many good properties; however, they are not "shape-preserving" in the sense that the final estimate may not be non-negative or integrate to unity. A solution to negativity issues may be to estimate first the square-root of f and then square this estimate up. This paper proposes and investigates such an estimation scheme, generalising to higher dimensions some previous constructions which are valid only in one dimension. The estimation is mainly based on nearest-neighbour-balls. The theoretical properties of the proposed estimator are obtained, and it is shown to reach the optimal rate of convergence uniformly over large classes of densities under mild conditions. Simulations show that the new estimator performs as well in general as the classical wavelet estimator, while automatically producing estimates which are bona fide densities.
- Published
- 2017
9. Mellin-Meijer-kernel density estimation on $\mathbb{R}^+$
- Author
- Geenens, Gery
- Subjects
Mathematics - Statistics Theory, Statistics - Methodology
- Abstract
Nonparametric kernel density estimation is a very natural procedure which simply makes use of the smoothing power of the convolution operation. Yet, it performs poorly when the density of a positive variable is to be estimated (boundary issues, spurious bumps in the tail). So various extensions of the basic kernel estimator allegedly suitable for $\mathbb{R}^+$-supported densities, such as those using Gamma or other asymmetric kernels, abound in the literature. Those, however, are not based on any valid smoothing operation analogous to the convolution, which typically leads to inconsistencies. By contrast, in this paper a kernel estimator for $\mathbb{R}^+$-supported densities is defined by making use of the Mellin convolution, the natural analogue of the usual convolution on $\mathbb{R}^+$. From there, a very transparent theory flows and leads to a new type of asymmetric kernels strongly related to Meijer's $G$-functions. The numerous pleasant properties of this 'Mellin-Meijer-kernel density estimator' are demonstrated in the paper. Its pointwise and $L_2$-consistency (with optimal rate of convergence) is established for a large class of densities, including densities unbounded at 0 and showing power-law decay in their right tail. Its practical behaviour is investigated further through simulations and some real data analyses.
- Published
- 2017
10. Robust analysis of second-leg home advantage in UEFA football through better nonparametric confidence intervals for binary regression functions
- Author
- Geenens, Gery and Cuddihy, Thomas
- Subjects
Statistics - Methodology, Statistics - Applications
- Abstract
In international football (soccer), two-legged knockout ties, with each team playing at home in one leg and the final outcome decided on aggregate, are common. Many players, managers and followers seem to believe in the 'second-leg home advantage', i.e. that it is beneficial to play at home on the second leg. A more complex effect than the usual and well-established home advantage, it is harder to identify, and previous statistical studies did not prove conclusive about its actuality. Yet, given the amount of money handled in international football competitions nowadays, the question of existence or otherwise of this effect is of real import. As opposed to previous research, this paper addresses it from a purely nonparametric perspective and brings a very objective answer, not based on any particular model specification which could orientate the analysis in one or the other direction. Along the way, the paper reviews the well-known shortcomings of the Wald confidence interval for a proportion, suggests new nonparametric confidence intervals for conditional probability functions, revisits the problem of the bias when building confidence intervals in nonparametric regression, and provides a novel bootstrap-based solution to it. Finally, the new intervals are used in a careful analysis of game outcome data for the UEFA Champions and Europa leagues from 2009/10 to 2014/15. A slight 'second-leg home advantage' is evidenced.
- Published
- 2017
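One of the well-known shortcomings of the Wald interval mentioned in the abstract is that it collapses to a single point when the observed proportion is 0 or 1. The sketch below contrasts it with the Wilson score interval, a standard textbook alternative that is used here purely for comparison; it is not the interval proposed in the paper, and the function names and counts are hypothetical.

```python
import math

def wald_interval(k, n, z=1.96):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p = k / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson_interval(k, n, z=1.96):
    """Wilson score interval, clipped to [0, 1]."""
    p = k / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Hypothetical data: 0 successes in 10 trials.
wald = wald_interval(0, 10)      # degenerate: both endpoints are 0
wilson = wilson_interval(0, 10)  # upper endpoint stays strictly positive
```

With no observed successes the Wald interval suggests certainty that the proportion is exactly 0, whereas the score interval still allows for a sizeable positive proportion.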
11. Local-likelihood transformation kernel density estimation for positive random variables
- Author
- Geenens, Gery and Wang, Craig
- Subjects
Statistics - Methodology
- Abstract
The kernel estimator is known not to be adequate for estimating the density of a positive random variable X. The main reason is the well-known boundary bias problems that it suffers from, but also its poor behaviour in the long right tail that such a density typically exhibits. A natural approach to this problem is to first estimate the density of the logarithm of X, and then obtain an estimate of the density of X using standard results on functions of random variables ('back-transformation'). Although intuitive, the basic application of this idea yields very poor results, as was documented earlier in the literature. In this paper, the main reason for this underachievement is identified, and an easy fix is suggested. It is demonstrated that combining the transformation with local likelihood density estimation methods produces very good estimators of $\mathbb{R}^+$-supported densities, not only close to the boundary, but also in the right tail. The asymptotic properties of the proposed 'local likelihood transformation kernel density estimators' are derived for a generic transformation, not only for the logarithm, which allows one to consider other transformations as well. One of them, called the 'probex' transformation, is given more focus. Finally, the excellent behaviour of those estimators in practice is evidenced through a comprehensive simulation study and the analysis of several real data sets. A nice consequence of articulating the method around local-likelihood estimation is that the resulting density estimates are typically smooth and visually pleasant, without oversmoothing important features of the underlying density.
- Published
- 2016
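The back-transformation step described in the abstract can be sketched as follows. This is only the basic version, which the abstract itself reports as performing poorly; it is not the local-likelihood estimator proposed in the paper. The Gaussian kernel, the fixed bandwidth `h`, the sample and all function names are assumptions made for illustration.

```python
import math

def gaussian_kde(points, h):
    """Plain Gaussian kernel density estimate with fixed bandwidth h."""
    n = len(points)
    norm = n * h * math.sqrt(2 * math.pi)
    def density(y):
        return sum(math.exp(-0.5 * ((y - p) / h) ** 2) for p in points) / norm
    return density

def log_transform_kde(sample, h):
    """Estimate the density of log X, then back-transform to X > 0."""
    f_log = gaussian_kde([math.log(x) for x in sample], h)
    def density(x):
        # Change of variables: f_X(x) = f_{log X}(log x) / x for x > 0.
        return f_log(math.log(x)) / x if x > 0 else 0.0
    return density

# Hypothetical positive sample and bandwidth.
sample = [0.5, 1.0, 2.0, 4.0]
f_hat = log_transform_kde(sample, h=0.5)
```

The division by x is the Jacobian of the logarithm; by construction the estimate puts no mass on the negative half-line.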
12. Probit transformation for nonparametric kernel estimation of the copula density
- Author
- Geenens, Gery, Charpentier, Arthur, and Paindaveine, Davy
- Subjects
Statistics - Methodology, Mathematics - Statistics Theory
- Abstract
Copula modelling has become ubiquitous in modern statistics. Here, the problem of nonparametrically estimating a copula density is addressed. Arguably the most popular nonparametric density estimator, the kernel estimator is not suitable for the unit-square-supported copula densities, mainly because it is heavily affected by boundary bias issues. In addition, most common copulas admit unbounded densities, and kernel methods are not consistent in that case. In this paper, a kernel-type copula density estimator is proposed. It is based on the idea of transforming the uniform marginals of the copula density into normal distributions via the probit function, estimating the density in the transformed domain, which can be accomplished without boundary problems, and obtaining an estimate of the copula density through back-transformation. Although natural, a raw application of this procedure was, however, seen not to perform very well in the earlier literature. Here, it is shown that, if combined with local likelihood density estimation methods, the idea yields very good and easy to implement estimators, fixing boundary issues in a natural way and able to cope with unbounded copula densities. The asymptotic properties of the suggested estimators are derived, and a practical way of selecting the crucially important smoothing parameters is devised. Finally, extensive simulation studies and a real data analysis evidence their excellent performance compared to their main competitors.
- Published
- 2014
13. Probit transformation for kernel density estimation on the unit interval
- Author
- Geenens, Gery
- Subjects
Statistics - Methodology, Mathematics - Statistics Theory
- Abstract
Kernel estimation of a probability density function supported on the unit interval has proved difficult, because of the well known boundary bias issues a conventional kernel density estimator would necessarily face in this situation. Transforming the variable of interest into a variable whose density has unconstrained support, estimating that density, and obtaining an estimate of the density of the original variable through back-transformation, seems a natural idea to easily get rid of the boundary problems. In practice, however, a simple and efficient implementation of this methodology is far from immediate, and the few attempts found in the literature have been reported not to perform well. In this paper, the main reasons for this failure are identified and an easy way to correct them is suggested. It turns out that combining the transformation idea with local likelihood density estimation produces viable density estimators, mostly free from boundary issues. Their asymptotic properties are derived, and a practical cross-validation bandwidth selection rule is devised. Extensive simulations demonstrate the excellent performance of these estimators compared to their main competitors for a wide range of density shapes. In fact, they turn out to be the best choice overall. Finally, they are used to successfully estimate a density of non-standard shape supported on $[0,1]$ from a small-size real data sample.
- Published
- 2013
14. A Nonparametric Measure of Local Association for two-way Contingency Tables
- Author
- Hui, Francis K. C. and Geenens, Gery
- Subjects
Statistics - Methodology, Mathematics - Statistics Theory
- Abstract
In contingency table analysis, the odds ratio is a commonly applied measure used to summarize the degree of association between two categorical variables, say R and S. Suppose now that for each individual in the table, a vector of continuous variables X is also observed. It is then vital to analyze whether and how the degree of association varies with X. In this work, we extend the classical odds ratio to the conditional case, and develop nonparametric estimators of this "pointwise odds ratio" to summarize the strength of local association between R and S given X. To allow for maximum flexibility, we make this extension using kernel regression. We develop confidence intervals based on these nonparametric estimators. We demonstrate via simulation that our pointwise odds ratio estimators can outperform model-based counterparts from logistic regression and GAMs, without the need for a linearity or additivity assumption. Finally, we illustrate its application to a dataset of patients from an intensive care unit (ICU), offering a greater insight into how the association between survival of patients admitted for emergency versus elective reasons varies with the patients' ages.
- Published
- 2012
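For reference, the classical (unconditional) odds ratio that the abstract takes as its starting point can be computed from a 2x2 table as below. The cell counts and the function name are hypothetical, and the sketch does not attempt the kernel-based pointwise odds ratio developed in the paper.

```python
def odds_ratio(n11, n12, n21, n22):
    """Classical odds ratio of a 2x2 table [[n11, n12], [n21, n22]]."""
    if n12 == 0 or n21 == 0:
        raise ValueError("odds ratio is undefined when n12 or n21 is zero")
    return (n11 * n22) / (n12 * n21)

# Hypothetical counts for two binary variables R (rows) and S (columns).
theta = odds_ratio(10, 20, 30, 40)
```

A value of 1 corresponds to no association between the two variables; values further from 1, in either direction, indicate stronger association.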