
Data fission: splitting a single data point.

Authors :
Leiner, James
Duan, Boyan
Wasserman, Larry
Ramdas, Aaditya
Source :
Journal of the American Statistical Association. Oct 2023, p1-22. 22p. 11 Illustrations.
Publication Year :
2023

Abstract

Suppose we observe a random vector X from some distribution in a known family with unknown parameters. We ask the following question: when is it possible to split X into two pieces f(X) and g(X) such that neither part is sufficient to reconstruct X by itself, but both together can recover X fully, and their joint distribution is tractable? One common solution to this problem when multiple samples of X are observed is data splitting, but Rasines and Young (2022) offer an alternative approach that uses additive Gaussian noise — this enables post-selection inference in finite samples for Gaussian distributed data and asymptotically for non-Gaussian additive models. In this paper, we offer a more general methodology for achieving such a split in finite samples by borrowing ideas from Bayesian inference to yield a (frequentist) solution that can be viewed as a continuous analog of data splitting. We call our method data fission, as an alternative to data splitting, data carving and p-value masking. We exemplify the method on several prototypical applications, such as post-selection inference for trend filtering and other regression problems, and effect size estimation after interactive multiple testing. [ABSTRACT FROM AUTHOR]
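To make the abstract's splitting idea concrete, the sketch below illustrates the Gaussian case of such a decomposition: with external noise Z ~ N(0, sigma^2), setting f(X) = X + tau*Z and g(X) = X - Z/tau yields two independent pieces that jointly recover X. This is a minimal illustrative sketch, not the paper's full methodology; the function name fission_gaussian, the default tau = 1, and the simulated data are assumptions made for demonstration, and sigma is taken as known.

```python
import numpy as np

rng = np.random.default_rng(0)

def fission_gaussian(x, sigma, tau=1.0, rng=rng):
    """Split x into two pieces f(x), g(x) using independent Gaussian noise.

    Assumes x ~ N(mu, sigma^2) elementwise with sigma known (illustrative only).
    """
    z = rng.normal(0.0, sigma, size=np.shape(x))
    f = x + tau * z      # one piece, e.g. used for selection
    g = x - z / tau      # the other piece, e.g. reserved for inference
    return f, g

# Simulated example: fission a vector of observations, then check the two claims
mu, sigma, tau = 2.0, 1.0, 1.0
x = rng.normal(mu, sigma, size=1000)
f, g = fission_gaussian(x, sigma, tau)

# Together the pieces recover X exactly: X = (f + tau^2 * g) / (1 + tau^2)
x_recovered = (f + tau**2 * g) / (1 + tau**2)
assert np.allclose(x_recovered, x)

# The pieces are independent Gaussians, each still centered at mu
print(np.corrcoef(f, g)[0, 1])   # near 0
print(f.mean(), g.mean())        # both near mu
```

Under these assumptions, f(X) ~ N(mu, (1 + tau^2) sigma^2) and g(X) ~ N(mu, (1 + 1/tau^2) sigma^2) are independent, so tau trades off how much information is allocated to each piece.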

Details

Language :
English
ISSN :
0162-1459
Database :
Academic Search Index
Journal :
Journal of the American Statistical Association
Publication Type :
Academic Journal
Accession number :
173008573
Full Text :
https://doi.org/10.1080/01621459.2023.2270748