Controlling the False Split Rate in Tree-Based Aggregation.
- Source :
- Journal of the American Statistical Association. Jul 2024, p1-22. 22p. 7 Illustrations.
- Publication Year :
- 2024
Abstract
- In many domains, data measurements can naturally be associated with the leaves of a tree, expressing the relationships among these measurements. For example, companies belong to industries, which in turn belong to ever coarser divisions such as sectors; microbes are commonly arranged in a taxonomic hierarchy from species to kingdoms; street blocks belong to neighborhoods, which in turn belong to larger-scale regions. The problem of tree-based aggregation that we consider in this paper asks which of these tree-defined subgroups of leaves should really be treated as a single entity and which of these entities should be distinguished from each other. We introduce the false split rate, an error measure that describes the degree to which subgroups have been split when they should not have been. While expressible as the false discovery rate in a special case, we show that these measures can be quite different for the general tree structures common in our setting. We then propose a multiple hypothesis testing algorithm for tree-based aggregation, which we prove controls this error measure. We focus on two main examples of tree-based aggregation, one which involves aggregating means and the other which involves aggregating regression coefficients. [ABSTRACT FROM AUTHOR]
- Subjects :
- *FALSE discovery rate
- *URBAN trees
Details
- Language :
- English
- ISSN :
- 01621459
- Database :
- Academic Search Index
- Journal :
- Journal of the American Statistical Association
- Publication Type :
- Academic Journal
- Accession number :
- 178329418
- Full Text :
- https://doi.org/10.1080/01621459.2024.2376285