Controlling the False Split Rate in Tree-Based Aggregation.

Authors :
Shao, Simeng
Bien, Jacob
Javanmard, Adel
Source :
Journal of the American Statistical Association. Jul2024, p1-22. 22p. 7 Illustrations.
Publication Year :
2024

Abstract

In many domains, data measurements can naturally be associated with the leaves of a tree, expressing the relationships among these measurements. For example, companies belong to industries, which in turn belong to ever coarser divisions such as sectors; microbes are commonly arranged in a taxonomic hierarchy from species to kingdoms; street blocks belong to neighborhoods, which in turn belong to larger-scale regions. The problem of tree-based aggregation that we consider in this paper asks which of these tree-defined subgroups of leaves should really be treated as a single entity and which of these entities should be distinguished from each other. We introduce the false split rate, an error measure that describes the degree to which subgroups have been split when they should not have been. While expressible as the false discovery rate in a special case, we show that these measures can be quite different for the general tree structures common in our setting. We then propose a multiple hypothesis testing algorithm for tree-based aggregation, which we prove controls this error measure. We focus on two main examples of tree-based aggregation, one which involves aggregating means and the other which involves aggregating regression coefficients. [ABSTRACT FROM AUTHOR]

Subjects

Subjects :
*FALSE discovery rate
*URBAN trees

Details

Language :
English
ISSN :
01621459
Database :
Academic Search Index
Journal :
Journal of the American Statistical Association
Publication Type :
Academic Journal
Accession number :
178329418
Full Text :
https://doi.org/10.1080/01621459.2024.2376285