51. Multiscale analysis of count data through topic alignment.
- Author
- Fukuyama, Julia; Sankaran, Kris; and Symul, Laura
- Subjects
- DATA analysis
- Abstract
- Topic modeling is a popular method used to describe biological count data. With topic models, the user must specify the number of topics $K$. Since there is no definitive way to choose $K$ and since a true value might not exist, we develop a method, which we call topic alignment, to study the relationships across models with different $K$. In addition, we present three diagnostics based on the alignment. These techniques can show how many topics are consistently present across different models, if a topic is only transiently present, or if a topic splits into more topics when $K$ increases. This strategy gives more insight into the process of generating the data than choosing a single value of $K$ would. We design a visual representation of these cross-model relationships, show the effectiveness of these tools for interpreting the topics on simulated and real data, and release an accompanying R package, alto. [ABSTRACT FROM AUTHOR]
- Published
- 2023