1. Performing Group Difference Testing on Graph Structured Data From GANs: Analysis and Applications in Neuroimaging
- Author
Zhichun Huang, Won Hwa Kim, Akshay Mishra, Vikas Singh, Tuan Quang Dinh, Sathya N. Ravi, Tien N. Vo, and Yunyang Xiong
- Subjects
Computer science, Neuroimaging, 02 engineering and technology, Machine learning, computer.software_genre, Article, Empirical research, Artificial Intelligence, Simple (abstract algebra), Image Processing, Computer-Assisted, 0202 electrical engineering, electronic engineering, information engineering, Null distribution, Humans, Statistical hypothesis testing, Complement (set theory), Spectral graph theory, Group (mathematics), business.industry, Applied Mathematics, Brain, Computational Theory and Mathematics, 020201 artificial intelligence & image processing, Neural Networks, Computer, Computer Vision and Pattern Recognition, Artificial intelligence, business, computer, Algorithms, Software
- Abstract
Generative adversarial networks (GANs) have emerged as a powerful generative model in computer vision. Given their impressive ability to generate highly realistic images, they are also being used in novel ways in life-science applications. This raises an interesting question when GANs are used in scientific or biomedical studies. Consider the setting where we are restricted to using only samples from a trained GAN for downstream group difference analysis (and do not have direct access to the real data). Will we obtain similar conclusions? In this work, we explore whether “generated” data, i.e., data sampled from such GANs, can be used to perform statistical group difference tests in case-versus-control studies, which are common across many scientific disciplines. We provide a detailed analysis describing regimes where this may be feasible. We complement the technical results with an empirical study focused on the analysis of cortical thickness on brain mesh surfaces in an Alzheimer’s disease dataset. To exploit the geometric nature of the data, we use simple ideas from spectral graph theory to show how adjustments to existing GANs can yield improvements. We also give a generalization error bound by extending recent results on Neural Network Distance. To our knowledge, our work offers the first analysis assessing whether the null distribution in “healthy versus diseased subjects” type statistical testing using GAN-generated data coincides with the one obtained from the same analysis with real data. The code is available at https://github.com/yyxiongzju/GLapGAN.
- Published
2022