
HyperSIGMA: Hyperspectral Intelligence Comprehension Foundation Model

Authors:
Wang, Di
Hu, Meiqi
Jin, Yao
Miao, Yuchun
Yang, Jiaqi
Xu, Yichu
Qin, Xiaolei
Ma, Jiaqi
Sun, Lingyu
Li, Chenxing
Fu, Chuan
Chen, Hongruixuan
Han, Chengxi
Yokoya, Naoto
Zhang, Jing
Xu, Minqiang
Liu, Lin
Zhang, Lefei
Wu, Chen
Du, Bo
Tao, Dacheng
Zhang, Liangpei
Publication Year:
2024

Abstract

Foundation models (FMs) are revolutionizing the analysis and understanding of remote sensing (RS) scenes, including aerial RGB, multispectral, and SAR images. However, FMs have seen little application to hyperspectral images (HSIs), which are rich in spectral information; existing methods are often restricted to specific tasks and lack generality. To fill this gap, we introduce HyperSIGMA, a vision-transformer-based foundation model for HSI interpretation, scalable to over a billion parameters. To tackle the spectral and spatial redundancy inherent in HSIs, we propose a novel sparse sampling attention (SSA) mechanism, which effectively promotes the learning of diverse contextual features and serves as the basic building block of HyperSIGMA. HyperSIGMA integrates spatial and spectral features using a specially designed spectral enhancement module. In addition, we construct HyperGlobal-450K, a large-scale hyperspectral dataset for pre-training that contains about 450K hyperspectral images, significantly surpassing existing datasets in scale. Extensive experiments on various high-level and low-level HSI tasks demonstrate HyperSIGMA's versatility and superior representational capability compared to current state-of-the-art methods. Moreover, HyperSIGMA shows significant advantages in scalability, robustness, cross-modal transfer capability, and real-world applicability.

Comment: The code and models will be released at https://github.com/WHU-Sigma/HyperSIGMA
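The abstract describes sparse sampling attention only at a high level. As a rough illustration of the general idea, the sketch below shows one plausible form of sparse sampling attention, in the style of deformable attention: each query token predicts a small set of sampling offsets and aggregates features gathered at those sparse locations instead of attending densely over all tokens. This is a minimal sketch, not the authors' implementation; the class name `SparseSamplingAttention` and parameters such as `num_points` are hypothetical, and the spectral enhancement module is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseSamplingAttention(nn.Module):
    """Illustrative sparse sampling attention (hypothetical sketch, not the paper's code).

    Each query token predicts a few sampling offsets and per-point weights,
    then aggregates bilinearly sampled features at those sparse locations.
    """

    def __init__(self, dim, num_points=4):
        super().__init__()
        self.num_points = num_points
        self.offset_proj = nn.Linear(dim, num_points * 2)  # (dx, dy) per sampled point
        self.weight_proj = nn.Linear(dim, num_points)      # one weight per sampled point
        self.value_proj = nn.Linear(dim, dim)
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x, h, w):
        # x: (B, N, C) token sequence from an h x w feature map, N = h * w
        b, n, c = x.shape
        values = self.value_proj(x).transpose(1, 2).reshape(b, c, h, w)

        # Normalized reference grid in [-1, 1], one reference point per query token
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=x.device),
            torch.linspace(-1, 1, w, device=x.device),
            indexing="ij",
        )
        ref = torch.stack([xs, ys], dim=-1).reshape(1, n, 1, 2)

        # Predict per-query offsets (bounded by tanh) and normalized point weights
        offsets = self.offset_proj(x).reshape(b, n, self.num_points, 2).tanh()
        weights = self.weight_proj(x).softmax(dim=-1)              # (B, N, P)

        # Bilinearly sample values at the sparse locations
        grid = (ref + offsets).clamp(-1, 1)                        # (B, N, P, 2)
        sampled = F.grid_sample(values, grid, align_corners=True)  # (B, C, N, P)

        # Weighted aggregation over the sampled points
        out = (sampled * weights.unsqueeze(1)).sum(-1)             # (B, C, N)
        return self.out_proj(out.transpose(1, 2))                  # (B, N, C)


# Usage: tokens from a 16 x 16 feature map with 64 channels
ssa = SparseSamplingAttention(dim=64)
x = torch.randn(2, 16 * 16, 64)
y = ssa(x, h=16, w=16)  # -> (2, 256, 64)
```

Attending to a handful of sampled points per query, rather than all N tokens, replaces the quadratic cost of dense attention with a cost linear in the number of points, which is one plausible way such a mechanism could mitigate the spatial and spectral redundancy the abstract mentions.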

Details

Database:
arXiv
Publication Type:
Report
Accession Number:
edsarx.2406.11519
Document Type:
Working Paper