SegCFT: Context-aware Fourier Transform for efficient semantic segmentation.

Authors :
Zhang, Yinqi
Jiang, Lingfu
Chen, Fuhai
Xie, Jiao
Zhang, Baochang
He, Gaoqi
Lin, Shaohui
Source :
Neurocomputing. Sep2024, Vol. 596, pN.PAG-N.PAG. 1p.
Publication Year :
2024

Abstract

Semantic segmentation has been one of the most critical tasks in computer vision. Recent works mainly focus on improving segmentation performance by designing high-capacity transformer architectures. They try to solve the high data consumption and computing costs required for model training and deployment in the cloud, but the high computation overhead still makes it difficult to be directly applied to limited resource devices. In this paper, we propose a novel fast Fourier Transform (FFT) based Context-aware Feature Mixer under the transformer-like architecture for precise and efficient semantic segmentation, called SegCFT. Different from the self-attention-based transformer, SegCFT uses a Hierarchical Fourier Transform (HFT) to reduce computational cost via non-parametric calculation and promote segmentation performance by fusing the channel-wise and pixel-wise contexts. To integrate the features from the frequency domain of DFT into the spatial domain of the transformer-like architecture, an Adaptive Modulation Unit (AMU) is designed to modulate the frequency-domain features and ensure consistency between the frequency domain and the spatial domain. Experimental evaluation on two semantic segmentation benchmarks, ADE20k and Cityscapes, shows that SegCFT achieves competitive segmentation performance, while the training and inference costs are superior to the previous methods.

• Context-aware Fourier Transform is proposed to replace self-attention for segmentation.
• Hierarchical Fourier Transform fuses the token-channel contexts to reduce computation cost.
• Adaptive Modulation Unit fuses the spatial-frequency features for performance improvement.
• SegCFT achieves the best trade-off between segmentation and inference speed.

[ABSTRACT FROM AUTHOR]
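
The idea of replacing self-attention with a parameter-free Fourier transform can be illustrated with a minimal sketch. This is not the SegCFT implementation: the record gives no details of the HFT or AMU, so the code below shows only a generic FNet-style mixer that applies a DFT along the channel axis and then the token axis and keeps the real part. The function name `fourier_mix` and the `(tokens, channels)` layout are assumptions made for illustration.

```python
import numpy as np

def fourier_mix(x: np.ndarray) -> np.ndarray:
    """Generic parameter-free Fourier mixing (illustrative, not the paper's HFT).

    x has shape (tokens, channels). A DFT is applied along the channel
    axis, then along the token axis, and the real part is returned, so
    every output entry depends on every input entry without any
    learned weights.
    """
    mixed_channels = np.fft.fft(x, axis=1)            # mix across channels
    mixed_tokens = np.fft.fft(mixed_channels, axis=0)  # mix across tokens
    return mixed_tokens.real

# A unit impulse at position (0, 0) spreads to every output entry,
# which shows that the mixing is global.
impulse = np.zeros((4, 3))
impulse[0, 0] = 1.0
spread = fourier_mix(impulse)
print(spread.shape)  # (4, 3)
```

With FFT-based evaluation, a mixer of this kind costs on the order of N log N operations along the token axis for N tokens, compared with the quadratic cost of pairwise self-attention, which is the general motivation for Fourier-based token mixing.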

Details

Language :
English
ISSN :
09252312
Volume :
596
Database :
Academic Search Index
Journal :
Neurocomputing
Publication Type :
Academic Journal
Accession number :
178502504
Full Text :
https://doi.org/10.1016/j.neucom.2024.127946