1. Frequency and Texture Aware Multi-Domain Feature Fusion for Remote Sensing Scene Classification
- Author
- Russo Ashraf and Kang-Hyun Jo
- Subjects
- Convolutional neural networks (CNNs), frequency analysis, large kernel attention (LKA), multi domain, remote sensing (RS), scene classification, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
- Remote sensing (RS) scene classification, a pivotal task in Earth observation, involves categorizing satellite or aerial imagery into distinct land-use and land-cover classes. Major challenges in this task include high intraclass variability and low interclass distinctions. Historically, state-of-the-art methods in this field have struggled to achieve satisfactory results without a significant trade-off in computational efficiency. These methods often require substantial computational resources to process the complex data characteristics of RS imagery, leading to inefficiencies that limit their practical application in real-time or on resource-constrained platforms. To address these challenges, the Efficient Spectral Inception Former (ESIF) architecture is proposed, which introduces a novel paradigm to RS scene classification by integrating multi-domain feature fusion of the spatial, texture, and spectral (frequency) domains. The proposed approach leverages the strengths of convolutional neural networks (CNNs) for spatial information extraction, a novel texture feature alignment block (TFAB) for nuanced texture differentiation, an efficient spectro-former block (ESFB) that uses spectral analysis for enhanced pattern recognition, a cross-domain fusion block (CDFB), and finally an inception transformer block (iFB) that balances high- and low-frequency information. Furthermore, we construct a new remote sensing scene dataset named ISL-RS50, which is significantly more challenging than the existing ones. The proposed method yields the best results when trained from scratch on all seven tested datasets: ISL-RS50 (60%), Optimal-31 (86.55%), UC-Merced (94.52%), RSSCN7 (94.1%), SIRI-WHU (95%), WHU-RS19 (94.52%), and AID (93.5%). Finally, ESIF exemplifies an optimal accuracy-efficiency trade-off, supporting its suitability for deployment in real-world applications.
- Published
- 2025