
Lightweight detection network based on receptive-field feature enhancement convolution and three dimensions attention for images captured by UAVs.

Authors :
Song, Tingting
Zhang, Xin
Yang, Degang
Ye, Yichen
Liu, Chen
Zhou, Jie
Song, Yingze
Source :
Image & Vision Computing. Dec2023, Vol. 140, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

UAV sampling not only adapts to various complex terrain environments but also provides a broader field of view. However, images captured by UAVs usually contain complex backgrounds and a large number of small objects, which poses a significant challenge to some existing advanced object detectors. Moreover, some existing state-of-the-art lightweight detectors have too many parameters and too high a computational cost to suit lightweight devices. To address these issues, we propose a single-stage detector named features enhancement and shift lightweight network. First, we propose a lightweight adjust convolution, which unfolds the features and encodes the 3 × 3 background information into information-rich 1 × 1 features through average pooling and convolution layers, efficiently enhancing the representation of features extracted by 1 × 1 convolution. Next, to efficiently suppress complex background information, we propose a three-dimensions attention module, which lets information interact across the C-W, C-H, and H-W (channel-width, channel-height, and height-width) dimensions to obtain three efficient attention maps that highlight important information and weaken irrelevant information. Moreover, we create a novel receptive-field feature enhancement convolution, which unfolds the features and then lets the 3 × 3 features interact to obtain weights. Combined with the weighted features, the 3 × 3 convolution becomes, in principle, a convolution with unshared parameters, which enhances its ability to capture detailed information. Finally, to retain richer object and semantic information, we carefully analyze the down-sampling convolution and propose a feature shift down-sampling convolution, which we then use to improve the Neck, yielding a new lightweight Neck. Experiments on the VisDrone-DET2021 dataset show that our method achieves 36.21% mAP50, which is 9.78% higher than the baseline model YOLOv5n. 
Meanwhile, compared with the advanced lightweight networks YOLOX-tiny, YOLOv6n, YOLOv7-tiny, and YOLOv8n, our network achieves superior detection results with fewer parameters. We also compare our network with the latest networks trained on images captured by UAVs, and experimentally demonstrate that our network achieves excellent performance using only 1.7 M parameters and 8.3 GFLOPs.
• A lightweight adjustment convolution is proposed, which improves the quality of the features extracted at the current layer.
• An effective three-dimensions attention module is proposed that focuses on more detailed information.
• A feature shift down-sampling convolution is proposed, which compensates for the information loss of the network.
• An efficient receptive-field feature enhancement convolution is designed. It can be seen as a novel convolutional operation.
[ABSTRACT FROM AUTHOR]
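
The abstract describes the three-dimensions attention module only in outline: three attention maps, one each over the C-W, C-H, and H-W planes, used to emphasize important responses. The following is a minimal pure-Python sketch of that general idea, not the authors' implementation. It assumes mean pooling along the remaining axis, a sigmoid gate, and simple averaging of the three gates; the abstract specifies none of these, and the function name `three_plane_attention` is hypothetical.

```python
import math


def _sigmoid(v):
    """Logistic function, used here as an illustrative gate (assumption)."""
    return 1.0 / (1.0 + math.exp(-v))


def three_plane_attention(x):
    """Reweight a feature map x of shape [C][H][W] (nested lists).

    Each attention map is built by mean-pooling x along the one axis
    that is NOT part of the plane, then squashing with a sigmoid.
    Pooling choice and gate are assumptions, not taken from the paper.
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])

    # H-W map: pool over channels -> shape [H][W]
    hw = [[_sigmoid(sum(x[c][h][w] for c in range(channels)) / channels)
           for w in range(width)] for h in range(height)]

    # C-W map: pool over height -> shape [C][W]
    cw = [[_sigmoid(sum(x[c][h][w] for h in range(height)) / height)
           for w in range(width)] for c in range(channels)]

    # C-H map: pool over width -> shape [C][H]
    ch = [[_sigmoid(sum(x[c][h][w] for w in range(width)) / width)
           for h in range(height)] for c in range(channels)]

    # Apply the mean of the three gates to every element (assumed fusion rule).
    return [[[x[c][h][w] * (hw[h][w] + cw[c][w] + ch[c][h]) / 3.0
              for w in range(width)] for h in range(height)]
            for c in range(channels)]


# Example: a 2-channel, 2 x 2 feature map.
feature = [[[0.0, 1.0], [2.0, 3.0]], [[4.0, 5.0], [6.0, 7.0]]]
reweighted = three_plane_attention(feature)
```

Because every gate lies strictly between 0 and 1, the output keeps the input's shape and scales each non-negative activation down by a position-dependent factor. A learned module would replace the fixed pooling with trainable layers.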

Details

Language :
English
ISSN :
02628856
Volume :
140
Database :
Academic Search Index
Journal :
Image & Vision Computing
Publication Type :
Academic Journal
Accession number :
174034701
Full Text :
https://doi.org/10.1016/j.imavis.2023.104855