AlignYOLO: A feature-aligned network for object detection.

Authors :: Li, Hao
Song, Changming
Cheng, Dongxu
Li, Zenghui
Wu, Caihong
Chen, Kang
Source :: Expert Systems with Applications. Jul2024, Vol. 246, pN.PAG-N.PAG. 1p.
Publication Year :: 2024
Abstract: Aggregating features across levels or scales has been empirically shown to enhance feature representations in object detection. However, existing approaches tend to aggregate features or embed contextual information indiscriminately through simple concatenation or addition, disregarding the misalignment that results from repeated sampling operations. This paper proposes AlignYOLO, a feature-aligned network based on YOLOv5, to address this misalignment. The network consists of three primary modules: the self-attention convolution (SAC) module, the feature aggregation and alignment (FAA) module, and the multiscale aligned channel attention (MSACA) module. First, the SAC module extracts information by employing convolution and self-attention simultaneously. Second, the FAA module aggregates features across layers and aligns them with a learnable interpolation strategy. Last, the MSACA module employs multiscale convolution to capture contextual information, aligns the in-layer features with the learnable interpolation strategy, and applies channel attention to enhance feature representations. Extensive experiments on benchmark datasets evaluate the effectiveness of the proposed method, with AlignYOLO outperforming state-of-the-art detectors. [ABSTRACT FROM AUTHOR]
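
Illustrative note: the abstract does not say how the learnable interpolation is implemented, so the following is only a minimal one-dimensional sketch of the general idea, not the authors' FAA or MSACA module. It assumes that alignment means shifting each sampling position by an offset before interpolating. The function names (`sample_linear`, `aligned_upsample`, `aggregate`) and the `offsets` argument are hypothetical; in a trained network the offsets would presumably be predicted by a learned layer, whereas here they are passed in by hand.

```python
from typing import List


def sample_linear(feature: List[float], pos: float) -> float:
    """Linearly interpolate a 1-D feature at fractional position `pos`.

    Positions outside the feature are clamped to the nearest edge.
    """
    pos = min(max(pos, 0.0), len(feature) - 1.0)
    left = int(pos)
    right = min(left + 1, len(feature) - 1)
    frac = pos - left
    return (1.0 - frac) * feature[left] + frac * feature[right]


def aligned_upsample(coarse: List[float], offsets: List[float], scale: int = 2) -> List[float]:
    """Upsample `coarse` to len(offsets) samples.

    Output index i normally reads the coarse feature at i / scale.
    The per-output offset (in coarse-grid units) shifts that position,
    which is the assumed stand-in for a learned alignment.
    """
    return [sample_linear(coarse, i / scale + off) for i, off in enumerate(offsets)]


def aggregate(fine: List[float], coarse: List[float], offsets: List[float], scale: int = 2) -> List[float]:
    """Add the offset-aligned, upsampled coarse feature to the fine feature."""
    upsampled = aligned_upsample(coarse, offsets, scale)
    return [f + u for f, u in zip(fine, upsampled)]


if __name__ == "__main__":
    coarse = [0.0, 2.0, 4.0]
    # Zero offsets reduce to ordinary fixed linear upsampling.
    print(aligned_upsample(coarse, [0.0] * 6))
    # Non-zero offsets shift where each output sample reads from.
    print(aligned_upsample(coarse, [0.5] * 6))
```

With all offsets set to zero the sketch behaves like plain fixed interpolation followed by addition, which is the indiscriminate aggregation the abstract criticizes; non-zero offsets let each output position read from a shifted location before the features are summed.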