
Noise-tolerant RGB-D feature fusion network for outdoor fruit detection.

Authors :
Sun, Qixin
Chai, Xiujuan
Zeng, Zhikang
Zhou, Guomin
Sun, Tan
Source :
Computers & Electronics in Agriculture. Jul 2022, Vol. 198, pN.PAG-N.PAG. 1p.
Publication Year :
2022

Abstract

• Proposing an end-to-end noise-tolerant feature fusion network to integrate outdoor RGB-D information.
• Designing an attention-based fusion module to eliminate the negative influence of depth noise and make full use of the complementary multi-modal features.
• Learning object position from multi-scale features from color images and fusion modules, further improving the network's noise immunity.
• Achieving accurate fruit detection in outdoor scenes at real-time speed.

In farm automation, fruit detection is the basis for yield prediction, automatic picking, and other orchard operations. RGB images capture only two-dimensional information about the scene, which is not sufficient to distinguish fruits that grow densely and are occluded by branches and leaves. With the development of depth sensors, RGB-D images, which carry complementary information, can boost the performance of fruit detection. However, due to the nature of the sensors and scene configurations, the quality of outdoor depth images is poor, which poses a challenge when fusing RGB-D features. Therefore, this paper proposes an end-to-end RGB-D object detection network, termed the noise-tolerant feature fusion network (NT-FFN), to use outdoor multi-modal data properly and improve detection accuracy. Specifically, the NT-FFN first uses two structurally identical feature extractors to extract single-modal (color and depth) features, which form the basis of the subsequent feature fusion. Then, to avoid introducing too much depth noise and to focus perception on the important parts of the features, an attention-based fusion module is designed to adaptively fuse the multi-modal features. Finally, multi-scale features from the color images and the fusion modules are used to predict object position, which not only improves the network's ability to detect multi-scale objects but also further enhances its noise immunity.
In addition, this paper constructs an RGB-D citrus fruit dataset, which supports a comprehensive evaluation of the proposed network. Evaluation on the dataset shows that the NT-FFN achieves an AP50 of 95.4% at real-time speed, outperforming single-modal methods, common multi-modal fusion strategies, and advanced multi-modal detection methods. The proposed NT-FFN also achieves excellent detection results in other fruit detection tasks, which verifies its generalization ability. This study provides a foundation for performing multi-modal information fusion in outdoor fruit detection. [ABSTRACT FROM AUTHOR]
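
Illustrative note (not part of the author's abstract): the abstract describes an attention-based module that adaptively fuses color and depth features while limiting depth noise, but it does not specify the module's design. The following is a minimal sketch of one generic way such gating could work, assuming a per-channel sigmoid gate that scales the depth features before they are added to the color features. The function names, the gate formula, and the `weights` and `biases` parameters are all hypothetical and are not taken from the paper; in a trained network the parameters would be learned rather than passed in by hand.

```python
import math
from typing import List

# A feature map is a list of channels; each channel is a flat list of values.
FeatureMap = List[List[float]]


def channel_means(features: FeatureMap) -> List[float]:
    """Global average pooling: one summary value per channel."""
    return [sum(channel) / len(channel) for channel in features]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def attention_fuse(
    rgb: FeatureMap,
    depth: FeatureMap,
    weights: List[float],
    biases: List[float],
) -> FeatureMap:
    """Hypothetical gated fusion (an assumption, not the NT-FFN module).

    For each channel c, a gate in (0, 1) is computed from the pooled
    color and depth descriptors, and the output is rgb + gate * depth.
    A gate near 0 suppresses the depth channel, so the color features
    pass through almost unchanged.
    """
    if len(rgb) != len(depth):
        raise ValueError("rgb and depth must have the same number of channels")

    rgb_desc = channel_means(rgb)
    depth_desc = channel_means(depth)

    fused: FeatureMap = []
    for c in range(len(rgb)):
        gate = sigmoid(weights[c] * (rgb_desc[c] + depth_desc[c]) + biases[c])
        fused.append([r + gate * d for r, d in zip(rgb[c], depth[c])])
    return fused


if __name__ == "__main__":
    rgb = [[1.0, 2.0], [3.0, 4.0]]
    depth = [[10.0, 20.0], [30.0, 40.0]]
    # Zero weights and biases give a gate of exactly 0.5 on every channel.
    print(attention_fuse(rgb, depth, [0.0, 0.0], [0.0, 0.0]))
    # Strongly negative biases drive the gates toward 0 (depth suppressed).
    print(attention_fuse(rgb, depth, [0.0, 0.0], [-50.0, -50.0]))
```

In this sketch the color features always pass through unscaled and only the depth contribution is gated, which is one simple way to keep a noisy depth channel from degrading the fused output. Whether NT-FFN uses this arrangement is not stated in the abstract.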

Details

Language :
English
ISSN :
01681699
Volume :
198
Database :
Academic Search Index
Journal :
Computers & Electronics in Agriculture
Publication Type :
Academic Journal
Accession number :
157498569
Full Text :
https://doi.org/10.1016/j.compag.2022.107034