Multi-Modal and Multi-Scale Fusion 3D Object Detection of 4D Radar and LiDAR for Autonomous Driving
- Source :
- IEEE Transactions on Vehicular Technology; 2023, Vol. 72 Issue: 5 p5628-5641, 14p
- Publication Year :
- 2023
-
Abstract
- Multi-modal fusion overcomes the inherent limitations of single-sensor perception in 3D object detection for autonomous driving. Fusing 4D Radar and LiDAR can extend the detection range and improve robustness. Nevertheless, the differing data characteristics and noise distributions of the two sensors hinder performance when they are integrated directly. Therefore, we are the first to propose a novel fusion method, termed <inline-formula><tex-math notation="LaTeX">$M^{2}$</tex-math></inline-formula>-Fusion, for 4D Radar and LiDAR, based on multi-modal and multi-scale fusion. To better integrate the two sensors, we propose an Interaction-based Multi-Modal Fusion (IMMF) method that uses a self-attention mechanism to learn features from each modality and exchange intermediate-layer information. To address the balance between precision and efficiency in current single-resolution voxel division, we also put forward a Center-based Multi-Scale Fusion (CMSF) method that first regresses the center points of objects and then extracts features at multiple resolutions. Furthermore, we present a data preprocessing method based on a Gaussian distribution that effectively suppresses noise, reducing errors caused by the point cloud divergence of 4D Radar data in the <inline-formula><tex-math notation="LaTeX">$x$</tex-math></inline-formula>-<inline-formula><tex-math notation="LaTeX">$z$</tex-math></inline-formula> plane. To evaluate the proposed fusion method, a series of experiments was conducted on the Astyx HiRes 2019 dataset, which includes calibrated 4D Radar and 16-line LiDAR data. The results demonstrate that our fusion method compares favorably with state-of-the-art algorithms.
Compared to PointPillars, our method improves mAP (mean average precision) by 5.64<inline-formula><tex-math notation="LaTeX">$\%$</tex-math></inline-formula> and 13.57<inline-formula><tex-math notation="LaTeX">$\%$</tex-math></inline-formula> for 3D and BEV (bird's eye view) detection of the car class at the moderate difficulty level, respectively.
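The Gaussian-distribution-based preprocessing described in the abstract can be illustrated with a minimal sketch: treat inlier radar returns as roughly Gaussian along each axis and drop points whose x or z coordinate lies far from the per-axis mean. The function name `gaussian_denoise`, the threshold parameter `k`, and the per-axis standard-deviation test are all illustrative assumptions; the paper's exact fitting and thresholding procedure is not given in this record.

```python
import numpy as np

def gaussian_denoise(points, k=2.0):
    """Drop 4D Radar points whose x or z coordinate deviates more than
    k standard deviations from the per-axis mean, assuming inliers are
    roughly Gaussian-distributed (a sketch, not the paper's exact method).

    points: (N, 3) array of [x, y, z] coordinates.
    Returns the filtered (M, 3) array with M <= N.
    """
    mu = points.mean(axis=0)
    sigma = points.std(axis=0)
    keep = np.ones(len(points), dtype=bool)
    # Only the x (index 0) and z (index 2) axes are checked, matching the
    # x-z plane divergence described in the abstract.
    for axis in (0, 2):
        if sigma[axis] > 0:
            keep &= np.abs(points[:, axis] - mu[axis]) <= k * sigma[axis]
    return points[keep]
```

A tighter `k` discards more aggressively; in practice the threshold would be tuned on the radar's measured noise statistics.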
Details
- Language :
- English
- ISSN :
- 0018-9545
- Volume :
- 72
- Issue :
- 5
- Database :
- Supplemental Index
- Journal :
- IEEE Transactions on Vehicular Technology
- Publication Type :
- Periodical
- Accession number :
- ejs63073785
- Full Text :
- https://doi.org/10.1109/TVT.2022.3230265