ContextNet: Leveraging Comprehensive Contextual Information for Enhanced 3D Object Detection

Authors :
Caiyan Pei
Shuai Zhang
Lijun Cao
Liqiang Zhao
Source :
IEEE Access, Vol 12, Pp 106744-106756 (2024)
Publication Year :
2024
Publisher :
IEEE, 2024.

Abstract

Object detection for autonomous driving from LiDAR point clouds has progressed remarkably. However, current voxel-based two-stage detectors do not fully exploit the contextual information present in the point cloud. Voxel Feature Encoding (VFE) layers typically focus only on the information inside each voxel and neglect the broader context. In addition, extracting 3D proposal features through Region of Interest (RoI) spatial quantization and pooling-based downsampling loses spatial detail within the proposed regions. This limited ability to capture context makes accurate object detection and localization difficult, particularly at long range. In this paper, we propose ContextNet, which leverages comprehensive contextual information for enhanced 3D object detection. It comprises two modules: the Voxel Self-Attention Encoding module (VSAE) and the Joint Channel Self-Attention Re-weight module (JCSR). VSAE establishes dependencies between voxels through self-attention, expanding the receptive field and introducing substantial contextual information. JCSR employs joint attention to extract both local channel information and global context information from the raw point cloud within the RoI. Integrating these two sets of information and re-weighting the point features refines the 3D proposal, enabling a more accurate estimation of the object’s position and confidence. Extensive experiments on the KITTI and nuScenes datasets demonstrate that our approach outperforms voxel-based two-stage methods, with a 9.5% improvement in mAP over the baseline on the nuScenes test set and a 1.61% improvement in hard AP over the baseline on the KITTI benchmark.
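
The idea behind establishing dependencies between voxels through self-attention can be illustrated with a minimal sketch. This is not the authors' VSAE implementation: it assumes plain scaled dot-product attention with identity query/key/value projections, and the function names `softmax` and `voxel_self_attention` and the toy voxel features are hypothetical. It only shows how each voxel's output becomes a weighted mix of all voxel features, which is what widens the receptive field.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def voxel_self_attention(features):
    """Scaled dot-product self-attention over voxel features.

    features: list of N voxel feature vectors, each of length d.
    Assumption: queries, keys, and values are the features themselves
    (no learned projections), purely for illustration.
    Returns (outputs, weights), where weights[i][j] is how much
    voxel i attends to voxel j.
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    outputs, weights = [], []
    for query in features:
        # Similarity of this voxel to every voxel, including itself.
        scores = [scale * sum(a * b for a, b in zip(query, key))
                  for key in features]
        w = softmax(scores)
        # Each output channel is an attention-weighted average over voxels.
        out = [sum(w[j] * features[j][c] for j in range(len(features)))
               for c in range(d)]
        outputs.append(out)
        weights.append(w)
    return outputs, weights


# Toy example: three voxels with 2-dimensional features (made-up values).
voxels = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed, attn = voxel_self_attention(voxels)
```

In this sketch every voxel's output depends on all other voxels, in contrast to an encoder that only aggregates the points inside a single voxel. A real model would add learned projections and would have to limit the attention range for efficiency; those details are not given in this record.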

Details

Language :
English
ISSN :
21693536
Volume :
12
Database :
Directory of Open Access Journals
Journal :
IEEE Access
Publication Type :
Academic Journal
Accession number :
edsdoj.8cd33fa11586446ab4537de1cf7c220f
Document Type :
article
Full Text :
https://doi.org/10.1109/ACCESS.2024.3437642