1. Object-Level Semantic Map Construction for Dynamic Scenes
- Author
- Xujie Kang, Chen Xu, Hongdeng Jian, Xiangtao Fan, and Jing Li
- Subjects
0209 industrial biotechnology ,Computer science ,Association (object-oriented programming) ,dynamic simultaneous localization and mapping ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,02 engineering and technology ,Simultaneous localization and mapping ,lcsh:Technology ,lcsh:Chemistry ,020901 industrial engineering & automation ,Motion estimation ,0202 electrical engineering, electronic engineering, information engineering ,General Materials Science ,Computer vision ,Instrumentation ,lcsh:QH301-705.5 ,object tracking ,Fluid Flow and Transfer Processes ,business.industry ,lcsh:T ,Process Chemistry and Technology ,Frame (networking) ,General Engineering ,Object (computer science) ,Object detection ,lcsh:QC1-999 ,Computer Science Applications ,lcsh:Biology (General) ,lcsh:QD1-999 ,lcsh:TA1-2040 ,Video tracking ,instance segmentation ,Object model ,020201 artificial intelligence & image processing ,Artificial intelligence ,business ,lcsh:Engineering (General). Civil engineering (General) ,lcsh:Physics ,camera pose tracking
- Abstract
Visual simultaneous localization and mapping (SLAM) is challenging in dynamic environments because moving objects can impair camera pose tracking and mapping. This paper introduces a method for robust, dense object-level SLAM in dynamic environments that takes a live stream of RGB-D frame data as input, detects moving objects, and segments the scene into different objects while simultaneously tracking and reconstructing their 3D structures. The approach provides a new method of dynamic object detection that integrates prior knowledge from the constructed object model database, object-oriented 3D tracking against the camera pose, and the association between the instance segmentation results on the current frame and the object database to find dynamic objects in the current frame. By leveraging the 3D static model for frame-to-model alignment, together with dynamic object culling, the camera motion estimation reduces overall drift. Using the accurate camera poses and the instance segmentation results, an object-level semantic map representation is constructed for the world map. Experimental results on the TUM RGB-D dataset, comparing the proposed method with related state-of-the-art approaches, demonstrate that our method achieves similar performance in static scenes and improved accuracy and robustness in dynamic scenes.
- Published
- 2021