Point cloud semantic segmentation with adaptive spatial structure graph transformer

Authors :
Ting Han
Yiping Chen
Jin Ma
Xiaoxue Liu
Wuming Zhang
Xinchang Zhang
Huajuan Wang
Source :
International Journal of Applied Earth Observations and Geoinformation, Vol 133, Art. 104105 (2024)
Publication Year :
2024
Publisher :
Elsevier, 2024.

Abstract

With the rapid development of LiDAR and artificial intelligence technologies, 3D point cloud semantic segmentation has become a prominent research topic. This technology can significantly enhance the capabilities of building information modeling, navigation, and environmental perception. However, current deep learning-based methods primarily rely on voxelization or multi-layer convolution for feature extraction. These methods often struggle to differentiate between homogeneous objects or structurally adherent targets in complex real-world scenes. To this end, we propose a Graph Transformer point cloud semantic segmentation network (ASGFormer) tailored for structurally adherent objects. First, ASGFormer combines a graph with a Transformer to promote understanding of global correlations in the graph. Second, a spatial index and a position embedding are constructed based on distance relationships and feature differences. Through a learnable mechanism, the structural weights between points are dynamically adjusted, achieving an adaptive spatial structure within the graph. Finally, dummy nodes are introduced to facilitate global information storage and transmission between layers, effectively addressing the issue of information loss at the terminal nodes of the graph. Comprehensive experiments are conducted on various real-world 3D point cloud datasets, analyzing the effectiveness of the proposed ASGFormer through qualitative and quantitative evaluations. ASGFormer outperforms existing approaches with 91.3% OA, 78.0% mAcc, and 72.3% mIoU on the S3DIS dataset. Moreover, ASGFormer achieves 72.8%, 45.5%, 81.6%, and 70.1% mIoU on the ScanNet, City-Facade, Toronto 3D, and Semantic KITTI datasets, respectively. Notably, the proposed method demonstrates effective differentiation of homogeneous structurally adherent objects, further contributing to the intelligent perception and modeling of complex scenes.
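
The abstract reports results as OA, mAcc, and mIoU without defining them. The sketch below shows how these three metrics are commonly computed from per-point labels via a confusion matrix. It is an illustration only, not the authors' evaluation code: the function name `segmentation_metrics`, the toy labels, and the choice to skip classes that never occur are all assumptions, and individual benchmarks may handle absent classes differently.

```python
def segmentation_metrics(y_true, y_pred, num_classes):
    """Return (OA, mAcc, mIoU) for integer per-point class labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # conf[t][p] counts points with ground-truth class t predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        conf[t][p] += 1

    total = sum(sum(row) for row in conf)
    correct = sum(conf[c][c] for c in range(num_classes))
    oa = correct / total  # overall accuracy: fraction of points labeled correctly

    accs, ious = [], []
    for c in range(num_classes):
        tp = conf[c][c]
        gt = sum(conf[c])                                   # points whose true class is c
        pred = sum(conf[r][c] for r in range(num_classes))  # points predicted as c
        union = gt + pred - tp
        # Assumption: classes absent from the data are skipped, not scored as 0.
        if gt > 0:
            accs.append(tp / gt)      # per-class accuracy (recall)
        if union > 0:
            ious.append(tp / union)   # per-class intersection over union
    return oa, sum(accs) / len(accs), sum(ious) / len(ious)


# Toy labels, made up for this example (three classes, six points).
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
oa, macc, miou = segmentation_metrics(y_true, y_pred, num_classes=3)
print(f"OA={oa:.3f}  mAcc={macc:.3f}  mIoU={miou:.3f}")
```

OA weights every point equally, so frequent classes dominate it, whereas mAcc and mIoU average over classes. This is one reason a method can report a high OA alongside a noticeably lower mIoU, as in the S3DIS figures above.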

Details

Language :
English
ISSN :
15698432
Volume :
133
Article number :
104105
Database :
Directory of Open Access Journals
Journal :
International Journal of Applied Earth Observations and Geoinformation
Publication Type :
Academic Journal
Accession number :
edsdoj.9ebccd5acd104e239f477ce475219a0c
Document Type :
article
Full Text :
https://doi.org/10.1016/j.jag.2024.104105