194 results for "6D pose estimation"
Search Results
2. EdgePose: An Edge Attention Network for 6D Pose Estimation.
- Author
Feng, Qi, Nong, Jian, and Liang, Yanyan
- Subjects
DEEP learning, ACCURACY of information
- Abstract
We propose a 6D pose estimation method that introduces an edge attention mechanism into the bidirectional feature fusion network. Our method constructs an end-to-end network model by sharing weights between the edge detection encoder and the encoder of the RGB branch in the feature fusion network, effectively utilizing edge information and improving the accuracy and robustness of 6D pose estimation. Experimental results show that this method achieves an accuracy of nearly 100% on the LineMOD dataset, and it also achieves state-of-the-art performance on the YCB-V dataset, especially on objects with significant edge information. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
3. A RGB-D feature fusion network for occluded object 6D pose estimation.
- Author
Song, Yiwei and Tang, Chunhui
- Abstract
6D pose estimation using RGB-D data has been widely utilized in various scenarios, with keypoint-based methods receiving significant attention due to their exceptional performance. However, these methods still face numerous challenges, especially when the object is heavily occluded or truncated. To address this issue, we propose a novel cross-modal fusion network. Specifically, our approach initially employs object detection to identify the potential position of the object and randomly samples within this region. Subsequently, a specially designed feature extraction network is utilized to extract appearance features from the RGB image and geometry features from the depth image respectively; these features are then implicitly aggregated through cross-modal fusion. Finally, keypoints are employed for estimating the pose of the object. The proposed method undergoes extensive testing on Occlusion Linemod and Truncation Linemod datasets. Experimental results demonstrate that our method has made significant advancements, thereby validating the effectiveness of cross-modal feature fusion strategy in enhancing the accuracy of RGB-D image pose estimation based on keypoints. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
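Several records in this list (entries 3, 22, and 27, among others) extract geometric features from the depth channel of an aligned RGB-D frame. As background, here is a minimal numpy sketch of the standard pinhole back-projection that turns a depth image into a camera-frame point cloud; the intrinsics in the example call are illustrative assumptions, not values from any cited paper.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (meters) into an Nx3 camera-frame point cloud."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) * depth / fx   # pinhole model: X = (u - cx) * Z / fx
    y = (v - cy) * depth / fy
    pts = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    return pts[pts[:, 2] > 0]   # drop invalid zero-depth pixels

# Hypothetical intrinsics for a 640x480 sensor, for illustration only.
cloud = depth_to_point_cloud(np.ones((480, 640)), fx=600.0, fy=600.0, cx=320.0, cy=240.0)
```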
4. Deep Learning-Based Real-Time 6D Pose Estimation and Multi-Mode Tracking Algorithms for Citrus-Harvesting Robots.
- Author
Hwang, Hyun-Jung, Cho, Jae-Hoon, and Kim, Yong-Tae
- Subjects
AGRICULTURAL robots, DEEP learning, FRUIT harvesting, VIRTUAL reality, TRACKING algorithms
- Abstract
In the agricultural sector, utilizing robots for tasks such as fruit harvesting poses significant challenges, particularly in achieving accurate 6D pose estimation of the target objects, which is essential for precise and efficient harvesting. Particularly, fruit harvesting relies heavily on manual labor, leading to issues with an unstable labor supply and rising costs. To solve these problems, agricultural harvesting robots are gaining attention. However, effective harvesting necessitates accurate 6D pose estimation of the target object. This study proposes a method to enhance the performance of fruit-harvesting robots, including the development of a dataset named HWANGMOD, which was created using both virtual and real environments with tools such as Blender and BlenderProc. Additionally, we present methods for training an EfficientPose-based model for 6D pose estimation and ripeness classification, and an algorithm for determining the optimal harvest sequence among multiple fruits. Finally, we propose a multi-object tracking method using coordinates estimated by deep learning models to improve the robot's performance in dynamic environments. The proposed methods were evaluated using metrics such as ADD and ADD-S, showing that the deep learning model for agricultural harvesting robots excelled in accuracy, robustness, and real-time processing. These advancements contribute to the potential for commercialization of agricultural harvesting robots and the broader field of agricultural automation technology. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
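Entries 4 and 18 (and many of the records below) report the ADD and ADD-S metrics. A minimal numpy/scipy sketch of their standard definitions: ADD averages the distance between corresponding model points under the ground-truth and predicted poses, while ADD-S substitutes the closest-point distance for symmetric objects; a pose is conventionally accepted when the score falls below 10% of the model diameter.

```python
import numpy as np
from scipy.spatial import cKDTree

def add_metric(model_pts, R_gt, t_gt, R_pred, t_pred):
    """ADD: mean distance between corresponding model points under two poses."""
    gt = model_pts @ R_gt.T + t_gt
    pred = model_pts @ R_pred.T + t_pred
    return np.mean(np.linalg.norm(gt - pred, axis=1))

def add_s_metric(model_pts, R_gt, t_gt, R_pred, t_pred):
    """ADD-S: closest-point variant of ADD for symmetric objects."""
    gt = model_pts @ R_gt.T + t_gt
    pred = model_pts @ R_pred.T + t_pred
    dists, _ = cKDTree(pred).query(gt, k=1)
    return np.mean(dists)
```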
5. Point Cloud Fiducial Marker Detection System Fusing Regions of Interest and Range Images (感兴趣区域与距离图像融合的点云基准标记检测系统).
- Author
刘博文, 曾 碧, and 刘建圻
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
6. A lightweight method of pose estimation for indoor object.
- Author
Wang, Sijie, Li, Yifei, Chen, Diansheng, Li, Jiting, and Zhang, Xiaochuan
- Subjects
FIX-point estimation, POINT cloud, OPTICAL rotation, COST, MOBILE robots, STORAGE
- Abstract
Due to the multiple types of objects and the uncertainty of their geometric structures and scales in indoor scenes, the position and pose estimation of point clouds of indoor objects by mobile robots suffers from domain gap, high learning cost, and high computing cost. In this paper, a lightweight 6D pose estimation method is proposed, which decomposes the pose estimation into a viewpoint and the in-plane rotation around the optical axis of the viewpoint. An improved PointNet++ network structure and two lightweight modules are used to construct a codebook, and the 6D pose estimation of the point cloud of the indoor objects is completed by building and querying the codebook. The model was trained on the ShapeNetV2 dataset and validated with the ADD-S metric on the YCB-Video and LineMOD datasets, reaching 97.0% and 94.6%, respectively. The experiment shows that the model can be trained to estimate the 6D pose of unknown object point clouds at lower computation and storage cost, and that the model, with fewer parameters and better real-time performance, outperforms other high-precision methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Weakly Supervised Pose Estimation of Surgical Instrument from a Single Endoscopic Image.
- Author
Hu, Lihua, Feng, Shida, and Wang, Bo
- Subjects
POSE estimation (Computer vision), SURGICAL instruments, COMPUTER-assisted surgery, IMAGE segmentation, SUPERVISED learning, INSTRUMENTAL variables (Statistics)
- Abstract
Instrument pose estimation is a key demand in computer-aided surgery, and its main challenges lie in two aspects: Firstly, the difficulty of obtaining stable corresponding image feature points due to the instruments' high refraction and complicated background, and secondly, the lack of labeled pose data. This study aims to tackle the pose estimation problem of surgical instruments in the current endoscope system using a single endoscopic image. More specifically, a weakly supervised method based on the instrument's image segmentation contour is proposed, with the effective assistance of synthesized endoscopic images. Our method consists of the following three modules: a segmentation module to automatically detect the instrument in the input image, followed by a point inference module to predict the image locations of the implicit feature points of the instrument, and a point back-propagatable Perspective-n-Point module to estimate the pose from the tentative 2D–3D corresponding points. To alleviate the over-reliance on point correspondence accuracy, the local errors of feature point matching and the global inconsistency of the corresponding contours are simultaneously minimized. Our proposed method is validated with both real and synthetic images in comparison with the current state-of-the-art methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
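Entry 7's pipeline ends in a back-propagatable Perspective-n-Point module. The classical, non-differentiable PnP step that such modules build on can be sketched with OpenCV; the keypoints and intrinsics below are placeholders, not the paper's data.

```python
import numpy as np
import cv2

# Placeholder 3D keypoints (object frame) and their predicted 2D image locations.
object_pts = np.array([[0, 0, 0], [0.1, 0, 0], [0, 0.1, 0], [0, 0, 0.1],
                       [0.1, 0.1, 0], [0.1, 0, 0.1]], dtype=np.float32)
image_pts = np.array([[320, 240], [400, 238], [322, 170], [318, 300],
                      [398, 168], [402, 302]], dtype=np.float32)
K = np.array([[600, 0, 320], [0, 600, 240], [0, 0, 1]], dtype=np.float32)

# Recover the 6D pose from the 2D-3D correspondences.
ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, None, flags=cv2.SOLVEPNP_EPNP)
R, _ = cv2.Rodrigues(rvec)  # axis-angle vector -> 3x3 rotation matrix
```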
8. A Robust CoS-PVNet Pose Estimation Network in Complex Scenarios.
- Author
Yong, Jiu, Lei, Xiaomei, Dang, Jianwu, and Wang, Yangping
- Subjects
AUGMENTED reality, POSE estimation (Computer vision), VIRTUAL reality, AUTONOMOUS vehicles, ALGORITHMS, ROBOTICS
- Abstract
Object 6D pose estimation, as a key technology in applications such as augmented reality (AR), virtual reality (VR), robotics, and autonomous driving, requires the robust prediction of the 3D position and 3D orientation of objects from complex scene images. However, complex environmental factors such as occlusion, noise, weak texture, and lighting changes may affect the accuracy and robustness of object 6D pose estimation. We propose a robust CoS-PVNet (complex scenarios pixel-wise voting network) pose estimation network for complex scenes. By adding a pixel-weight layer on top of the PVNet network, more accurate pixel vectors are selected, and dilated convolution and adaptive weighting strategies are used to capture local and global contextual information of the input feature map. At the same time, the Perspective-n-Point algorithm is used to accurately locate 2D key points and solve the 6D object pose, from which the transformation matrix of the 6D pose projection is recovered. The research results indicate that on the LineMod and Occlusion LineMod datasets, CoS-PVNet has high accuracy and can achieve stable and robust 6D pose estimation even in complex scenes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
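Entry 8 builds on PVNet's pixel-wise voting. The core voting idea, stripped of the paper's network specifics, is: every pixel casts a unit vector toward a keypoint, random pairs of rays are intersected to form hypotheses, and the hypothesis supported by the most pixels wins. A schematic numpy sketch, not the authors' implementation:

```python
import numpy as np

def vote_keypoint(pixels, dirs, n_hyp=128, inlier_cos=0.99, rng=np.random):
    """pixels: (N, 2) float coordinates; dirs: (N, 2) unit vectors toward a keypoint."""
    best, best_score = None, -1
    for _ in range(n_hyp):
        i, j = rng.choice(len(pixels), 2, replace=False)
        d1, d2 = dirs[i], dirs[j]
        denom = d1[0] * d2[1] - d1[1] * d2[0]          # 2D cross product
        if abs(denom) < 1e-6:                          # near-parallel rays
            continue
        dp = pixels[j] - pixels[i]
        t = (dp[0] * d2[1] - dp[1] * d2[0]) / denom    # ray-ray intersection
        h = pixels[i] + t * d1                         # hypothesised keypoint
        v = h - pixels
        v /= np.linalg.norm(v, axis=1, keepdims=True) + 1e-9
        score = np.sum(np.sum(v * dirs, axis=1) > inlier_cos)  # count inlier voters
        if score > best_score:
            best, best_score = h, score
    return best, best_score
```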
9. Real-Time 6-DoF Object Pose Estimation Network Based on 2D-3D Coordinate Correspondence
- Author
Li, Shuai, Chen, Jinlong, Yang, Minghao, Su, Jianhua, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Yadav, Sanjay, editor, Arya, Yogendra, editor, Pandey, Shailesh M., editor, Gherabi, Noredine, editor, and Karras, Dimitrios A., editor
- Published
- 2024
- Full Text
- View/download PDF
10. RTFT6D: A Real-Time 6D Pose Estimation with Fusion Transformer
- Author
Zhang, Qianwen, Zhang, Li, Dai, Cen, Huang, Huan, Liu, Liaoxue, Guo, Jian, Guo, Yu, Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Tan, Kay Chen, Series Editor, Qu, Yi, editor, Gu, Mancang, editor, Niu, Yifeng, editor, and Fu, Wenxing, editor
- Published
- 2024
- Full Text
- View/download PDF
11. NMPose: Leveraging Normal Maps for 6D Pose Estimation
- Author
Liao, Wenhua, Pei, Songwei, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Luo, Biao, editor, Cheng, Long, editor, Wu, Zheng-Guang, editor, Li, Hongyi, editor, and Li, Chaojie, editor
- Published
- 2024
- Full Text
- View/download PDF
12. DON6D: a decoupled one-stage network for 6D pose estimation
- Author
Zheng Wang, Hangyao Tu, Yutong Qian, and Yanwei Zhao
- Subjects
6D pose estimation, Deep learning, Real-time method, Medicine, Science
- Abstract
Six-dimensional (6D) object pose estimation is a key task in robotic manipulation and grasping scenes. Many existing two-stage solutions with a slow inference speed require extra refinement to handle the challenges of variations in lighting, sensor noise, object occlusion, and truncation. To address these challenges, this work proposes a decoupled one-stage network (DON6D) model for 6D pose estimation that improves inference speed on the premise of maintaining accuracy. Particularly, since the RGB images are aligned with the RGB-D images, the proposed DON6D first uses a two-dimensional detection network to locate the objects of interest in RGB-D images. Then, a module of feature extraction and fusion is used to fully extract color and geometric features. Further, dual data augmentation is performed to enhance the generalization ability of the proposed model. Finally, the features are fused, and an attention residual encoder–decoder, which can improve the pose estimation performance to obtain an accurate 6D pose, is introduced. The proposed DON6D model is evaluated on the LINEMOD and YCB-Video datasets. The results demonstrate that the proposed DON6D is superior to several state-of-the-art methods regarding the ADD(-S) and ADD(-S) AUC metrics.
- Published
- 2024
- Full Text
- View/download PDF
13. Context-aware 6D pose estimation of known objects using RGB-D data.
- Author
Kumar, Ankit, Shukla, Priya, Kushwaha, Vandana, and Nandi, Gora Chand
- Subjects
POSE estimation (Computer vision), COMPUTER vision, ROBOTICS, DEEP learning
- Abstract
In the realm of computer vision and robotics, the pursuit of intelligent robotic grasping and accurate 6D object pose estimation has been a focal point of research. Many modern-world applications, such as robot grasping, manipulation, and palletizing, require the correct pose of objects present in a scene to perform their specific tasks. The estimation of a 6D object pose becomes even more challenging due to inherent complexities, especially when dealing with objects positioned within cluttered scenes and subjected to high levels of occlusion. While prior endeavors have made strides in addressing this issue, their accuracy falls short of the reliability demanded by real-world applications. In this research, we present an architecture that, unlike prior works, incorporates contextual awareness. This novel approach capitalizes on the contextual information attainable about the objects in question. The framework we propose takes a dissection approach, discerning objects by their intrinsic characteristics, namely whether they are symmetric or non-symmetric. Notably, our methodology employs a more profound estimator and refiner network tandem for non-symmetric objects, in contrast to symmetric ones. This distinction acknowledges the inherent dissimilarities between the two object types, thereby enhancing performance. Through experiments conducted on the LineMOD dataset, widely regarded as a benchmark for pose estimation in occluded and cluttered scenes, we demonstrate a notable improvement in accuracy of approximately 3.2% compared to the previous state-of-the-art method, DenseFusion. Moreover, our results indicate that the achieved inference time is sufficient for real-time usage. Overall, our proposed architecture leverages contextual information and tailors the pose estimation process based on object types, leading to enhanced accuracy and real-time performance in challenging scenarios. Code is available at GitHub link [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
14. A 6D Pose Estimation Method Based on Point Cloud Instance Segmentation (一种基于点云实例分割的六维位姿估计方法).
- Author
周剑
- Abstract
Copyright of Cyber Security & Data Governance is the property of Editorial Office of Information Technology & Network Security and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
- Full Text
- View/download PDF
15. DON6D: a decoupled one-stage network for 6D pose estimation.
- Author
Wang, Zheng, Tu, Hangyao, Qian, Yutong, and Zhao, Yanwei
- Subjects
DATA augmentation, FEATURE extraction, DEEP learning
- Abstract
Six-dimensional (6D) object pose estimation is a key task in robotic manipulation and grasping scenes. Many existing two-stage solutions with a slow inference speed require extra refinement to handle the challenges of variations in lighting, sensor noise, object occlusion, and truncation. To address these challenges, this work proposes a decoupled one-stage network (DON6D) model for 6D pose estimation that improves inference speed on the premise of maintaining accuracy. Particularly, since the RGB images are aligned with the RGB-D images, the proposed DON6D first uses a two-dimensional detection network to locate the objects of interest in RGB-D images. Then, a module of feature extraction and fusion is used to fully extract color and geometric features. Further, dual data augmentation is performed to enhance the generalization ability of the proposed model. Finally, the features are fused, and an attention residual encoder–decoder, which can improve the pose estimation performance to obtain an accurate 6D pose, is introduced. The proposed DON6D model is evaluated on the LINEMOD and YCB-Video datasets. The results demonstrate that the proposed DON6D is superior to several state-of-the-art methods regarding the ADD(-S) and ADD(-S) AUC metrics. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
16. 6D Pose Estimation on Point Cloud Data through Prior Knowledge Integration: A Case Study in Autonomous Disassembly.
- Author
Wu, Chengzhi, Fu, Hao, Kaiser, Jan-Philipp, Barczak, Erik Tabuchi, Pfrommer, Julius, Lanza, Gisela, Heizmann, Michael, and Beyerer, Jürgen
- Abstract
The accurate estimation of 6D pose remains a challenging task within the computer vision domain, even when utilizing 3D point cloud data. Conversely, in the manufacturing domain, instances arise where leveraging prior knowledge can yield advancements in this endeavor. This study focuses on the disassembly of starter motors to augment the engineering of product life cycles. A pivotal objective in this context involves the identification and 6D pose estimation of bolts affixed to the motors, facilitating automated disassembly within the manufacturing workflow. Complicating matters, the presence of occlusions and the limitations of single-view data acquisition, notably when motors are placed in a clamping system, obscure certain portions and render some bolts imperceptible. Consequently, the development of a comprehensive pipeline capable of acquiring complete bolt information is imperative to avoid oversight in bolt detection. In this paper, employing the task of bolt detection within the scope of our project as a pertinent use case, we introduce a meticulously devised pipeline. This multi-stage pipeline effectively captures the 6D information with regard to all bolts on the motor, thereby showcasing the effective utilization of prior knowledge in handling this challenging task. The proposed methodology not only contributes to the field of 6D pose estimation but also underscores the viability of integrating domain-specific insights to tackle complex problems in manufacturing and automation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
17. A Review on Six Degrees of Freedom (6D) Pose Estimation for Robotic Applications
- Author
Chen Yuanwei, Mohd Hairi Mohd Zaman, and Mohd Faisal Ibrahim
- Subjects
Deep learning, 6D pose estimation, point cloud, robotic, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
With advancements in technology, deep learning has become increasingly widespread, particularly in fields like robot control, computer vision, and autonomous driving. In these areas, obtaining pose information of target objects, especially their spatial location, is crucial for robot grasping tasks. Although many effective implementations of six degrees of freedom (6D) pose estimation methods based on RGB images exist, challenges in this domain persist. This paper provides a comprehensive review of traditional 6D pose estimation methods, deep learning approaches, and point cloud techniques by analyzing their advantages and disadvantages. It also discusses evaluation metrics and performance on common datasets for 6D pose estimation. Furthermore, the paper offers a theoretical foundation for robot grasping and explores future directions for 6D pose estimation. Finally, it summarizes the current state and development trends of 6D pose estimation, aiming to help researchers better understand and learn about this field.
- Published
- 2024
- Full Text
- View/download PDF
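For readers arriving via this review, the quantity that every record in this list estimates is the rigid transform between object and camera coordinates, i.e., three rotational plus three translational degrees of freedom:

```latex
\[
\mathbf{x}_{\text{cam}} = \mathbf{R}\,\mathbf{x}_{\text{obj}} + \mathbf{t},
\qquad \mathbf{R} \in SO(3),\ \mathbf{t} \in \mathbb{R}^{3}.
\]
```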
18. Deep Learning-Based Real-Time 6D Pose Estimation and Multi-Mode Tracking Algorithms for Citrus-Harvesting Robots
- Author
Hyun-Jung Hwang, Jae-Hoon Cho, and Yong-Tae Kim
- Subjects
6D pose estimation, deep learning, dataset construction, SORT, tracking, virtual environment, Mechanical engineering and machinery, TJ1-1570
- Abstract
In the agricultural sector, utilizing robots for tasks such as fruit harvesting poses significant challenges, particularly in achieving accurate 6D pose estimation of the target objects, which is essential for precise and efficient harvesting. Particularly, fruit harvesting relies heavily on manual labor, leading to issues with an unstable labor supply and rising costs. To solve these problems, agricultural harvesting robots are gaining attention. However, effective harvesting necessitates accurate 6D pose estimation of the target object. This study proposes a method to enhance the performance of fruit-harvesting robots, including the development of a dataset named HWANGMOD, which was created using both virtual and real environments with tools such as Blender and BlenderProc. Additionally, we present methods for training an EfficientPose-based model for 6D pose estimation and ripeness classification, and an algorithm for determining the optimal harvest sequence among multiple fruits. Finally, we propose a multi-object tracking method using coordinates estimated by deep learning models to improve the robot's performance in dynamic environments. The proposed methods were evaluated using metrics such as ADD and ADD-S, showing that the deep learning model for agricultural harvesting robots excelled in accuracy, robustness, and real-time processing. These advancements contribute to the potential for commercialization of agricultural harvesting robots and the broader field of agricultural automation technology.
- Published
- 2024
- Full Text
- View/download PDF
19. Enhancing Inter-AUV Perception: Adaptive 6-DOF Pose Estimation with Synthetic Images for AUV Swarm Sensing
- Author
Qingbo Wei, Yi Yang, Xingqun Zhou, Zhiqiang Hu, Yan Li, Chuanzhi Fan, Quan Zheng, and Zhichao Wang
- Subjects
Autonomous Underwater Vehicles (AUVs), 6D pose estimation, underwater perception, environmental adaptation, synthetic underwater images, Motor vehicles. Aeronautics. Astronautics, TL1-4050
- Abstract
The capabilities of AUV mutual perception and localization are crucial for the development of AUV swarm systems. We propose the AUV6D model, a synthetic image-based approach to enhance inter-AUV perception through 6D pose estimation. Due to the challenge of acquiring accurate 6D pose data, a dataset of simulated underwater images with precise pose labels was generated using Unity3D. Mask-CycleGAN technology was introduced to transform these simulated images into realistic synthetic images, addressing the scarcity of available underwater data. Furthermore, the Color Intermediate Domain Mapping strategy is proposed to ensure alignment across different image styles at pixel and feature levels, enhancing the adaptability of the pose estimation model. Additionally, the Salient Keypoint Vector Voting Mechanism was developed to improve the accuracy and robustness of underwater pose estimation, enabling precise localization even in the presence of occlusions. The experimental results demonstrated that our AUV6D model achieved millimeter-level localization precision and pose estimation errors within five degrees, showing exceptional performance in complex underwater environments. Navigation experiments with two AUVs further verified the model’s reliability for mutual 6D pose estimation. This research provides substantial technical support for more complex and precise collaborative operations for AUV swarms in the future.
- Published
- 2024
- Full Text
- View/download PDF
20. CMT-6D: a lightweight iterative 6DoF pose estimation network based on cross-modal Transformer
- Author
Liu, Suyi, Xu, Fang, Wu, Chengdong, Chi, Jianning, Yu, Xiaosheng, Wei, Longxing, and Leng, Chuanjiang
- Published
- 2024
- Full Text
- View/download PDF
21. Analysis of Optimization Techniques in 6D Pose Estimation Approaches using RGB Images on Multiple Objects with Occlusion.
- Author
Nugroho, Budi, Suciati, Nanik, and Fatichah, Chastine
- Subjects
MATHEMATICAL optimization, DEEP learning, POSE estimation (Computer vision), PROBLEM solving, ANALYSIS of variance, STATISTICAL significance
- Abstract
6D pose estimation is very important for supporting future smart technologies. The previous methods show optimal performance on RGB-D images or single objects. However, the problem still occurs in RGB images or multiple objects with occlusion. This study focuses on solving the problem using a deep learning approach. One of the key components of deep learning is the optimization process, which we research to determine its effect on solving the problem. The research methodology includes implementing the optimization techniques in the methods, measuring loss value, measuring performance, observing experimental results, analyzing statistical significance, and comparing the performance of optimizers. We implement Adam, RMSprop, Adagrad, Adadelta, and SGD optimizers and analyze their effects on the EfficientPose and DPOD methods. We use the LineMod-Occluded dataset to measure the performance of the methods using the ADD metric. According to the experiment, the loss value is low and stable in the experimental scenarios with a number of epochs between 200 and 500. The performance is relatively high in those scenarios, where Adadelta's performance outperforms other optimizers on both methods. Based on the analysis of variance, the effect of optimizers on the performance of the methods is low, but the slight performance increase is significant in this case. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
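Entry 21 swaps five stock optimizers into EfficientPose and DPOD training. In a PyTorch-style setup the swap is a one-line change; the learning rates below are common defaults chosen for illustration, not the values used in the study.

```python
import torch

def make_optimizer(name, params):
    """Return one of the five optimizers compared in the study (assumed hyperparameters)."""
    table = {
        "adam":     lambda: torch.optim.Adam(params, lr=1e-3),
        "rmsprop":  lambda: torch.optim.RMSprop(params, lr=1e-3),
        "adagrad":  lambda: torch.optim.Adagrad(params, lr=1e-2),
        "adadelta": lambda: torch.optim.Adadelta(params),  # largely self-tuning
        "sgd":      lambda: torch.optim.SGD(params, lr=1e-2, momentum=0.9),
    }
    return table[name]()

model = torch.nn.Linear(16, 4)  # stand-in for a pose estimation network
opt = make_optimizer("adadelta", model.parameters())
```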
22. EFN6D: an efficient RGB-D fusion network for 6D pose estimation.
- Author
Wang, Yaming, Jiang, Xiaoyan, Fujita, Hamido, Fang, Zhijun, Qiu, Xihe, and Chen, Jue
- Abstract
Precise 6DoF (6D) object pose estimation is an essential topic for many intelligent applications, for example, robot grasping, virtual reality, and autonomous driving. Lacking depth information, traditional pose estimators using only RGB cameras consistently predict biased 3D rotation and translation matrices. With the wide use of RGB-D cameras, we can directly capture both the depth of the object relative to the camera and the corresponding RGB image. Most existing methods concatenate these two data sources directly, which does not make full use of their complementary relationship. Therefore, we propose an efficient RGB-D fusion network for 6D pose estimation, called EFN6D, to exploit the 2D–3D features more thoroughly. Instead of directly using the original single-channel depth map, we encode the depth information into a normal map and point cloud data. To effectively fuse the surface texture features and the geometric contour features of the object, we feed the RGB images and the normal map into two ResNets. Besides, PSP modules and skip connections are used between the two ResNets, which not only enhances the cross-modal fusion performance of the network but also enhances the network's capability in handling objects at different scales. Finally, the fused features obtained from these two ResNets and the point cloud features are densely fused point by point to further strengthen the fusion of 2D and 3D information at a per-pixel level. Experiments on the LINEMOD and YCB-Video datasets show that our EFN6D outperforms state-of-the-art methods by a large margin. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
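EFN6D (entry 22) encodes depth as a normal map before fusion. One common way to derive such a map, sketched here under the same assumed pinhole intrinsics as the back-projection snippet near the top of this list, is to take cross products of image-space gradients of the back-projected points:

```python
import numpy as np

def depth_to_normals(depth, fx, fy, cx, cy):
    """Per-pixel surface normals from a depth image via central differences."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pts = np.dstack([(u - cx) * depth / fx, (v - cy) * depth / fy, depth])
    dx = np.gradient(pts, axis=1)   # 3D derivative along image x
    dy = np.gradient(pts, axis=0)   # 3D derivative along image y
    n = np.cross(dx, dy)            # normal = cross product of tangent vectors
    return n / (np.linalg.norm(n, axis=2, keepdims=True) + 1e-9)
```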
23. EdgePose: An Edge Attention Network for 6D Pose Estimation
- Author
Qi Feng, Jian Nong, and Yanyan Liang
- Subjects
6D pose estimation, edge attention, feature fusion, deep learning, mixed reality, Mathematics, QA1-939
- Abstract
We propose a 6D pose estimation method that introduces an edge attention mechanism into the bidirectional feature fusion network. Our method constructs an end-to-end network model by sharing weights between the edge detection encoder and the encoder of the RGB branch in the feature fusion network, effectively utilizing edge information and improving the accuracy and robustness of 6D pose estimation. Experimental results show that this method achieves an accuracy of nearly 100% on the LineMOD dataset, and it also achieves state-of-the-art performance on the YCB-V dataset, especially on objects with significant edge information.
- Published
- 2024
- Full Text
- View/download PDF
24. Distance-Aware Vector-Field and Vector Screening Strategy for 6D Object Pose Estimation
- Author
Wang, Lichun, Yang, Chao, Xin, Jianjia, Yin, Baocai, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Lu, Huchuan, editor, Ouyang, Wanli, editor, Huang, Hui, editor, Lu, Jiwen, editor, Liu, Risheng, editor, Dong, Jing, editor, and Xu, Min, editor
- Published
- 2023
- Full Text
- View/download PDF
25. Realtime 3D Reconstruction at Scale and Object Pose Estimation for Bin Picking System
- Author
Wang, Nianfeng, Lin, Weida, Lin, Junye, Zhang, Xianmin, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Yang, Huayong, editor, Liu, Honghai, editor, Zou, Jun, editor, Yin, Zhouping, editor, Liu, Lianqing, editor, Yang, Geng, editor, Ouyang, Xiaoping, editor, and Wang, Zhiyong, editor
- Published
- 2023
- Full Text
- View/download PDF
26. Expeditious Object Pose Estimation for Autonomous Robotic Grasping
- Author
Deevi, Sri Aditya, Mishra, Deepak, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Gupta, Deep, editor, Bhurchandi, Kishor, editor, Murala, Subrahmanyam, editor, Raman, Balasubramanian, editor, and Kumar, Sanjeev, editor
- Published
- 2023
- Full Text
- View/download PDF
27. 6D Object Pose Estimation Based on Cross-Modality Feature Fusion.
- Author
Jiang, Meng, Zhang, Liming, Wang, Xiaohua, Li, Shuang, and Jiao, Yijie
- Subjects
OBJECT tracking (Computer vision), WEIGHT training, CONVOLUTIONAL neural networks, POINT cloud
- Abstract
The 6D pose estimation using RGBD images plays a pivotal role in robotics applications. At present, after obtaining the RGB and depth modality information, most methods directly concatenate them without considering information interactions. This leads to the low accuracy of 6D pose estimation in occlusion and illumination changes. To solve this problem, we propose a new method to fuse RGB and depth modality features. Our method effectively uses individual information contained within each RGBD image modality and fully integrates cross-modality interactive information. Specifically, we transform depth images into point clouds, applying the PointNet++ network to extract point cloud features; RGB image features are extracted by CNNs and attention mechanisms are added to obtain context information within the single modality; then, we propose a cross-modality feature fusion module (CFFM) to obtain the cross-modality information, and introduce a feature contribution weight training module (CWTM) to allocate the different contributions of the two modalities to the target task. Finally, the result of 6D object pose estimation is obtained by the final cross-modality fusion feature. By enabling information interactions within and between modalities, the integration of the two modalities is maximized. Furthermore, considering the contribution of each modality enhances the overall robustness of the model. Our experiments indicate that the accuracy rate of our method on the LineMOD dataset can reach 96.9%, on average, using the ADD (-S) metric, while on the YCB-Video dataset, it can reach 94.7% using the ADD-S AUC metric and 96.5% using the ADD-S score (<2 cm) metric. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
28. OHO: A Multi-Modal, Multi-Purpose Dataset for Human-Robot Object Hand-Over.
- Author
Stephan, Benedict, Köhler, Mona, Müller, Steffen, Zhang, Yan, Gross, Horst-Michael, and Notni, Gunther
- Subjects
ROBOT hands, THERMOGRAPHY, MACHINE learning, PROBLEM solving, POINT cloud, ROBOTICS
- Abstract
In the context of collaborative robotics, handing over hand-held objects to a robot is a safety-critical task. Therefore, a robust distinction between human hands and presented objects in image data is essential to avoid contact with robotic grippers. To be able to develop machine learning methods for solving this problem, we created the OHO (Object Hand-Over) dataset of tools and other everyday objects being held by human hands. Our dataset consists of color, depth, and thermal images with the addition of pose and shape information about the objects in a real-world scenario. Although the focus of this paper is on instance segmentation, our dataset also enables training for different tasks such as 3D pose estimation or shape estimation of objects. For the instance segmentation task, we present a pipeline for automated label generation in point clouds, as well as image data. Through baseline experiments, we show that these labels are suitable for training an instance segmentation to distinguish hands from objects on a per-pixel basis. Moreover, we present qualitative results for applying our trained model in a real-world application. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. Weakly Supervised Pose Estimation of Surgical Instrument from a Single Endoscopic Image
- Author
Lihua Hu, Shida Feng, and Bo Wang
- Subjects
6D pose estimation, weakly supervised learning, back-propagatable PnP, Chemical technology, TP1-1185
- Abstract
Instrument pose estimation is a key demand in computer-aided surgery, and its main challenges lie in two aspects: Firstly, the difficulty of obtaining stable corresponding image feature points due to the instruments’ high refraction and complicated background, and secondly, the lack of labeled pose data. This study aims to tackle the pose estimation problem of surgical instruments in the current endoscope system using a single endoscopic image. More specifically, a weakly supervised method based on the instrument’s image segmentation contour is proposed, with the effective assistance of synthesized endoscopic images. Our method consists of the following three modules: a segmentation module to automatically detect the instrument in the input image, followed by a point inference module to predict the image locations of the implicit feature points of the instrument, and a point back-propagatable Perspective-n-Point module to estimate the pose from the tentative 2D–3D corresponding points. To alleviate the over-reliance on point correspondence accuracy, the local errors of feature point matching and the global inconsistency of the corresponding contours are simultaneously minimized. Our proposed method is validated with both real and synthetic images in comparison with the current state-of-the-art methods.
- Published
- 2024
- Full Text
- View/download PDF
30. 6IMPOSE: bridging the reality gap in 6D pose estimation for robotic grasping
- Author
Hongpeng Cao, Lukas Dirnberger, Daniele Bernardini, Cristina Piazza, and Marco Caccamo
- Subjects
6D pose estimation, RGBD image, synthetic data, robotic grasping, Sim2real, Mechanical engineering and machinery, TJ1-1570, Electronic computers. Computer science, QA75.5-76.95
- Abstract
6D pose recognition has been a crucial factor in the success of robotic grasping, and recent deep learning based approaches have achieved remarkable results on benchmarks. However, their generalization capabilities in real-world applications remain unclear. To overcome this gap, we introduce 6IMPOSE, a novel framework for sim-to-real data generation and 6D pose estimation. 6IMPOSE consists of four modules: First, a data generation pipeline that employs the 3D software suite Blender to create synthetic RGBD image datasets with 6D pose annotations. Second, an annotated RGBD dataset of five household objects was generated using the proposed pipeline. Third, a real-time two-stage 6D pose estimation approach that integrates the object detector YOLO-V4 and a streamlined, real-time version of the 6D pose estimation algorithm PVN3D optimized for time-sensitive robotics applications. Fourth, a codebase designed to facilitate the integration of the vision system into a robotic grasping experiment. Our approach demonstrates the efficient generation of large amounts of photo-realistic RGBD images and the successful transfer of the trained inference model to robotic grasping experiments, achieving an overall success rate of 87% in grasping five different household objects from cluttered backgrounds under varying lighting conditions. This is made possible by fine-tuning data generation and domain randomization techniques and optimizing the inference pipeline, overcoming the generalization and performance shortcomings of the original PVN3D algorithm. Finally, we make the code, synthetic dataset, and all the pre-trained models available on GitHub.
- Published
- 2023
- Full Text
- View/download PDF
31. IPPE-PCR: a novel 6D pose estimation method based on point cloud repair for texture-less and occluded industrial parts.
- Author
Qin, Wei, Hu, Qing, Zhuang, Zilong, Huang, Haozhe, Zhu, Xiaodan, and Han, Lin
- Subjects
POINT cloud, REPAIRING, EVERYDAY life
- Abstract
Fast and accurate 6D pose estimation can help a robot arm grab industrial parts efficiently. Previous 6D pose estimation algorithms mostly target common items in daily life; few are aimed at texture-less and occluded industrial parts, and there are few industrial parts datasets. A novel method called the Industrial Parts 6D Pose Estimation framework based on point cloud repair (IPPE-PCR) is proposed in this paper. A synthetic dataset of industrial parts (SD-IP) is established as the training set for IPPE-PCR, and an annotated real-world, low-texture and occluded dataset of industrial parts (LTO-IP) is constructed as the test set. To improve the estimation accuracy, a new loss function is used for the point cloud repair network and an improved ICP method is proposed to optimize template matching. The experiment result shows that IPPE-PCR performs better than the state-of-the-art algorithms on LTO-IP. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
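IPPE-PCR (entry 31) optimizes template matching with an improved ICP. The vanilla point-to-point ICP refinement that such improvements start from looks roughly like this in Open3D; src_model and dst_scene are assumed to be open3d.geometry.PointCloud objects and T_init a coarse 4x4 pose.

```python
import open3d as o3d

def refine_pose_icp(src_model, dst_scene, T_init, max_dist=0.01):
    """Refine a coarse 6D pose estimate with point-to-point ICP."""
    reg = o3d.pipelines.registration.registration_icp(
        src_model, dst_scene, max_dist, T_init,
        o3d.pipelines.registration.TransformationEstimationPointToPoint())
    return reg.transformation  # refined 4x4 object-to-scene transform
```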
32. 6D Object Pose Estimation Using a Particle Filter With Better Initialization
- Author
Gijae Lee, Jun-Sik Kim, Seungryong Kim, and Kanggeon Kim
- Subjects
6D pose estimation, centroid prediction network, particle filter, robotic grasping, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Estimation of 6D object poses is a key issue in robotic grasping tasks. Recently, many high-performance learning-based methods have been introduced using robust deep learning techniques; however, applying these methods to real robot environments requires many ground truth 6D pose annotations for training. To address this problem, we propose a template matching-based particle filter approach for 6D pose estimation; the proposed method does not require ground truth 6D poses. Although particle filter approaches can stochastically avoid local optima, they require adequate initial pose hypotheses for estimating an accurate 6D object pose. Therefore, we estimated an initial translation of the target object for accurately initializing a particle filter by developing a new deep network. Once the proposed centroid prediction network (CPN) is trained with a specific dataset, no additional training is required for new objects not in the dataset. We evaluated the performance of the CPN and the proposed 6D pose estimation method on benchmark datasets, which demonstrated that the CPN can predict the centroid for any object, including those not in the training data, and that our 6D pose estimation method outperforms existing methods for partially occluded objects. Finally, we tested a grasping task based on our proposed method using a real robot platform to demonstrate an application of our method to a downstream task. This experiment shows that our method can be applied to part assembly, bin picking, and object manipulation without large training datasets with 6D pose annotations. The code and models are available at: https://github.com/oorrppp2/Particle_filter_approach_6D_pose_estimation.
- Published
- 2023
- Full Text
- View/download PDF
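Entry 32 wraps template matching in a particle filter initialized by the proposed CPN. A generic predict-weight-resample cycle over pose particles, independent of that paper's scoring function, can be sketched as follows; the shared noise scale and the 6-vector pose parameterization are illustrative simplifications.

```python
import numpy as np

def particle_filter_step(particles, weights, score_fn, noise=0.005, rng=np.random):
    """One cycle over (N, 6) pose particles [tx, ty, tz, rx, ry, rz] (axis-angle)."""
    # Predict: diffuse particles (a real tracker would scale rotation noise separately).
    particles = particles + rng.normal(0.0, noise, particles.shape)
    # Weight: score_fn measures agreement between a hypothesised pose and the observation.
    weights = np.maximum(np.array([score_fn(p) for p in particles]), 1e-12)
    weights /= weights.sum()
    # Systematic resampling concentrates particles on likely poses.
    positions = (rng.random() + np.arange(len(weights))) / len(weights)
    idx = np.searchsorted(np.cumsum(weights), positions)
    return particles[idx], np.full(len(weights), 1.0 / len(weights))
```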
33. Attention Guided 6D Object Pose Estimation with Multi-constraints Voting Network
- Author
Zuo, Guoyu, Gu, Zonghan, Huang, Gao, Gong, Daoxiong, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Ronzhin, Andrey, editor, Meshcheryakov, Roman, editor, and Xiantong, Zhen, editor
- Published
- 2022
- Full Text
- View/download PDF
34. ShAPO: Implicit Representations for Multi-object Shape, Appearance, and Pose Optimization
- Author
Irshad, Muhammad Zubair, Zakharov, Sergey, Ambrus, Rares, Kollar, Thomas, Kira, Zsolt, Gaidon, Adrien, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
35. DProST: Dynamic Projective Spatial Transformer Network for 6D Pose Estimation
- Author
Park, Jaewoo, Cho, Nam Ik, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
36. Category-Level 6D Object Pose and Size Estimation Using Self-supervised Deep Prior Deformation Networks
- Author
Lin, Jiehong, Wei, Zewei, Ding, Changxing, Jia, Kui, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
37. DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation
- Author
Wen, Yilin, Li, Xiangyu, Pan, Hao, Yang, Lei, Wang, Zheng, Komura, Taku, Wang, Wenping, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
38. DCL-Net: Deep Correspondence Learning Network for 6D Pose Estimation
- Author
Li, Hongyang, Lin, Jiehong, Jia, Kui, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Avidan, Shai, editor, Brostow, Gabriel, editor, Cissé, Moustapha, editor, Farinella, Giovanni Maria, editor, and Hassner, Tal, editor
- Published
- 2022
- Full Text
- View/download PDF
39. A Method for Robust Object Recognition and Pose Estimation of Rigid Body Based on Point Cloud
- Author
Zhao, Guiyu, Ma, Hongbin, Jin, Ying, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Liu, Honghai, editor, Yin, Zhouping, editor, Liu, Lianqing, editor, Jiang, Li, editor, Gu, Guoying, editor, Wu, Xinyu, editor, and Ren, Weihong, editor
- Published
- 2022
- Full Text
- View/download PDF
40. KVNet: An iterative 3D keypoints voting network for real-time 6-DoF object pose estimation.
- Author
Wang, Fei, Zhang, Xing, Chen, Tianyue, Shen, Ze, Liu, Shangdong, and He, Zhenquan
- Subjects
VOTING, VECTOR spaces, HOUGH transforms, AUGMENTED reality, CONTINUOUS functions
- Abstract
Accurate and efficient object pose estimation is an indispensable part of virtual/augmented reality (VR/AR) and many other applications. While previous works focus on directly regressing the 6D pose from RGB and depth images and thus suffer from the non-linearity of rotation space, we propose an iterative 3D keypoints voting network, named KVNet. Specifically, our method decouples the pose into separate translation and rotation branches, both estimated by a Hough voting scheme. By treating the uncertainty of keypoints' votes as a Lipschitz-continuous function of the seed points' fused embedding features, our method is able to adaptively select the optimal keypoint votes. In this way, we argue that KVNet bridges the gap between the non-linear rotation space and linear Euclidean space, which introduces an inductive bias for our network to learn the intrinsic pattern and infer the 6D pose from RGB and depth images. Furthermore, our model refines the initial keypoint localization in an iterative fashion. Experiments show that across three challenging benchmark datasets (LineMOD, YCB-Video and Occlusion LineMOD), our method exhibits excellent performance. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
41. 6D-ViCuT: Six degree-of-freedom visual cuboid tracking dataset for manual packing of cargo in warehouses
- Author
Guillermo A. Camacho-Muñoz, Juan Camilo Martínez Franco, Sandra Esperanza Nope-Rodríguez, Humberto Loaiza-Correa, Sebastián Gil-Parga, and David Álvarez-Martínez
- Subjects
Intralogistics, Industrial metaverse, Packing of cargo, Point clouds, RGBD images, 6D pose estimation, Computer applications to medicine. Medical informatics, R858-859.7, Science (General), Q1-390
- Abstract
Visual tracking of objects is a fundamental technology for Industry 4.0, allowing the integration of digital content and real-world objects. The industrial operation known as manual cargo packing can benefit from visual tracking of objects, yet no dataset exists to evaluate visual tracking algorithms in manual packing scenarios. To close this gap, this article presents 6D-ViCuT, a dataset of images and 6D pose ground truth of cuboids in a manual packing operation in intralogistics. The initial release of the dataset comprises 28 sessions acquired in a space that recreates a manual packing zone: indoors, an area of (6 × 4 × 2) m³, and warehouse illumination. The data acquisition experiment involves capturing images from fixed and mobile RGBD devices and a motion capture system while an operator performs a manual packing operation. Each session contains between 6 and 18 boxes from an available set of 10 types, with each type varying in height, width, depth, and texture. Each session lasts between 1 and 5 minutes and exhibits differences in operator speed and box type (texture, size heterogeneity, occlusion).
- Published
- 2023
- Full Text
- View/download PDF
42. Vision-Guided Object Recognition and 6D Pose Estimation System Based on Deep Neural Network for Unmanned Aerial Vehicles towards Intelligent Logistics.
- Author
Luo, Sijin, Liang, Yu, Luo, Zhehao, Liang, Guoyuan, Wang, Can, and Wu, Xinyu
- Subjects
ARTIFICIAL neural networks, POSE estimation (Computer vision), OBJECT recognition (Computer vision), DRONE aircraft, HOUSEHOLD employees, GEOSTATIONARY satellites, WORKFLOW, LABOR costs, THREE-dimensional imaging
- Abstract
Unmanned aerial vehicle (UAV) express delivery is facing a period of rapid development and continues to promote the aviation logistics industry due to its advantages of elevated delivery efficiency and low labor costs. Automatic detection, localization, and estimation of 6D poses of targets in dynamic environments are key prerequisites for UAV intelligent logistics. In this study, we proposed a novel vision system based on deep neural networks to locate targets and estimate their 6D pose parameters from 2D color images and 3D point clouds captured by an RGB-D sensor mounted on a UAV. The workflow of this system can be summarized as follows: detect the targets and locate them, separate the object region from the background using a segmentation network, and estimate the 6D pose parameters from a regression network. The proposed system provides a solid foundation for various complex operations for UAVs. To better verify the performance of the proposed system, we built a small dataset called SIAT comprising common household objects. Comparative experiments with several state-of-the-art networks on the YCB-Video dataset and SIAT dataset verified the effectiveness, robustness, and superior performance of the proposed method, indicating its promising applications in UAV-based delivery tasks. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
43. A 6D Pose Estimation Algorithm Based on Local Feature Representation (局部特征表征的6D位姿估计算法).
- Author
王晨露, 陈立家, 李糰, 范贤博俊, 王敏, 连晨轩, 王赞, and 刘名果
- Subjects
SINGULAR value decomposition, PROBLEM solving, ALGORITHMS, MACHINE learning, POSE estimation (Computer vision)
- Abstract
In order to solve the problem of low accuracy of 6D pose estimation for textured models under occlusion, this paper proposes an end-to-end 6D pose estimation algorithm based on local feature representation. Firstly, this paper proposes a Spatial and Coordinate Attention mechanism to obtain accurate localization information. A YOLOv5-CBE detection network is formed by adding the attention mechanism to the backbone network and introducing a weighted Bidirectional Feature Pyramid Network in the detection layer. The Precision, Recall, and mAP@0.5 of the YOLOv5-CBE algorithm rise by 3.6%, 2.8%, and 2.5%, respectively, and the coordinate error of the local feature center point decreases by up to 25%. Secondly, the YOLOv5-CBE network detects the local feature key points, and the model's 6D pose is calculated from 3D Harris key points by Singular Value Decomposition; the algorithm maintains 2D reprojection accuracy and ADD accuracy above 95% under 70% occlusion, demonstrating strong robustness. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
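Entry 43 reports 2D reprojection accuracy alongside ADD. The reprojection error behind that metric can be sketched with OpenCV; in the literature a pose is commonly accepted when the mean error stays below 5 px.

```python
import numpy as np
import cv2

def reprojection_error(model_pts, R_gt, t_gt, R_pred, t_pred, K):
    """Mean pixel distance between model points projected under two poses."""
    def project(R, t):
        pts, _ = cv2.projectPoints(model_pts, cv2.Rodrigues(R)[0], t, K, None)
        return pts.reshape(-1, 2)
    return np.mean(np.linalg.norm(project(R_gt, t_gt) - project(R_pred, t_pred), axis=1))
```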
44. Cross-Attention-Based Reflection-Aware 6D Pose Estimation Network for Non-Lambertian Objects from RGB Images.
- Author
Wu, Chenrui, Chen, Long, and Wu, Shiqing
- Subjects
THREE-dimensional imaging, POSE estimation (Computer vision)
- Abstract
Six-dimensional pose estimation for non-Lambertian objects, such as metal parts, is essential in intelligent manufacturing. Current methods pay much less attention to the influence of the surface reflection problem in 6D pose estimation. In this paper, we propose a cross-attention-based reflection-aware 6D pose estimation network (CAR6D) for solving the surface reflection problem in 6D pose estimation. We use a pseudo-Siamese network structure to extract features from both an RGB image and a 3D model. The cross-attention layers are designed as a bi-directional filter for each of the inputs (the RGB image and 3D model) to focus on calculating the correspondences of the objects. The network is trained to segment the reflection area from the object area. Training images with ground-truth labels of the reflection area are generated with a physical-based rendering method. The experimental results on a 6D dataset of metal parts demonstrate the superiority of CAR6D in comparison with other state-of-the-art models. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
45. Efficient object recognition under cluttered scenes via descriptor-based matching and single point voting.
- Author
He, Xiaoge, Liu, Yuanpeng, Zhou, Jun, Zhang, Yuqi, and Wang, Jun
- Subjects
POINT cloud, SURFACE structure, POINT set theory, VOTING, HISTOGRAMS, DESCRIPTOR systems
- Abstract
This paper addresses the problem of recognizing multiple objects and multiple instances from point clouds. Whereas existing methods utilize descriptors on 3D fields or pointwise voting to achieve this task, our framework takes advantage of both descriptor-based and voting-based schemes to realize more robust and efficient prediction. Specifically, we propose a novel and robust descriptor called an orientation-enhanced fast point feature histogram (OE-FPFH) to describe points in both the object model and scene, and further to build the correspondence set. The OE-FPFH integrates an orientation vector through mining the geometric tensor of the local structure of a surface point, which is more representative than the original FPFH descriptor. To improve voting efficiency, we devise a novel single-point voting mechanism (SPVM), which constructs a unique local reference frame (LRF) on a single point using the orientation vector. The SPVM takes as input the corresponding point set and can generate a pose candidate for each correspondence. The process is realized by matching LRFs from two corresponding points. All pose candidates are subsequently divided into clusters and aggregated using the K-means clustering algorithm to deduce the poses for different objects or instances in the scene. Experiments on three challenging datasets demonstrate that our method is effective, efficient, and robust to occlusions and multiple instances.
• Addresses the problem of recognizing multiple objects and multiple instances from point clouds.
• Designs a new robust descriptor, the orientation-enhanced fast point feature histogram (OE-FPFH).
• Proposes a novel single-point voting mechanism (SPVM) that uses orientation vectors.
• Combined with the advantages of feature matching, the anti-interference ability and robustness of the pose voting method are enhanced. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
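Entry 45 builds its OE-FPFH on the classic FPFH descriptor. For orientation, here is what a stock FPFH-based coarse alignment looks like with Open3D (assumed ≥0.15 API); the paper's actual contribution, the added orientation vector and single-point voting, is not reproduced here.

```python
import open3d as o3d

def fpfh(pcd, voxel=0.005):
    """Downsample, estimate normals, and compute FPFH descriptors."""
    p = pcd.voxel_down_sample(voxel)
    p.estimate_normals(o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 2, max_nn=30))
    f = o3d.pipelines.registration.compute_fpfh_feature(
        p, o3d.geometry.KDTreeSearchParamHybrid(radius=voxel * 5, max_nn=100))
    return p, f

def coarse_pose(model, scene, voxel=0.005):
    """RANSAC registration on FPFH correspondences -> coarse 4x4 object pose."""
    src, src_f = fpfh(model, voxel)
    dst, dst_f = fpfh(scene, voxel)
    result = o3d.pipelines.registration.registration_ransac_based_on_feature_matching(
        src, dst, src_f, dst_f, True, voxel * 1.5,
        o3d.pipelines.registration.TransformationEstimationPointToPoint(False),
        3, [], o3d.pipelines.registration.RANSACConvergenceCriteria(100000, 0.999))
    return result.transformation
```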
46. 6D Object Pose Estimation Based on Cross-Modality Feature Fusion
- Author
Meng Jiang, Liming Zhang, Xiaohua Wang, Shuang Li, and Yijie Jiao
- Subjects
6D pose estimation, RGB and depth modality fusion, attention mechanism, Chemical technology, TP1-1185
- Abstract
The 6D pose estimation using RGBD images plays a pivotal role in robotics applications. At present, after obtaining the RGB and depth modality information, most methods directly concatenate them without considering information interactions. This leads to the low accuracy of 6D pose estimation in occlusion and illumination changes. To solve this problem, we propose a new method to fuse RGB and depth modality features. Our method effectively uses individual information contained within each RGBD image modality and fully integrates cross-modality interactive information. Specifically, we transform depth images into point clouds, applying the PointNet++ network to extract point cloud features; RGB image features are extracted by CNNs and attention mechanisms are added to obtain context information within the single modality; then, we propose a cross-modality feature fusion module (CFFM) to obtain the cross-modality information, and introduce a feature contribution weight training module (CWTM) to allocate the different contributions of the two modalities to the target task. Finally, the result of 6D object pose estimation is obtained by the final cross-modality fusion feature. By enabling information interactions within and between modalities, the integration of the two modalities is maximized. Furthermore, considering the contribution of each modality enhances the overall robustness of the model. Our experiments indicate that the accuracy rate of our method on the LineMOD dataset can reach 96.9%, on average, using the ADD (-S) metric, while on the YCB-Video dataset, it can reach 94.7% using the ADD-S AUC metric and 96.5% using the ADD-S score (<2 cm) metric.
- Published
- 2023
- Full Text
- View/download PDF
47. OHO: A Multi-Modal, Multi-Purpose Dataset for Human-Robot Object Hand-Over
- Author
Benedict Stephan, Mona Köhler, Steffen Müller, Yan Zhang, Horst-Michael Gross, and Gunther Notni
- Subjects
dataset, thermal image, semantic segmentation, hand-over, 6D pose estimation, automated labeling, Chemical technology, TP1-1185
- Abstract
In the context of collaborative robotics, handing over hand-held objects to a robot is a safety-critical task. Therefore, a robust distinction between human hands and presented objects in image data is essential to avoid contact with robotic grippers. To be able to develop machine learning methods for solving this problem, we created the OHO (Object Hand-Over) dataset of tools and other everyday objects being held by human hands. Our dataset consists of color, depth, and thermal images with the addition of pose and shape information about the objects in a real-world scenario. Although the focus of this paper is on instance segmentation, our dataset also enables training for different tasks such as 3D pose estimation or shape estimation of objects. For the instance segmentation task, we present a pipeline for automated label generation in point clouds, as well as image data. Through baseline experiments, we show that these labels are suitable for training an instance segmentation to distinguish hands from objects on a per-pixel basis. Moreover, we present qualitative results for applying our trained model in a real-world application.
- Published
- 2023
- Full Text
- View/download PDF
48. Uncooperative Satellite 6D Pose Estimation with Relative Depth Information
- Author
Song, Jingrui, Hao, Shuling, Xu, Kefeng, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Bebis, George, editor, Athitsos, Vassilis, editor, Yan, Tong, editor, Lau, Manfred, editor, Li, Frederick, editor, Shi, Conglei, editor, Yuan, Xiaoru, editor, Mousas, Christos, editor, and Bruder, Gerd, editor
- Published
- 2021
- Full Text
- View/download PDF
49. Human Pose Estimation in UAV-Human Workspace
- Author
Wang, Ju, Choi, Wookjin, Shtau, Igor, Ferro, Tyler, Wu, Zhenhua, Trott, Curtrell, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Stephanidis, Constantine, editor, Kurosu, Masaaki, editor, Chen, Jessie Y. C., editor, Fragomeni, Gino, editor, Streitz, Norbert, editor, Konomi, Shin'ichi, editor, Degen, Helmut, editor, and Ntoa, Stavroula, editor
- Published
- 2021
- Full Text
- View/download PDF
50. 6D Pose Estimation Based on the Adaptive Weight of RGB-D Feature
- Author
Zhang, Gengshen, Ning, Li, Feng, Liangbing, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Zhang, Yong, editor, Xu, Yicheng, editor, and Tian, Hui, editor
- Published
- 2021
- Full Text
- View/download PDF