
Measuring Generalisation to Unseen Viewpoints, Articulations, Shapes and Objects for 3D Hand Pose Estimation under Hand-Object Interaction

Authors :
Shreyas Hampali
Fu Xiong
Shipeng Xie
Zdenek Krnoul
Weiguo Zhou
Sijia Mei
Seungryul Baek
Zhaohui Zhang
Haifeng Sun
Guillermo Garcia-Hernando
Yang Xiao
Dongheui Lee
Angela Yao
Umar Iqbal
Mahdi Rad
Marek Hrúz
Adrian Spurr
Qingfu Wan
Zhiguo Cao
Junsong Yuan
Vincent Lepetit
Pavlo Molchanov
Romain Brégier
Pengfei Ren
Yunhui Liu
Shile Li
Philippe Weinzaepfel
Linlin Yang
Anil Armagan
Weiting Huang
Grégory Rogez
Boshen Zhang
Tae-Kyun Kim
Mingxiu Chen
Jakub Kanis
Source :
Computer Vision – ECCV 2020 ISBN: 9783030585914, ECCV (23)
Publication Year :
2020
Publisher :
arXiv, 2020.

Abstract

We study how well different types of approaches generalise in the task of 3D hand pose estimation under single-hand scenarios and hand-object interaction. We show that the accuracy of state-of-the-art methods can drop, and that they fail mostly on poses absent from the training set. Unfortunately, since the space of hand poses is high-dimensional, it is inherently infeasible to cover the whole space densely, despite recent efforts in collecting large-scale training datasets. This sampling problem is even more severe when hands interact with objects and/or inputs are RGB rather than depth images, as RGB images also vary with lighting conditions and colors. To address these issues, we designed a public challenge (HANDS'19) to evaluate the abilities of current 3D hand pose estimators (HPEs) to interpolate and extrapolate the poses of a training set. More precisely, HANDS'19 is designed (a) to evaluate the influence of both depth and color modalities on 3D hand pose estimation, under the presence or absence of objects; (b) to assess the generalisation abilities with respect to four main axes: shapes, articulations, viewpoints, and objects; (c) to explore the use of a synthetic hand model to fill the gaps of current datasets. Through the challenge, the overall accuracy has dramatically improved over the baseline, especially on extrapolation tasks, from 27 mm to 13 mm mean joint error. Our analyses highlight the impacts of data pre-processing, ensemble approaches, the use of a parametric 3D hand model (MANO), and different HPE methods/backbones.

Comment: European Conference on Computer Vision (ECCV), 2020
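The headline numbers (27 mm vs. 13 mm) are reported as mean joint error, i.e. the Euclidean distance between predicted and ground-truth 3D joint positions averaged over joints and frames. The following is a minimal sketch of that metric, not the challenge's official evaluation code, assuming predictions and ground truth are given as arrays of shape (num_frames, 21, 3) in millimetres:

import numpy as np

def mean_joint_error(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth 3D joints.

    pred, gt: arrays of shape (num_frames, num_joints, 3), in millimetres.
    Hypothetical re-implementation for illustration; not the official
    HANDS'19 evaluation script.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    per_joint = np.linalg.norm(pred - gt, axis=-1)  # (num_frames, num_joints)
    return per_joint.mean()

# Example with random data standing in for 21-joint hand predictions.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.uniform(-100, 100, size=(10, 21, 3))
    pred = gt + rng.normal(scale=5.0, size=gt.shape)
    print(f"mean joint error: {mean_joint_error(pred, gt):.2f} mm")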

Details

ISBN :
978-3-030-58591-4
Database :
OpenAIRE
Journal :
Computer Vision – ECCV 2020 ISBN: 9783030585914, ECCV (23)
Accession number :
edsair.doi.dedup.....334571a0b2fa3856d677209cb23e6322
Full Text :
https://doi.org/10.48550/arxiv.2003.13764