Antonio Agudo [0000-0001-6845-4998]; Francesc Moreno-Noguer [0000-0002-8640-684X]; Alberto Sanfeliu; Albert Pumarola; Vincent Lepetit; Lorenzo Porzi; Google; Ministerio de Economía y Competitividad (España); European Commission; Institut de Robòtica i Informàtica Industrial (IRI), Universitat Politècnica de Catalunya [Barcelona] (UPC)-Consejo Superior de Investigaciones Científicas [Spain] (CSIC); Institut de Recherche Interdisciplinaire [Villeneuve d'Ascq] (IRI), Université de Lille, Sciences et Technologies-Université de Lille, Droit et Santé-Centre National de la Recherche Scientifique (CNRS); Laboratoire Bordelais de Recherche en Informatique (LaBRI), Centre National de la Recherche Scientifique (CNRS)-École Nationale Supérieure d'Électronique, Informatique et Radiocommunications de Bordeaux (ENSEIRB)-Université Sciences et Technologies - Bordeaux 1-Université Bordeaux Segalen - Bordeaux 2; Universitat Politècnica de Catalunya. Departament d'Enginyeria de Sistemes, Automàtica i Informàtica Industrial; Universitat Politècnica de Catalunya. VIS - Visió Artificial i Sistemes Intel·ligents; Universitat Politècnica de Catalunya. ROBiri - Grup de Robòtica de l'IRI
Paper presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition, held in Salt Lake City (UT, USA), June 18 to 23, 2018.

We propose a method for predicting the 3D shape of a deformable surface from a single view. In contrast to previous approaches, we do not need a pre-registered template of the surface, and our method is robust to the lack of texture and to partial occlusions. At the core of our approach is a geometry-aware deep architecture that tackles the problem the way analytic solutions usually do: it first performs 2D detection of the mesh and then estimates a 3D shape that is geometrically consistent with the image. We train this architecture end-to-end on a large dataset of synthetic renderings of shapes under different levels of deformation, material properties, textures and lighting conditions. We evaluate our approach on a test split of this dataset and on available real benchmarks, consistently improving on state-of-the-art solutions with a significantly lower computational time.

This work is supported in part by a Google Faculty Research Award; by the Spanish Ministry of Science and Innovation under projects HuMoUR TIN2017-90086-R, ColRobTransp DPI2016-78957 and María de Maeztu Seal of Excellence MDM-2016-0656; and by the EU project AEROARMS ICT-2014-1-644271. We also thank Nvidia for the hardware donation.