Author: "Delmas, Ginger" - Searchworks@Jio Institute Digital Library Search Results

Your search for "Delmas, Ginger" returned 8 results.

1. PoseEmbroider: Towards a 3D, Visual, Semantic-aware Human Pose Representation

Author: Delmas, Ginger, Weinzaepfel, Philippe, Moreno-Noguer, Francesc, and Rogez, Grégory
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Aligning multiple modalities in a latent space, such as images and texts, has been shown to produce powerful semantic visual representations, fueling tasks like image captioning, text-to-image generation, or image grounding. In the context of human-centric vision, although CLIP-like representations encode most standard human poses relatively well (such as standing or sitting), they lack sufficient acuteness to discern detailed or uncommon ones. While 3D human poses have often been associated with images (e.g. to perform pose estimation or pose-conditioned image generation), or more recently with text (e.g. for text-to-pose generation), they have seldom been paired with both. In this work, we combine 3D poses, pictures of people, and textual pose descriptions to produce an enhanced 3D-, visual- and semantic-aware human pose representation. We introduce a new transformer-based model, trained in a retrieval fashion, which can take as input any combination of the aforementioned modalities. When composing modalities, it outperforms a standard multi-modal alignment retrieval model, making it possible to sort out partial information (e.g. image with the lower body occluded). We showcase the potential of such an embroidered pose representation for (1) SMPL regression from image with optional text cue; and (2) the task of fine-grained instruction generation, which consists in generating a text that describes how to move from one 3D pose to another (as a fitness coach would). Unlike prior works, our model can take any kind of input (image and/or pose) without retraining.
Comment: Published in ECCV 2024
Published: 2024

2. PoseFix: Correcting 3D Human Poses with Natural Language

Author: Delmas, Ginger, Weinzaepfel, Philippe, Moreno-Noguer, Francesc, and Rogez, Grégory
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Automatically producing instructions to modify one's posture could open the door to endless applications, such as personalized coaching and in-home physical therapy. Tackling the reverse problem (i.e., refining a 3D pose based on some natural language feedback) could help with assisted 3D character animation or robot teaching, for instance. Although a few recent works explore the connections between natural language and 3D human pose, none focus on describing 3D body pose differences. In this paper, we tackle the problem of correcting 3D human poses with natural language. To this end, we introduce the PoseFix dataset, which consists of several thousand paired 3D poses and their corresponding text feedback, which describes how the source pose needs to be modified to obtain the target pose. We demonstrate the potential of this dataset on two tasks: (1) text-based pose editing, which aims at generating corrected 3D body poses given a query pose and a text modifier; and (2) correctional text generation, where instructions are generated based on the differences between two body poses.
Comment: Published in ICCV 2023
Published: 2023

3. PoseScript: Linking 3D Human Poses and Natural Language

Author: Delmas, Ginger, Weinzaepfel, Philippe, Lucas, Thomas, Moreno-Noguer, Francesc, and Rogez, Grégory
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: Natural language plays a critical role in many computer vision applications, such as image captioning, visual question answering, and cross-modal retrieval, to provide fine-grained semantic information. Unfortunately, while human pose is key to human understanding, current 3D human pose datasets lack detailed language descriptions. To address this issue, we have introduced the PoseScript dataset. This dataset pairs more than six thousand 3D human poses from AMASS with rich human-annotated descriptions of the body parts and their spatial relationships. Additionally, to increase the size of the dataset to a scale that is compatible with data-hungry learning algorithms, we have proposed an elaborate captioning process that generates automatic synthetic descriptions in natural language from given 3D keypoints. This process extracts low-level pose information, known as "posecodes", using a set of simple but generic rules on the 3D keypoints. These posecodes are then combined into higher level textual descriptions using syntactic rules. With automatic annotations, the amount of available data significantly scales up (100k), making it possible to effectively pretrain deep models for finetuning on human captions. To showcase the potential of annotated poses, we present three multi-modal learning tasks that utilize the PoseScript dataset. Firstly, we develop a pipeline that maps 3D poses and textual descriptions into a joint embedding space, allowing for cross-modal retrieval of relevant poses from large-scale datasets. Secondly, we establish a baseline for a text-conditioned model generating 3D poses. Thirdly, we present a learned process for generating pose descriptions. These applications demonstrate the versatility and usefulness of annotated poses in various tasks and pave the way for future research in the field.
Comment: TPAMI 2024, extended version of the ECCV 2022 paper
Published: 2022
Full Text: View/download PDF

4. ARTEMIS: Attention-based Retrieval with Text-Explicit Matching and Implicit Similarity

Author: Delmas, Ginger, de Rezende, Rafael Sampaio, Csurka, Gabriela, and Larlus, Diane
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Information Retrieval
Abstract: An intuitive way to search for images is to use queries composed of an example image and a complementary text. While the former provides rich and implicit context for the search, the latter explicitly calls for new traits, or specifies how some elements of the example image should be changed to retrieve the desired target image. Current approaches typically combine the features of each of the two elements of the query into a single representation, which can then be compared to those of the potential target images. Our work aims at shedding new light on the task by looking at it through the prism of two familiar and related frameworks: text-to-image and image-to-image retrieval. Taking inspiration from them, we exploit the specific relation of each query element with the targeted image and derive light-weight attention mechanisms that mediate between the two complementary modalities. We validate our approach on several retrieval benchmarks, querying with images and their associated free-form text modifiers. Our method obtains state-of-the-art results without resorting to side information, multi-level features, heavy pre-training, or large architectures as in previous works.
Comment: Published in ICLR 2022
Published: 2022

5. PoseScript: 3D Human Poses from Natural Language

Author: Delmas, Ginger, Weinzaepfel, Philippe, Lucas, Thomas, Moreno-Noguer, Francesc, and Rogez, Grégory
Editor: Avidan, Shai, Brostow, Gabriel, Cissé, Moustapha, Farinella, Giovanni Maria, and Hassner, Tal
Series Editors: Goos, Gerhard (Founding Editor), Hartmanis, Juris (Founding Editor), Bertino, Elisa (Editorial Board Member), Gao, Wen (Editorial Board Member), Steffen, Bernhard (Editorial Board Member), and Yung, Moti (Editorial Board Member)
Published: 2022
Full Text: View/download PDF

6. PoseFix: correcting 3D human poses with natural language

Author: Universitat Politècnica de Catalunya. Doctorat en Automàtica, Robòtica i Visió, Universitat Politècnica de Catalunya. Institut de Robòtica i Informàtica Industrial, CSIC-UPC, Universitat Politècnica de Catalunya. ROBiri - Grup de Percepció i Manipulació Robotitzada de l'IRI, Delmas, Ginger, Weinzaepfel, Philippe, Moreno-Noguer, Francesc, and Rogez, Grégory
Abstract: Automatically producing instructions to modify one’s posture could open the door to endless applications, such as personalized coaching and in-home physical therapy. Tackling the reverse problem (i.e., refining a 3D pose based on some natural language feedback) could help with assisted 3D character animation or robot teaching, for instance. Although a few recent works explore the connections between natural language and 3D human pose, none focus on describing 3D body pose differences. In this paper, we tackle the problem of correcting 3D human poses with natural language. To this end, we introduce the PoseFix dataset, which consists of several thousand paired 3D poses and their corresponding text feedback, which describes how the source pose needs to be modified to obtain the target pose. We demonstrate the potential of this dataset on two tasks: (1) text-based pose editing, which aims at generating corrected 3D body poses given a query pose and a text modifier; and (2) correctional text generation, where instructions are generated based on the differences between two body poses.
Comment: Peer Reviewed, Postprint (author's final draft)
Published: 2023

7. PoseScript: 3D human poses from natural language

Author: Delmas, Ginger Diana, Weinzaepfel, Philippe, Lucas, Thomas, Moreno-Noguer, Francesc, and Rogez, Grégory
Abstract: Natural language is leveraged in many computer vision tasks such as image captioning, cross-modal retrieval or visual question answering, to provide fine-grained semantic information. While human pose is key to human understanding, current 3D human pose datasets lack detailed language descriptions. In this work, we introduce the PoseScript dataset, which pairs a few thousand 3D human poses from AMASS with rich human-annotated descriptions of the body parts and their spatial relationships. To increase the size of this dataset to a scale compatible with typical data-hungry learning algorithms, we propose an elaborate captioning process that generates automatic synthetic descriptions in natural language from given 3D keypoints. This process extracts low-level pose information (the posecodes) using a set of simple but generic rules on the 3D keypoints. The posecodes are then combined into higher level textual descriptions using syntactic rules. Automatic annotations substantially increase the amount of available data, and make it possible to effectively pretrain deep models for finetuning on human captions. To demonstrate the potential of annotated poses, we show applications of the PoseScript dataset to retrieval of relevant poses from large-scale datasets and to synthetic pose generation, both based on a textual pose description.
Published: 2022

8. PoseScript: Linking 3D Human Poses and Natural Language.

Author: Delmas G, Weinzaepfel P, Lucas T, Moreno-Noguer F, and Rogez G
Abstract: Natural language plays a critical role in many computer vision applications, such as image captioning, visual question answering, and cross-modal retrieval, to provide fine-grained semantic information. Unfortunately, while human pose is key to human understanding, current 3D human pose datasets lack detailed language descriptions. To address this issue, we have introduced the PoseScript dataset. This dataset pairs more than six thousand 3D human poses from AMASS with rich human-annotated descriptions of the body parts and their spatial relationships. Additionally, to increase the size of the dataset to a scale that is compatible with data-hungry learning algorithms, we have proposed an elaborate captioning process that generates automatic synthetic descriptions in natural language from given 3D keypoints. This process extracts low-level pose information, known as "posecodes", using a set of simple but generic rules on the 3D keypoints. These posecodes are then combined into higher level textual descriptions using syntactic rules. With automatic annotations, the amount of available data significantly scales up (100k), making it possible to effectively pretrain deep models for finetuning on human captions. To showcase the potential of annotated poses, we present three multi-modal learning tasks that utilize the PoseScript dataset. Firstly, we develop a pipeline that maps 3D poses and textual descriptions into a joint embedding space, allowing for cross-modal retrieval of relevant poses from large-scale datasets. Secondly, we establish a baseline for a text-conditioned model generating 3D poses. Thirdly, we present a learned process for generating pose descriptions. These applications demonstrate the versatility and usefulness of annotated poses in various tasks and pave the way for future research in the field. The dataset is available at https://europe.naverlabs.com/research/computer-vision/posescript/.
Published: 2024
Full Text: View/download PDF
