Multimodal Foundation Models for Zero-shot Animal Species Recognition in Camera Trap Images

Authors :: Fabian, Zalan
Miao, Zhongqi
Li, Chunyuan
Zhang, Yuanhan
Liu, Ziwei
Hernández, Andrés
Montes-Rojas, Andrés
Escucha, Rafael
Siabatto, Laura
Link, Andrés
Arbeláez, Pablo
Dodhia, Rahul
Ferres, Juan Lavista
Publication Year :: 2023
Abstract: Due to deteriorating environmental conditions and increasing human activity, conservation efforts directed towards wildlife are crucial. Motion-activated camera traps constitute an efficient tool for tracking and monitoring wildlife populations across the globe. Supervised learning techniques have been successfully deployed to analyze such imagery; however, training such techniques requires annotations from experts. Reducing the reliance on costly labelled data therefore has immense potential in developing large-scale wildlife tracking solutions with markedly less human labor. In this work we propose WildMatch, a novel zero-shot species classification framework that leverages multimodal foundation models. In particular, we instruction-tune vision-language models to generate detailed visual descriptions of camera trap images using terminology similar to that used by experts. Then, we match the generated caption to an external knowledge base of descriptions in order to determine the species in a zero-shot manner. We investigate techniques to build instruction tuning datasets for detailed animal description generation and propose a novel knowledge augmentation technique to enhance caption quality. We demonstrate the performance of WildMatch on a new camera trap dataset collected in the Magdalena Medio region of Colombia.
Comment: 18 pages, 9 figures

Subjects :: Computer Science - Computer Vision and Pattern Recognition
Computer Science - Machine Learning

Details

Database :: arXiv
Publication Type :: Report
Accession number :: edsarx.2311.01064
Document Type :: Working Paper
