
On the duality between contrastive and non-contrastive self-supervised learning

Authors :
Garrido, Quentin
Chen, Yubei
Bardes, Adrien
Najman, Laurent
LeCun, Yann
Affiliations :
Facebook AI Research [Paris] (FAIR), Facebook
Laboratoire d'Informatique Gaspard-Monge (LIGM), École des Ponts ParisTech (ENPC), Centre National de la Recherche Scientifique (CNRS), Université Gustave Eiffel
Facebook AI Research [New York] (FAIR)
Models of visual object recognition and scene understanding (WILLOW), Inria de Paris
Département d'informatique - ENS Paris (DI-ENS), École normale supérieure - Paris (ENS-PSL), Université Paris sciences et lettres (PSL), Centre National de la Recherche Scientifique (CNRS), Institut National de Recherche en Informatique et en Automatique (Inria)
Courant Institute of Mathematical Sciences [New York] (CIMS), New York University (NYU)
Center for Data Science, New York University (NYU)
Source :
ICLR 2023 - Eleventh International Conference on Learning Representations, May 2023, Kigali, Rwanda. ⟨10.48550/arXiv.2206.02574⟩
Publication Year :
2022

Abstract

Recent approaches in self-supervised learning of image representations can be categorized into different families of methods and, in particular, can be divided into contrastive and non-contrastive approaches. While differences between the two families have been thoroughly discussed to motivate new approaches, we focus more on the theoretical similarities between them. By designing contrastive and covariance-based non-contrastive criteria that can be related algebraically and shown to be equivalent under limited assumptions, we show how close those families can be. We further study popular methods and introduce variations of them, allowing us to relate this theoretical result to current practices and to show the influence (or lack thereof) of design choices on downstream performance. Motivated by our equivalence result, we investigate the low performance of SimCLR and show how it can match VICReg's with careful hyperparameter tuning, improving significantly over known baselines. We also challenge the popular assumption that non-contrastive methods need large output dimensions. Our theoretical and quantitative results suggest that the numerical gaps between contrastive and non-contrastive methods in certain regimes can be closed given better network design choices and hyperparameter tuning. The evidence shows that unifying different state-of-the-art (SOTA) methods is an important direction toward a better understanding of self-supervised learning.
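
For readers unfamiliar with the two families compared above, the following is a minimal, generic PyTorch sketch of an InfoNCE-style contrastive loss (as used by SimCLR) and a VICReg-style variance/covariance loss. It is an illustrative assumption, not the algebraically related criteria derived in the paper; function names and loss weights are placeholders.

    import torch
    import torch.nn.functional as F

    def info_nce(z_a, z_b, temperature=0.1):
        """Contrastive: matching views are positives, all other pairs in the batch are negatives."""
        z_a = F.normalize(z_a, dim=1)
        z_b = F.normalize(z_b, dim=1)
        logits = z_a @ z_b.T / temperature                   # (N, N) similarity matrix
        labels = torch.arange(z_a.size(0), device=z_a.device)
        return F.cross_entropy(logits, labels)

    def vicreg_style(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0):
        """Non-contrastive: invariance between views plus variance and covariance regularization."""
        n, d = z_a.shape
        inv = F.mse_loss(z_a, z_b)                           # invariance term

        def var_cov(z):
            z = z - z.mean(dim=0)
            std = torch.sqrt(z.var(dim=0) + 1e-4)
            var = torch.mean(F.relu(1.0 - std))              # keep per-dimension std above 1
            cov = z.T @ z / (n - 1)                          # (d, d) covariance matrix
            off_diag = cov - torch.diag(torch.diag(cov))
            return var, off_diag.pow(2).sum() / d            # decorrelate output dimensions

        var_a, cov_a = var_cov(z_a)
        var_b, cov_b = var_cov(z_b)
        return sim_w * inv + var_w * (var_a + var_b) + cov_w * (cov_a + cov_b)

    # Example: embeddings of two augmented views of the same batch of images
    z1, z2 = torch.randn(256, 128), torch.randn(256, 128)
    loss_contrastive, loss_non_contrastive = info_nce(z1, z2), vicreg_style(z1, z2)

The contrastive loss compares sample embeddings against each other across the batch, while the non-contrastive loss only constrains the statistics (variance and covariance) of the embedding dimensions; the paper's contribution is to relate criteria of these two kinds algebraically.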

Details

Language :
English
Database :
OpenAIRE
Journal :
ICLR 2023 - Eleventh International Conference on Learning Representations, May 2023, Kigali, Rwanda. ⟨10.48550/arXiv.2206.02574⟩
Accession number :
edsair.doi.dedup.....4d9826e6ee50ef0aefe814b5d59d067f