
From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective

Authors :
Formal, Thibault
Lassance, Carlos
Piwowarski, Benjamin
Clinchant, Stéphane
Source :
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval.
Publication Year :
2022
Publisher :
ACM, 2022.

Abstract

Neural retrievers based on dense representations combined with Approximate Nearest Neighbors search have recently received a lot of attention, owing their success to distillation and/or better sampling of training examples, while still relying on the same backbone architecture. In the meantime, sparse representation learning, fueled by traditional inverted indexing techniques, has seen growing interest, inheriting desirable IR priors such as explicit lexical matching. While some architectural variants have been proposed, less effort has been put into the training of such models. In this work, we build on SPLADE, a sparse expansion-based retriever, and show to what extent it can benefit from the same training improvements as dense models, by studying the effect of distillation, hard-negative mining, and Pre-trained Language Model initialization. We further study the link between effectiveness and efficiency, in in-domain and zero-shot settings, leading to state-of-the-art results in both scenarios for sufficiently expressive models.

Comment: Accepted at SIGIR 2022 as a short paper (this work is the extension of SPLADE v2).
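For illustration only, the snippet below sketches the general idea of SPLADE-style sparse expansion scoring referenced in the abstract: masked-language-model logits are turned into log-saturated, non-negative term weights over the vocabulary, and query/document vectors are compared with a dot product. It is not the authors' implementation; the vocabulary size, sequence lengths, pooling choice, and the random tensors standing in for model outputs are assumptions made solely to keep the sketch self-contained and runnable.

    import torch

    # Minimal sketch of SPLADE-style term expansion (assumptions, not the paper's code).
    # A real model would produce per-token MLM logits over the vocabulary;
    # random tensors are used here as placeholders.
    vocab_size, q_len, d_len = 30522, 8, 12   # BERT-like vocabulary size (assumption)
    query_logits = torch.randn(q_len, vocab_size)
    doc_logits = torch.randn(d_len, vocab_size)

    def expand(mlm_logits: torch.Tensor) -> torch.Tensor:
        # Log-saturated ReLU of the logits, max-pooled over input tokens,
        # yielding one non-negative weight per vocabulary term.
        weights, _ = torch.max(torch.log1p(torch.relu(mlm_logits)), dim=0)
        return weights

    query_vec = expand(query_logits)
    doc_vec = expand(doc_logits)

    # Relevance score is a simple dot product between the expanded vectors.
    # With a trained model and a sparsity regularizer, most weights are zero,
    # so the vectors can be stored in an inverted index; with these random
    # placeholders the sparsity pattern is not meaningful.
    score = torch.dot(query_vec, doc_vec)
    print(f"non-zero terms in query: {(query_vec > 0).sum().item()}, score: {score.item():.3f}")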

Details

Database :
OpenAIRE
Journal :
Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
Accession number :
edsair.doi.dedup.....d6e9ca8949782a87cea81363f6a0a000
Full Text :
https://doi.org/10.1145/3477495.3531857