
Multitask Prompted Training Enables Zero-Shot Task Generalization

Authors :
Sanh, Victor
Webson, Albert
Raffel, Colin
Bach, Stephen H.
Sutawika, Lintang
Alyafeai, Zaid
Chaffin, Antoine
Stiegler, Arnaud
Scao, Teven Le
Raja, Arun
Dey, Manan
Bari, M Saiful
Xu, Canwen
Thakker, Urmish
Sharma, Shanya
Szczechla, Eliza
Kim, Taewoon
Chhablani, Gunjan
Nayak, Nihal
Datta, Debajyoti
Chang, Jonathan
Jiang, Mike Tian-Jian
Wang, Han
Manica, Matteo
Shen, Sheng
Yong, Zheng Xin
Pandey, Harshit
Bawden, Rachel
Wang, Thomas
Neeraj, Trishala
Rozen, Jos
Sharma, Abheesht
Santilli, Andrea
Fevry, Thibault
Fries, Jason Alan
Teehan, Ryan
Bers, Tali
Biderman, Stella
Gao, Leo
Wolf, Thomas
Rush, Alexander M.
Publication Year :
2021

Abstract

Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that this is a consequence of implicit multitask learning in language models' pretraining (Radford et al., 2019). Can zero-shot generalization instead be directly induced by explicit multitask learning? To test this question at scale, we develop a system for easily mapping any natural language tasks into a human-readable prompted form. We convert a large set of supervised datasets, each with multiple prompts with diverse wording. These prompted datasets allow for benchmarking the ability of a model to perform completely held-out tasks. We fine-tune a pretrained encoder-decoder model (Raffel et al., 2020; Lester et al., 2021) on this multitask mixture covering a wide variety of tasks. The model attains strong zero-shot performance on several standard datasets, often outperforming models up to 16x its size. Further, our approach attains strong performance on a subset of tasks from the BIG-bench benchmark, outperforming models up to 6x its size. All trained models are available at https://github.com/bigscience-workshop/t-zero and all prompts are available at https://github.com/bigscience-workshop/promptsource.

Comment: ICLR 2022 Spotlight (with extended discussion)
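
The sketch below illustrates, in broad strokes, how the released artifacts might be combined: a prompt template from the promptsource library converts a raw supervised example into natural-language input/target text, and a released T0 checkpoint answers the prompted input zero-shot. The specific checkpoint name (bigscience/T0_3B), the dataset/template choice, and the toy example are illustrative assumptions, not details taken from the abstract above.

```python
# Minimal sketch, assuming the bigscience/T0_3B checkpoint on the Hugging Face Hub
# and the promptsource DatasetTemplates API; dataset, template, and example are
# illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from promptsource.templates import DatasetTemplates

# Load a released T0 checkpoint (the smaller 3B variant is assumed here).
tokenizer = AutoTokenizer.from_pretrained("bigscience/T0_3B")
model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/T0_3B")

# Pick one of the crowd-sourced prompt templates for an NLI-style task.
templates = DatasetTemplates("super_glue", "rte")
template = templates[templates.all_template_names[0]]

# Apply the template to a raw example to get a natural-language input/target pair.
example = {"premise": "A dog is running in the park.",
           "hypothesis": "An animal is outside.",
           "label": 0}
prompted_input, target = template.apply(example)

# Zero-shot inference: the model answers the prompted input directly.
inputs = tokenizer(prompted_input, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because every training task is expressed in the same free-form prompted format, inference on a held-out task requires nothing beyond writing (or reusing) a template such as the one above.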

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2110.08207
Document Type :
Working Paper