1. STREET: A Multi-Task Structured Reasoning and Explanation Benchmark
- Author
- Ribeiro, Danilo; Wang, Shen; Ma, Xiaofei; Zhu, Henry; Dong, Rui; Kong, Deguang; Burger, Juliette; Ramos, Anjelica; Wang, William; Huang, Zhiheng; Karypis, George; Xiang, Bing; and Roth, Dan
- Abstract
We introduce STREET, a unified multi-task and multi-domain natural language reasoning and explanation benchmark. Unlike most existing question-answering (QA) datasets, we expect models to not only answer questions, but also produce step-by-step structured explanations describing how premises in the question are used to produce intermediate conclusions that can prove the correctness of a certain answer. We perform extensive evaluation with popular language models such as few-shot prompting GPT-3 and fine-tuned T5. We find that these models still lag behind human performance when producing such structured reasoning steps. We believe this work will provide a way for the community to better train and test systems on multi-step reasoning and explanations in natural language.
- Comment
- Published in ICLR 2023
- Published
- 2023