LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models

Authors :
Guha, Neel
Nyarko, Julian
Ho, Daniel E.
Ré, Christopher
Chilton, Adam
Narayana, Aditya
Chohlas-Wood, Alex
Peters, Austin
Waldon, Brandon
Rockmore, Daniel N.
Zambrano, Diego
Talisman, Dmitry
Hoque, Enam
Surani, Faiz
Fagan, Frank
Sarfaty, Galit
Dickinson, Gregory M.
Porat, Haggai
Hegland, Jason
Wu, Jessica
Nudell, Joe
Niklaus, Joel
Nay, John
Choi, Jonathan H.
Tobia, Kevin
Hagan, Margaret
Ma, Megan
Livermore, Michael
Rasumov-Rahe, Nikon
Holzenberger, Nils
Kolt, Noam
Henderson, Peter
Rehaag, Sean
Goel, Sharad
Gao, Shang
Williams, Spencer
Gandhi, Sunny
Zur, Tom
Iyer, Varun
Li, Zehua
Publication Year :
2023

Abstract

The advent of large language models (LLMs) and their adoption by the legal community has given rise to the question: what types of legal reasoning can LLMs perform? To enable greater study of this question, we present LegalBench: a collaboratively constructed legal reasoning benchmark consisting of 162 tasks covering six different types of legal reasoning. LegalBench was built through an interdisciplinary process, in which we collected tasks designed and hand-crafted by legal professionals. Because these subject matter experts took a leading role in construction, tasks either measure legal reasoning capabilities that are practically useful, or measure reasoning skills that lawyers find interesting. To enable cross-disciplinary conversations about LLMs in the law, we additionally show how popular legal frameworks for describing legal reasoning -- which distinguish between its many forms -- correspond to LegalBench tasks, thus giving lawyers and LLM developers a common vocabulary. This paper describes LegalBench, presents an empirical evaluation of 20 open-source and commercial LLMs, and illustrates the types of research explorations LegalBench enables.

Comment: 143 pages, 79 tables, 4 figures

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2308.11462
Document Type :
Working Paper