Author: "Cardenas, Ronald" / Topic: 6121 languages - Searchworks@Jio Institute Digital Library Search Results

1. GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

Author: Gehrmann, Sebastian, Bhattacharjee, Abhik, Mahendiran, Abinaya, Wang, Alex, Papangelis, Alexandros, Madaan, Aman, McMillan-Major, Angelina, Shvets, Anna, Upadhyay, Ashish, Bohnet, Bernd, Yao, Bingsheng, Wilie, Bryan, Bhagavatula, Chandra, You, Chaobin, Thomson, Craig, Garbacea, Cristina, Wang, Dakuo, Deutsch, Daniel, Xiong, Deyi, Jin, Di, Gkatzia, Dimitra, Radev, Dragomir, Clark, Elizabeth, Durmus, Esin, Ladhak, Faisal, Ginter, Filip, Winata, Genta Indra, Strobelt, Hendrik, Novikova, Jekaterina, Kanerva, Jenna, Chim, Jenny, Zhou, Jiawei, Clive, Jordan, Maynez, Joshua, Sedoc, João, Juraska, Juraj, Dhole, Kaustubh, Raghavi Chandu, Khyathi, Perez-Beltrachini, Laura, Ribeiro, Leonardo F. R., Tunstall, Lewis, Zhang, Li, Pushkarna, Mahima, Creutz, Mathias, White, Michael, Sanjay Kale, Mihir, Kamal Eddine, Moussa, Ammanamanch, Pawan Sasanka, Zhu, Qi, Puduppully, Ratish, Kriz, Reno, Shahriyar, Rifat, Mahamood, Saad, Osei, Salomey, Cahyawijaya, Samuel, Štajner, Sanja, Montella, Sebastien, Jolly, Shailza, Mille, Simon, Shen, Tianhao, Adewumi, Tosin, Raunak, Vikas, Raheja, Vipul, Nikolaev, Vitaly, Tsai, Vivian, Jernite, Yacine, Xu, Ying, Sang, Yisi, Liu, Yixin, Hou, Yufang, Hayashi, Hiroaki, Pu Liang, Paul, Dusek, Ondrej, Subramani, Nishant, Daheim, Nico, Hasan, Tahmid, Cardenas, Ronald, Che, Wanxiang, Shutova, Ekaterina, Department of Digital Humanities, and Language Technology
Subjects: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), Computer Science - Artificial Intelligence, 6121 Languages, 113 Computer and information sciences, Computation and Language (cs.CL), Machine Learning (cs.LG)
Abstract: Evaluation in machine learning is usually informed by past choices, for example which datasets or metrics to use. This standardization enables the comparison on equal footing using leaderboards, but the evaluation choices become sub-optimal as better alternatives arise. This problem is especially pertinent in natural language generation, which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims. To make following best model evaluation practices easier, we introduce GEMv2. The new version of the Generation, Evaluation, and Metrics Benchmark introduces a modular infrastructure for dataset, model, and metric developers to benefit from each other's work. GEMv2 supports 40 documented datasets in 51 languages. Models for all datasets can be evaluated online and our interactive data card creation and rendering tools make it easier to add new datasets to the living benchmark.
Published: 2022
