
SwissBERT: The Multilingual Language Model for Switzerland

Authors:
Vamvas, Jannis
Graën, Johannes
Sennrich, Rico
Publication Year:
2023

Abstract

We present SwissBERT, a masked language model created specifically for processing Switzerland-related text. SwissBERT is a pre-trained model that we adapted to news articles written in the national languages of Switzerland -- German, French, Italian, and Romansh. We evaluate SwissBERT on natural language understanding tasks related to Switzerland and find that it tends to outperform previous models on these tasks, especially when processing contemporary news and/or Romansh Grischun. Since SwissBERT uses language adapters, it may be extended to Swiss German dialects in future work. The model and our open-source code are publicly released at https://github.com/ZurichNLP/swissbert.

Comment: SwissText 2023 [v3: Changed template because the proceedings moved to a different publisher. Same content.]
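The language adapters mentioned in the abstract are small per-language bottleneck layers inserted into an otherwise shared transformer, selected at runtime by the input language. A minimal sketch of this routing idea, with hypothetical dimensions and random weights (not the actual SwissBERT weights or layer sizes; the language codes are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
HIDDEN, BOTTLENECK = 8, 2  # toy sizes; the real model is far larger
LANGUAGES = ["de_CH", "fr_CH", "it_CH", "rm_CH"]  # assumed codes for the four languages

# One small bottleneck adapter (down-project, ReLU, up-project) per language,
# while the rest of the transformer weights stay shared across languages.
adapters = {
    lang: (rng.standard_normal((HIDDEN, BOTTLENECK)),
           rng.standard_normal((BOTTLENECK, HIDDEN)))
    for lang in LANGUAGES
}

def adapter_forward(hidden, lang):
    """Route hidden states through the adapter of the chosen language."""
    down, up = adapters[lang]
    h = np.maximum(hidden @ down, 0.0)  # down-projection + ReLU
    return hidden + h @ up              # up-projection with residual connection

x = rng.standard_normal((3, HIDDEN))    # a toy batch of 3 hidden states
out_de = adapter_forward(x, "de_CH")
out_rm = adapter_forward(x, "rm_CH")
```

Because only the adapter weights differ per language, extending such a model to a new language (e.g. a Swiss German dialect, as the abstract suggests) means training one new adapter rather than a whole model.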

Details

Database:
arXiv
Publication Type:
Report
Accession Number:
edsarx.2303.13310
Document Type:
Working Paper