The ParlaMint corpora of parliamentary proceedings

Authors :
Erjavec, Tomaž
Ogrodniczuk, Maciej
Osenova, Petya
Ljubešić, Nikola
Simov, Kiril
Pančur, Andrej
Rudolf, Michał
Kopp, Matyáš
Barkarson, Starkaður
Steingrímsson, Steinþór
Çöltekin, Çağrı
de Does, Jesse
Depuydt, Katrien
Agnoloni, Tommaso
Venturi, Giulia
Pérez, María Calzada
de Macedo, Luciana D.
Navarretta, Costanza
Luxardo, Giancarlo
Coole, Matthew
Rayson, Paul
Morkevičius, Vaidas
Krilavičius, Tomas
Darģis, Roberts
Ring, Orsolya
van Heusden, Ruben
Marx, Maarten
Fišer, Darja
Publication Year :
2022

Abstract

This paper presents the ParlaMint corpora, which contain transcriptions of the sessions of 17 European national parliaments and total half a billion words. The corpora are uniformly encoded, contain rich metadata about 11 thousand speakers, and are linguistically annotated following the Universal Dependencies formalism and with named entities. Samples of the corpora and conversion scripts are available from the project’s GitHub repository. The complete corpora are openly available for download via the CLARIN.SI repository, and for on-line exploration and analysis through the NoSketch Engine and KonText concordancers and the Parlameter interface.

Details

Database :
OAIster
Notes :
text, https://eprints.lancs.ac.uk/id/eprint/165473/1/s10579_021_09574_0.pdf, English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1348639373
Document Type :
Electronic Resource