Computational Literary Studies Infrastructure (CLSINFRA): a H2020 Research Infrastructure Project that aims to connect researchers, data, and methods

Authors :
Birkholz, Julie M.
Börner, Ingo
Chambers, Sally
Cinková, Silvie
van Dalen-Oskam, Karina
Dejaeghere, Tess
Dudar, Julia
Eder, Maciej
Edmond, Jennifer
Garnett, Vicky
Kren, Michal
Mrugalski, Michal
Murphy, Ciara L.
Odebrecht, Carolin
Papaki, Eliza
Raciti, Marco
van Rossum, Lisanne
Schöch, Christof
Šela, Artjoms
Sharma, Srishti
Tonra, Justin
Tóth-Czifra, Erzsébet
Trilcke, Peer
Computationele Literatuurwetenschap (HI)
Source :
DH Benelux 2022
Publication Year :
2022

Abstract

The aim of this poster is to provide an overview of the principal objectives of the newly started H2020 Computational Literary Studies (CLS) project (https://www.clsinfra.io). CLS is an infrastructure project that works to develop and bring together high-quality data, tools and knowledge to support new approaches to studying literature in the digital age. Conducting computational literary studies brings a number of challenges and opportunities, from working across multiple languages to bringing together distributed information. At present, the landscape of literary data is diverse and fragmented. Even though many resources are currently available in digital libraries, archives, repositories, websites or catalogues, a lack of standardisation affects how they are constructed and accessed, and limits the extent to which they are reusable (Ciotti 2014). The CLS project aims to federate these resources with the tools needed to interrogate them and with a widened base of users, in the spirit of the FAIR and CARE principles (Wilkinson et al. 2016). The resulting improvements will benefit researchers by bridging gaps between greater- and lesser-resourced communities in computational literary studies and beyond, ultimately offering opportunities to create new research and insight into our shared and varied European cultural heritage. Rather than building entirely new resources for literary studies, the project is committed to exploiting and connecting already existing efforts and initiatives, in order to acknowledge and utilize the immense human labour that has already been undertaken. Therefore, the project builds on recently compiled high-quality literary corpora, such as DraCor and ELTeC (Fischer et al. 2019, Burnard et al. 2021, Schöch et al. in press), integrates existing tools for text analysis, e.g. TXM, stylo, and multilingual NLP pipelines (Heiden 2010, Eder et al. 2016), and takes advantage of deep integration with two other infrastructural projects, namely the CLARIN and DARIAH ERICs.
Consequently, the project aims to build a coherent ecosystem that fosters the technical and intellectual findability and accessibility of relevant data. The ecosystem consists of (1) resources, i.e. text collections for drama, poetry and prose in several languages, (2) tools, (3) methodological and theoretical considerations, (4) a network of CLS scholars based at different European institutions, (5) a system of short-term research stays for both early career researchers and seasoned scholars, (6) a repository for training materials, and (7) an efficient dissemination strategy. This is achieved through a collaboration between the participating institutions: Institute of Polish Language at the Polish Academy of Sciences, Poland; University of Potsdam, Germany; Austrian Academy of Sciences, Austria; National University of Distance Education, Spain; École Normale Supérieure de Lyon, France; Humboldt University of Berlin, Germany; Charles University, Czech Republic; Digital Research Infrastructure for the Arts and Humanities, France; Ghent Centre for Digital Humanities, Ghent University, Belgium; Belgrade Centre for Digital Humanities, Serbia; Huygens Institute for the History of the Netherlands (Royal Netherlands Academy of Arts and Sciences), Netherlands; Trier Center for Digital Humanities, Trier University, Germany; Moore Institute, National University of Ireland Galway, Ireland.
This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 101004984.
References
Ciotti, Fabio. 2014. “Digital literary and cultural studies: the state of the art and perspectives”. Between 4/8, 1-17. https://doi.org/10.13125/2039-6597/1392.
Borgman, Christine. 2010. Scholarship in the Digital Age: Information, Infrastructure, and the Internet. Cambridge, Mass & London: MIT Press.
See https://www.dariah.eu and https://www.clarin.eu.
Burnard, Lou, Christof Schöch, and Carolin Odebrecht. 2021. “In search of comity: TEI for distant reading”. Journal of the Text Encoding Initiative. https://doi.org/10.4000/jtei.3500.
Eder, M., Rybicki, J. and Kestemont, M. 2016. “Stylometry with R: a package for computational text analysis”. R Journal 8(1): 107-21. https://journal.r-project.org/archive/2016/RJ-2016-007/index.html
Fischer, Frank, Ingo Börner, Matthias Göbel, Andrea Hechtl, Christopher Kittel, P. Miling, and Peer Trilcke. 2019. “Programmable Corpora: Introducing DraCor, an Infrastructure for the Research on European Drama”. In Book of Abstracts of the Digital Humanities Conference 2019. Utrecht: ADHO.
Heiden, Serge. 2010. “The TXM Platform: Building Open-Source Textual Analysis Software Compatible with the TEI Encoding Scheme”. In 24th Pacific Asia Conference on Language, Information and Computation (10 pp.). Sendai, Japan. Retrieved from http://halshs.archivesouvertes.fr/docs/00/54/97/64/PDF/paclic24_sheiden.pdf
Schöch, Christof, Tomaz Erjavec, Roxana Patras, and Diana Santos (in press). “Creating the European Literary Text Collection (ELTeC): Challenges and Perspectives”. Modern Languages Open.
Wilkinson, Mark D., Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship”. Scientific Data 3(1). https://doi.org/10.1038/sdata.2016.18.

Details

Language :
English
Database :
OpenAIRE
Journal :
DH Benelux 2022
Accession number :
edsair.doi.dedup.....74f51ae828d9fcb6142354a2cafe1ece