Multilingual Workflows in 'Bullinger Digital': Data Curation for Latin and Early New High German

Authors :
Phillip Benjamin Ströbel
Lukas Fischer
Raphael Müller
Patricia Scheurer
Bernard Schroffenegger
Benjamin Suter
Martin Volk
Source :
Journal of Open Humanities Data, Vol 10, Pp 12-12 (2024)
Publication Year :
2024
Publisher :
Ubiquity Press, 2024.

Abstract

This paper presents how we enhanced the accessibility and utility of historical linguistic data in the project Bullinger Digital. The project involved the transformation of 3,100 letters, primarily available as scanned PDFs, into a dynamic, fully digital format. The expanded digital collection now includes 12,000 letters: 3,100 edited, 5,400 transcribed, and 3,500 represented through detailed metadata and results from handwritten text recognition. Central to our discussion is the innovative workflow developed for this multilingual corpus. This includes strategies for text normalisation, machine translation, and handwritten text recognition, particularly focusing on the challenges of code-switching within historical documents. The resulting digital platform features an advanced search system, offering users various filtering options such as correspondent names, time periods, languages, and locations. It also incorporates fuzzy and exact search capabilities, with the ability to focus searches within specific text parts, like summaries or footnotes. Beyond detailing the technical process, this paper underscores the project’s contribution to historical research and digital humanities. While the Bullinger Digital platform serves as a model for similar projects, the corpus behind it demonstrates the vast potential for data reuse in historical linguistics. The project exemplifies how digital humanities methodologies can revitalise historical text collections, offering researchers access to and interaction with historical data. This paper aims to provide readers with a comprehensive understanding of our project’s scope and broader implications for the field of digital humanities, highlighting the transformative potential of such digital endeavours in historical linguistic research.

Details

Language :
English
ISSN :
2059481X
Volume :
10
Database :
Directory of Open Access Journals
Journal :
Journal of Open Humanities Data
Publication Type :
Academic Journal
Accession number :
edsdoj.3c3e9c1c56744c9eb137e326fb96378d
Document Type :
article
Full Text :
https://doi.org/10.5334/johd.174