Descriptor: "FOS: Languages and literature" - Searchworks@Jio Institute Digital Library Search Results

1. Curse of Knowledge

Author: Gibson, Edward, Martinez, Eric, and Mollica, Francis
Subjects: FOS: Psychology, FOS: Languages and literature, Psychology, Linguistics, Legal Studies, Social and Behavioral Sciences
Abstract: Do lawyers have a curse of knowledge when reading and writing legal documents?
Published: 2024
Full Text: View/download PDF

2. The impact of nonverbal behavior on second language proficiency

Author: Burton, John Dylan
Subjects: FOS: Languages and literature, Linguistics, language testing, Applied Linguistics, Social and Behavioral Sciences, language assessment, Educational Assessment, Evaluation, and Research, Education
Abstract: This study aims to investigate the relationship between nonverbal behavior, in particular eye gaze and facial behaviors, and ratings of second language speech.
Published: 2025
Full Text: View/download PDF

3. Discourse-Stylistic Features in Oduduwa Secessionists’ Social Media Campaign

Author: Aminu, PraiseGod
Subjects: Social media, Social movements, Critical discourse analysis, FOS: Languages and literature, Linguistics, Discourse analysis, Digital humanities
Abstract: The Oduduwa secessionist group is an ethnic separatist movement that seeks an independent nation for the Yoruba of southwest Nigeria. The group adopts a radical approach to secessionism and has conducted extensive online campaigns and activism on social media. Unfortunately, there appears to be insufficient linguistic research on discourses produced by this emerging group of activists. Hence, social media campaigns published by the Oduduwa secessionists have been selected to uncover various discourse-stylistic strategies at work from the standpoint of the socio-cognitive model of critical discourse analysis (CDA). While employing a mixed-method approach, this research analyses 500 samples purposively retrieved from Facebook and Twitter. The study reveals the ideological structures concealed in the rhetoric produced by members of the group. The secessionists construct a cognitive binary of positive self-presentation and negative other-representation, categorizing Nigerians as well as the President Muhammadu Buhari administration as the Other. The underlying motive is to reinvent an identity that is different from a typical Nigerian. Since the Oduduwa agitators are a group of individuals determined to secede from Nigeria, the structures of their campaign discourse reflect discourses that reinvent the group’s identity and elucidate their ideological stances. Expectedly, grammatical and discursive structures common to activist discourse, such as hate speech, name-calling, coinages, indexicality, and threatening language that portray prejudice and cultural divisiveness, are evident in discourses produced by the Oduduwa group. The Oduduwa separatist movement, as a marginalized group, strongly enunciates its cultural ideology through these strategies, and in pursuit of self-determination, it accentuates its belief and resistance ideology in ethnic difference and cultural distinctiveness.
Published: 2025
Full Text: View/download PDF

4. El aprendizaje cognitivo y el desarrollo de la producción oral de los estudiantes universitarios de español como lengua extranjera en Egipto con mutismo selectivo. Un estudio cuasi experimental

Author: Naguib, Yara Mohamed Talaat
Subjects: FOS: Psychology, selective mutism, cognitive learning, FOS: Languages and literature, Psychology, Spanish language, Linguistics, applied linguistics, Educational, Anxiety, Foreign language learning, psycholinguistics
Abstract: This study analyzes the phenomenon of selective mutism in Egyptian students of Spanish as a foreign language at B1-B2 level, through a quasi-experimental study in which Bloom's taxonomy will be applied as a cognitive learning method. The study seeks to measure the development of the ability of students with selective mutism to recall what they have learned from memory, taking the cognitive learning approach as its starting point. It also aims to present a proposed program based mainly on the concept of cognitive learning, by measuring the students' development. The chosen methodology is therefore intended to develop the capacity to recall the language through observation and experimentation, which will subsequently make it possible to identify the best strategies or techniques for facilitating recall of the language in the classroom. Keywords: cognitive learning, Bloom's taxonomy, selective mutism, oral production, students of Spanish as a foreign language (ELE) in Egypt
Published: 2023
Full Text: View/download PDF

5. A Psycholinguistic Look at the Role of Field Dependence/Independence in Receptive/Productive Vocabulary Knowledge: Does it Draw a Line?

Author: Kamal Heidari
Subjects: Linguistics and Language, Language Tests, Psycholinguistics, 170199 Psychology not elsewhere classified, Linguistics, Experimental and Cognitive Psychology, Iran, Vocabulary, Language and Linguistics, FOS: Psychology, FOS: Languages and literature, Humans, General Psychology, Language
Abstract: The thrust of this study was to investigate the impact of learning styles in general and Field Dependence/Independence (FD/I) in particular on the receptive/productive lexical performance of language learners. It aimed to check whether FD and FI learners perform differently on receptive and productive vocabulary tests. To achieve this, first, 94 Iranian language learners were given the Group Embedded Figures Test (GEFT) to determine their learning style; second, they were put into two groups and were asked to take a receptive and a productive vocabulary test. Analysis of the data revealed that, first, on the receptive test, although FI learners outperformed FD learners, the difference was not statistically significant. Second, on the productive test, a significant difference was found between FI and FD learners, with FI learners performing better. Third, FI learners performed significantly better on the productive test than on the receptive test. Finally, FD learners performed almost equally on the receptive and productive tests. The pertinent implications are also discussed.
Published: 2022

6. Pre-Modern Data: Applying Language Modeling and Named Entity Recognition on Criminal Records in the City of Bern

Author: Hodel, Tobias, Prada Ziegler, Ismail, Schneider, Christa, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, History, 100 Philosophy, Long Presentation, Named Entity Recognition, Language Model, Machine Learning, Information Extraction, Pre-Modern Documents, 800 Literature, rhetoric & criticism, Linguistics, text mining and analysis, FOS: Languages and literature, Text Recognition, 900 History, data modeling
Abstract: How can NLP technologies be applied and measured on pre-modern documents? Based on a large handwritten dataset from the tower of Bern, we tested available language models and taggers, showing that specific forms of representation and identification need to be found. We rely on cooperation to further improve information retrieval.
Published: 2023
Full Text: View/download PDF

7. Providing Digital Answers to Disciplinary Questions with Graph Literary Exploration Machine

Author: Maryl, Maciej, Karlińska, Agnieszka, Walentynowicz, Wiktor, Walkowiak, Tomasz, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, cultural analytics, Short Presentation, distant reading, concept mining, named entity recognition, topic modelling, web service, Literary studies, text mining and analysis, Humanities computing, FOS: Languages and literature, Linguistics, natural language processing, network analysis and graphs theory and application
Abstract: Graph Literary Exploration Machine (GoLEM) is a new web-based application for literary scholars. The tool allows for named entity relationship analysis, terminology mining, and topic modeling. A strong emphasis is put on the visualisation of results as graphs, time series, maps or scatter plots.
Published: 2023
Full Text: View/download PDF

8. Mapping the (Digital) Linguistic Atlas of Scotland

Author: Pluschkovits, Markus, Kirk, John, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, Digitization, Dialectology, public humanities collaborations and methods, Linguistics, digitization (2D & 3D), Language Maps, FOS: Languages and literature, database creation, Philology, Poster, Scots, and analysis, management
Abstract: The Digital Linguistic Atlas of Scotland is a digitization and reanalysis of the lexical section of the Linguistic Atlas of Scotland. The tool offers dynamic and customizable maps and aims to showcase the opportunities of the digitization of analogue linguistic research data.
Published: 2023
Full Text: View/download PDF

9. TEITOK API - Programmable DH Corpora

Author: Janssen, Maarten, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, TEI/XML, and methods, representation, annotation structures, Linguistics, Computer science, manuscripts description, Document Enhancement, Humanities computing, FOS: Languages and literature, systems, text encoding and markup language creation, deployment, Poster, and analysis, natural language processing, Programmable Corpora
Abstract: The TEITOK REST API lets you interact with your corpus remotely: to upload documents in various formats, which are automatically converted to TEI/XML; to render search results; to run an NLP pipeline on the server; or to download the content of a corpus document, process that content locally with NLP tools or manual annotation tools, and then upload the results back to the server, where the new or corrected annotations will be incorporated into the original TEI/XML document without destroying any of the potentially complex annotations already in the document.
Published: 2023
Full Text: View/download PDF

10. Computational Literary Studies Infrastructure (CLS INFRA): Initial Findings and Conclusions for the Field

Author: Birkholz, Julie M., Börner, Ingo, Byszuk, Joanna, Chambers, Sally, Charvat, Vera Maria, Cinková, Silvie, Dejaeghere, Tess, Dudar, Julia, Ďurčo, Matej, Eder, Maciej, Edmond, Jennifer, Fileva, Evgeniia, Fischer, Frank, Garnett, Vicky, Heiden, Serge, Křen, Michal, Kunda, Bartłomiej, Laszakovits, Sabine, Mrugalski, Michał, Papaki, Eliza, Raciti, Marco, Resch, Stefan, Ros, Salvador, Schöch, Christof, Šeļa, Artjoms, Tasovac, Toma, Tonra, Justin, Tóth-Czifra, Erzsébet, Trilcke, Peer, van Dalen-Oskam, Karina, van Rossum, Lisanne, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, Informatics, and ethics analysis, CLS, computational literary studies, public humanities collaborations and methods, digital access, Linguistics, Cultural studies, research infrastructures, privacy, data publishing projects, Literary studies, text mining and analysis, FOS: Languages and literature, systems, Poster
Abstract: The aim of this poster is to provide an overview of the work carried out in the CLS INFRA project and its conclusions for the field of Computational Literary Studies.
Published: 2023
Full Text: View/download PDF

11. Digital Humanities Applications of spaCy's Span Categorizer

Author: Boyd, Adriane, Kádár, Ákos, Janco, Andrew, Lassner, David, Budak, Nick, Tasovac, Toma, Ermolaev, Natalia, Karajgikar, Jajwalya, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, History, representation, spaCy, digital editions, Linguistics, NLP, manuscripts description, Pre-Conference Workshop and Tutorial, Literary studies, text annotation, Humanities computing, NER, FOS: Languages and literature, text encoding and markup language creation, deployment, and analysis, natural language processing, artificial intelligence and machine learning
Abstract: This 3-hour workshop will introduce span categorization as a method for the machine-annotation of text for various research tasks in the digital humanities. Participants will gain a conceptual understanding of how span categorization differs from entity recognition and complete practical exercises to train a spaCy span categorizer on the LitBank dataset.
Published: 2023
Full Text: View/download PDF

12. Humanistic NLP: Bridging the Gap Between Digital Humanities and Natural Language Processing

Author: Tasovac, Toma, Ermolaev, Natalia, Janco, Andrew, Lassner, David, Budak, Nick, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, Long Presentation, interdisciplinarity, pedagogy, curricular and pedagogical development and analysis, Humanities computing, FOS: Languages and literature, DH, Linguistics, natural language processing, artificial intelligence and machine learning, NLP, humanities
Abstract: Humanistic NLP articulates the multiple challenges facing interactions between NLP and humanities research and offers solutions to bridge this gap. Not merely an application of NLP in DH, it should be considered as an opportunity to engage more humanist scholars in the development and critical appraisal of NLP language resources.
Published: 2023
Full Text: View/download PDF

13. Exploring genderlect markers in a corpus of Nineteenth century Spanish novels

Author: Bermúdez Sabel, Helena, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, attribution studies and stylometric analysis, Short Presentation, stylometry, FOS: Languages and literature, Linguistics, genderlect, spanish novels, Gender and sexuality studies, sociolinguistics
Abstract: This study teases out gender-specific linguistic features in fiction writing by carrying out a stylometric analysis of 81 Spanish novels written between 1840 and 1919.
Published: 2023
Full Text: View/download PDF

14. Words Shape Characters: A Case Study of Correspondence Analysis on Characters' Words in The Tale of Genji

Author: Takeuchi, Ayano, Ogiso, Toshinobu, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, correspondence analysis, speaker information, Short Presentation, Literary studies, text mining and analysis, FOS: Languages and literature, Japanese literature, corpus, Linguistics, rhetorical analysis, Asian studies
Abstract: The current study investigates characters' words in the oldest extant Japanese novel The Tale of Genji, which was written in the 11th century during the Heian period (794-1192), by utilizing correspondence analysis.
Published: 2023
Full Text: View/download PDF

15. Index of Middle English Prose: A search tool based on language modelling

Author: Honkapohja, Alpo, Thaisen, Jacob, Nøklestad, Anders, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, representation, Book and print history, Linguistics, Middle English, concordancing and indexing, manuscripts description, Language modelling, Literary studies, FOS: Languages and literature, Philology, Poster, and analysis, natural language processing, Manuscript studies
Abstract: The poster/presentation demonstrates a web-based search tool for the Index of Middle English Prose, built using the SRILM language modelling toolkit, which is capable of handling variation in spelling, syntax and lexicon inherent to a prestandardised vernacular such as Middle English.
Published: 2023
Full Text: View/download PDF

16. Datafication and reuse of the descriptions of the incunabula collection at the British Library

Author: Atanassova, Rossitza, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, History, catalogue data, Library & information science, optical character recognition and handwriting recognition, Linguistics, incunabula, metadata standards, computational analysis, text mining and analysis, Humanities computing, FOS: Languages and literature, systems, historical catalogues, Poster, natural language processing, practitioner research
Abstract: The use of computational approaches with legacy catalogue descriptions enables new insights into their authors' historical perspectives, and provides the means for their evaluation and transformation into contemporary online catalogue records. I discuss my computational research with the historical catalogues of the British Library collection of pre-1500 published books.
Published: 2023
Full Text: View/download PDF

17. SylLab – software for semi-automatic stylometric analysis for poetry

Author: Rykowska, Aleksandra, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, attribution studies and stylometric analysis, computer analysis of poetry, Literary studies, sentiment analysis, stylometry, versification, FOS: Languages and literature, Linguistics, Philology, Poster, Computer science, electronic literature production and analysis
Abstract: The poster presents new software for semi-automatic stylometric analysis of poetry, especially Polish accentual-syllabic verse. Starting from the versological and phonetic analysis of each verse, the program helps to establish the mood of a given poem based on the characteristics and number of phones that make up the text.
Published: 2023
Full Text: View/download PDF

18. Automatic Word Segmentation for Egyptian Hieroglyphic Texts

Author: Jauhiainen, Heidi, Jauhiainen, Tommi, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, Short Presentation, Word segmentation, analysis and methods, software development, FOS: Languages and literature, systems, Linguistics, natural language processing, artificial intelligence and machine learning, hieroglyphic texts
Abstract: The scarcity of machine-readable corpora for Egyptian hieroglyphic texts hinders the digital study of ancient Egyptian texts. Computer-assisted transliteration of the texts will speed up producing such texts and the first step is to find word-boundaries. We present a method for the automatic segmentation of hieroglyphic texts.
Published: 2023
Full Text: View/download PDF

19. WebChamame: An Online Tool for Morphological Analysis of Various Historical Japanese Texts using UniDic Dictionaries

Author: Ogiso, Toshinobu, Tsutsumi, Tomoaki, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, History, Linguistics, Japanese language, text mining and analysis, FOS: Languages and literature, morphological analysis, Philology, text processing, Interface design, Poster, and analysis, natural language processing, development, Asian studies
Abstract: Morphological analysis is an indispensable tool for analyzing Japanese text, but it is difficult for ordinary humanities researchers to set up an environment and perform analysis from the command line. Therefore we have developed WebChamame, an online tool that allows users to perform morphological analysis using multiple period-specific UniDic dictionaries.
Published: 2023
Full Text: View/download PDF

20. Towards Diachronic Corpus of Polish Latin

Author: Marszałek, Jagoda, Nowak, Krzysztof, Krawczyk, Iwona, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, image processing and analysis, OCR, Humanities computing, FOS: Languages and literature, Neo-Latin, database creation, optical character recognition and handwriting recognition, Linguistics, Poster, and analysis, diachronic corpora, management
Abstract: This paper presents the results of a study evaluating the feasibility of automatic acquisition of small-scale diachronic Latin corpora for linguistic research. The study was conducted on a collection of Neo-Latin works composed by Polish authors, using tools for automatic segmentation (Kraken) and text recognition (Calamari OCR).
Published: 2023
Full Text: View/download PDF

21. Transforming the Pietist Tradition: Disciplinary Innovation through Linked Digital Engagement

Author: Faull, Katherine Mary, Prell, Martin, Tögel, Philipp, Lasch, Alexander, Garces, Juan, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, History, representation, Linguistics, pietism, knowledge transfer, network analysis and graphs theory and application, collaboration, Theology and religious studies, manuscripts description, networks, FOS: Languages and literature, Panel, text encoding and markup language creation, deployment, and analysis, LOD, Indigenous studies, linked (open) data
Abstract: The panel has as its focus how collaboration in and implementation of DH methods has opened up a) new knowledge networks in the (conservative) field of Pietism and Religious History and b) transformed understandings of traditional disciplinary structures and hierarchies.
Published: 2023
Full Text: View/download PDF

22. Handwritten text recognition applied to the manuscript production of the Carthusian Monastery of Herne in the Fourteenth Century

Author: Haverals, Wouter, Kestemont, Mike, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, Long Presentation, digital manuscript studies, representation, analysis, Book and print history, optical character recognition and handwriting recognition, scholarly editing and editions development, Linguistics, Middle Dutch literature, scribal profiling, digital libraries creation, manuscripts description, handwritten text recognition, Literary studies, Humanities computing, FOS: Languages and literature, and analysis, management
Abstract: This paper contributes to the assessment of various – possibly impacting – factors during the collection of ground truth data for training HTR-systems. By scrutinising different parameters (e.g. scribal hands, handwriting styles, spelling profiles, textual genres, etc.) we will report on the impact of various train-target combinations.
Published: 2023
Full Text: View/download PDF

23. AI-supported indexing of handwritten dialect lexis: The pilot study 'DWA Austria' as a case study

Author: Kunzmann, Markus, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, licensing, Lexicography, Transkribus, copyright, Digitisation, Linguistics, and permissions standards, Dialect, digitization (2D & 3D), Short Presentation, Artificial Intelligence, and processes, FOS: Languages and literature, systems, artificial intelligence and machine learning
Abstract: Traditionally, two approaches have developed in dialectology that focus on researching lexical variation: on the one hand, dialect dictionaries, whose task is to document dialect vocabulary; on the other hand, dialect atlases, whose focus is on linguistic-geographical variation in dialect vocabulary. The Wörterbuch der bairischen Mundarten in Österreich (WBÖ), a long-term project of the Austrian Academy of Sciences (ÖAW), is an undertaking of the first type. Until 2015, the first five volumes (A–Ezzes) were published as printed works; since 2018, the Lexikalisches Informationssystem Österreich/Lexical Information System Austria (LIÖ) has served as the publication platform for the articles starting with the letter F. LIÖ is a cooperation project between the Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH) of the ÖAW and the FWF Special Research Programme German in Austria (DiÖ) of the University of Vienna. At the moment, the information system only contains content related to the WBÖ project itself. As a lexically oriented platform, however, its content is to be expanded in the coming years, i.e. lexical material from other corpora is also to be made accessible via it. A first expansion of LIÖ will take place in October 2022 with the pilot study DWA Austria, a cooperation project between the Research Center Deutscher Sprachatlas (DSA) of the Philipps University of Marburg and the Department of Linguistics of the ACDH-CH. Within the framework of this cooperation, the entire Austrian surveys of the Deutscher Wortatlas/German Word Atlas (DWA) are to be digitally processed for the first time. Last but not least, the project will serve to expand the paradigm for researching lexical variation described above to include a dialect-geographical component. In this collaboration, the Marburg team will provide the high-resolution scans of the DWA surveys. 
The team in Vienna is building a model for automatic transliteration on this basis with the help of the Transkribus software, which in turn can be used by the DSA team for the German DWA sheets. The surveys for the DWA were conducted indirectly between 1939 and 1942 and are still among the most comprehensive surveys of the 20th century. Questionnaires were sent to a total of about 50,000 places, 3,700 of which are in the territory of the Republic of Austria. Since previous automatic text recognition methods (i.e. OCR) have only provided insufficient results, the questionnaires, which were mostly handwritten, had to be laboriously transliterated manually. The recent use of artificial intelligence (AI) has been showing promising results for several years. With the help of the Transkribus platform, the Austrian DWA questionnaires are now being captured and made usable as part of a pilot study. Unlike conventional OCR products, Transkribus uses artificial intelligence (AI) to convert the written content of digital records into searchable text. The scans of the DWA sheets were made by the DSA and made available to the ACDH-CH. There, in a first step, they manually transliterate a set of scanned sheets. This step can be supported by already existing models that are, for example, tailored to German Kurrent script. Based on these correct transliterations and the corresponding scans, a model can now be built with the help of Deep Learning that is tailored to the document type. The layout of the text is also taken into account. First models on the minimum amount of training material still showed a rather high error rate (CER Val. 9.4) and have been continuously improved since then. The pilot study DWA Austria shows how data sets that could previously only be used to a limited extent due to time-consuming and costly efforts can now be opened up by AI-supported methods to an extent that would not have been possible with conventional methods. 
In particular, the example shows how the respective expertise of the individual project partners can bring about a significant increase in efficiency and thus once again illustrates the potential for synergies that result from cooperation. Bibliography: DiÖ = SFB German in Austria. URL: https://www.dioe.at/en/ [2023-05-01] DSA = Forschungszentrum Deutscher Sprachatlas. URL: https://www.uni-marburg.de/en/fb09/dsa [2023-05-01] DWA = Mitzka, Walther & Ludwig Erich Schmitt. 1951 – 1980. Deutscher Wortatlas. Gießen: Schmitz LiÖ = Lexikalisches Informationssystem Österreich. URL: https://lioe.dioe.at/ [2023-05-01] Transkribus. AI powered Handwritten Text Recognition. URL: https://readcoop.eu/transkribus/ [2023-05-01] WBÖ = Österreichischen Akademie der Wissenschaften. 1970 – 2015. Wörterbuch der bairischen Mundarten in Österreich. Wien: Verlag der Österreichischen Akademie der Wissenschaften.
Published: 2023
Full Text: View/download PDF
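
The abstract above reports a character error rate ("CER Val. 9.4") for the first transcription models. As a rough illustration of how such a figure is commonly computed, the following minimal Python sketch divides the Levenshtein edit distance between a manual transliteration and the automatic output by the length of the manual transliteration. It assumes that the reported value is a percentage on a validation set; the function names and sample strings are hypothetical, and this is not the computation used by Transkribus itself.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent, relative to the reference length."""
    if not reference:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical ground truth vs. automatic output for one questionnaire answer.
    print(cer("Erdapfel", "Erdapfl"))
```

Under this definition, a CER of 9.4 would mean roughly one edit for every eleven characters of ground truth.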

24. Replicating a Data-Driven Corpus Analysis: The Example of Academic Language

Author: Andresen, Melanie, Pichler, Axel, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, replication, Short Presentation, text mining and analysis, Humanities computing, FOS: Languages and literature, data-driven research, meta-criticism (reflections on digital humanities and humanities computing), Linguistics, natural language processing, academic language
Abstract: Replication studies are generally considered an important way of checking the generalizability of research results. However, in the humanities they are still comparatively rare. We argue that data-driven studies in particular are in need of replication and demonstrate the need in a case study on German academic language.
Published: 2023
Full Text: View/download PDF

25. Using Digital Tools to Create Modern Multi-Search Engine for Polish Historical Dictionaries

Author: Rodek, Ewa, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, digital publishing projects, optical character recognition and handwriting recognition, Linguistics, multi-search engine, Central/Eastern European Studies, FOS: Languages and literature, systems, digital research infrastructures development and analysis, Poster, natural language processing, historical lexicography, dictionary
Abstract: I will present a project to build a database of historical Polish lexicons. The most important dictionaries with Middle Polish material will be digitized in steps: HTR recognition, XML tagging, and morphosyntactic tagging. We intend to prepare an API that could be connected with sources of Middle Polish or Latin used in Poland.
Published: 2023
Full Text: View/download PDF

26. A Catalogue of the Hebrew Sounds

Author: Silber-Varod, Vered, Cohen, Evyatar, Strull, Inbar, Cohen, Evan Gary, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, speech processing analysis and methods, Audio information retrieval, Linguistics, and artefact preservation, Short Presentation, data, FOS: Languages and literature, Spoken language, Hebrew, database creation, information retrieval and querying algorithms and methods, sound preservation, and analysis, management, object
Abstract: Our goal is to collect the sounds of Hebrew and to make them accessible for linguistic studies and cultural heritage preservation. Our presentation is focused on the pipeline we have developed and on two main tools that we have developed for it: 1. A Hebrew sound parser and 2. The database platform.
Published: 2023
Full Text: View/download PDF

27. Tracing the invisible translator: stylistic differences in the Dutch translations of the oeuvre of Swedish author Henning Mankell

Author: Wijers, Martje, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, attribution studies and stylometric analysis, Short Presentation, Literary studies, Swedish, stylometry, text mining and analysis, computational literary studies, FOS: Languages and literature, translation, Linguistics, Dutch, Translation studies
Abstract: In this stylometric study, the oeuvre of the Swedish author Henning Mankell is scrutinized and compared to the Dutch translations of his work to find out to what extent the style is influenced by the translator. The main method used is Burrows' Zeta as proposed by Rybicki (2012).
Published: 2023
Full Text: View/download PDF

28. Polyphemus, a lexical database of the Ancient Greek papyri, and the Madrid Wordlist of Ancient Greek

Author: Riaño Rufilanchas, Daniel, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, databases, Greek morphology, Linguistics, Cultural studies, concordancing and indexing, Short Presentation, FOS: Languages and literature, database creation, Philology, Ancient Greek papyri, Greek lexicography, and analysis, natural language processing, Green Grammar, management
Abstract: At present there is no way to search the corpus of Greek papyri for lemmas, or to search for specific grammatical forms of a word. Much less is there a way to search for examples of a grammatical category. Polyphemus comes to solve these shortcomings, and some more. For this purpose we have processed all the papyrus texts from PapyInfo. This processing is done at the same time as the processing that results in the Callimachus database, which we also present at this DH Congress. I summarize below the procedure by which we obtain our database Polyphemus. A) First we analyze each line of papyrus and differentiate the actual full words from the gaps or non-textual elements. B) Then we identify the complete words and separate them from the fragments. This can sometimes be done because of the editorial criteria used in the original edition, before digitizing. Other times it is necessary to check if the text meets some of the external qualities that define a word in ancient Greek (presence of accentuation, etc.). C) We then proceed to lemmatize each of the words, and determine to which part of speech it corresponds, and what is its morphological analysis. All this is done with the help of the Madrid list, which I will discuss below. For text fragments (incomplete words), we try to see if they can be ascribed to a root. We also separate proper nouns from common nouns. D) Lemma assignment and POS-tagging is performed in two phases. In a first pass we tag the forms with the highest frequency of occurrence. We calculate this from the frequency with which lexical forms were tagged in several manually annotated treebanks (over 700,000 words). We then go on to label all the remaining forms using the Madrid Wordlist. The Madrid Wordlist incorporates information about the dialect in which a form appears, so in case of multiple possible analyses we prefer those belonging to Koine (the Greek form of Papyri Greek) or Attic Greek. 
Naturally this procedure has the consequence that we reduce the number of multiple analyses for the same form (thus drastically reducing the number of false positives) in exchange for losing the correct POS-tagging for low-frequency forms that coincide with high-frequency ones. E) All this information is transferred to a SQL database, and put in relation with the data on the papyri that we have obtained when creating the Callimachus database. In this way, for each lexical form we obtain a lemma, a non-disambiguated morphological analysis, and a translation or gloss. Each of these parameters can be searched in combination with the more than fifty categories available to us thanks to Callimachus, such as date, origin, category, extension, subject, etc. To date, we have been able to analyze 97% of the complete words, including proper names, which are very numerous. 4. The Madrid Ancient Greek Word List The lemmatization and analysis in Parts Of Speech (POS tagging) is performed by comparing each record in our database with the records of a word list that we have created over the last 3 years, which we have called the Madrid Ancient Greek Wordlist. Most of the Ancient Greek wordlists are evolutions, simplifications, or improvements from the Morpheus list developed by Gregory Crane between 1984 and 1990 (Crane 1991; Celano et al. 2016). Our list also starts with Morpheus, but has been enriched with our own treebank (Aristarchus Treebank, 200,000 words; cf. Riaño 2006), and almost 100,000 proper names from The Lexicon of Greek Personal Names and the Trismegistos repository of papyrological and epigraphic resources. All these data were processed to obtain morphological information. I have manually entered several hundred (mostly irregular) pronominal forms in this list. 
To complete this list I have processed the digital version of the Greek-English Lexicon of Liddell-Scott-Jones, and extracted all the nominal lemmas; then I have determined the declension of each one of them, and I have proceeded to decline each lemma in its Attic and Ionic form by means of a program we have developed. Then we search for each of these forms in the papyri. The program thus produces over 600,000 lexical forms (many of them already in the Morpheus list). The lemmas are then assigned a translation, or rather a gloss
Published: 2023
Full Text: View/download PDF

29. Creating a collaborative research platform for Vedic Sanskrit texts

Author: Fischer, Anna, Kiss, Börge, Casaretto, Antje, Kölligan, Daniel, Korobzow, Natalie, Neuefeind, Claes, Reinöhl, Uta, Sahle, Patrick, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, Informatics, Sanskrit, analysis and methods, open access methods, Linguistics, data publishing projects, Short Presentation, Humanities computing, software development, FOS: Languages and literature, systems, digital research infrastructures development and analysis, Philology, collaborative platform, Vedic
Abstract: VedaWeb is an established research website for the ancient Vedic text Rigveda. It now evolves into an open collaborative platform for researching and sharing Vedic Sanskrit language data in a broader perspective. How will the new platform stimulate collaborative processes among researchers of Vedic Sanskrit and beyond?
Published: 2023
Full Text: View/download PDF

30. From unstructured texts to RDF-star-based open research data queryable by references

Author: Alassi, Sepideh, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, dependency parsing, Short Presentation, NER, FOS: Languages and literature, RDF-star, Linguistics, natural language processing, LOD, NLP, Computer science, linked (open) data
Abstract: Humanities textual data is full of references to persons and locations given in various languages. Researchers want to perform queries to retrieve data, in which a certain place or a person is mentioned, irrespective of the language of the text. In this paper, I present how we automatically extract named entities (geolocation information and person references) from textual data and homogenize and store them as Linked Open Data (LOD) with unique identifiers such as the GeoName ID and the GND (Gemeinsame Normdatei) number. Then the plain references in the text are substituted with standoff links to the corresponding RDF resources and the textual document is stored in RDF format. This enables humanities scholars to perform advanced SPARQL queries to collect textual resources containing specific references regardless of the language of the text. Furthermore, the relations between these named entities can be parsed from the text based on ontology definition, dependency graph of sentences, and POS tags to be added to the knowledge graph. Since the citability of the information is crucial for humanities research, this workflow adds the metadata regarding the source document of extracted information to the edges of the knowledge graph using RDF-star. This allows queries for documents containing a certain relationship between entities through SPARQL-star.
Published: 2023
Full Text: View/download PDF

31. Gloss-ViBe: Early Medieval Glosses and the Digital Humanities

Author: Bauer, Bernhard, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, Digital Edition, representation, analysis, scholarly editing and editions development, Linguistics, manuscripts description, FOS: Languages and literature, Glossing, text encoding and markup language creation, deployment, Early Medieval, Philology, Poster, and analysis, natural language processing, Manuscripts
Abstract: The proposed poster gives an overview of the Gloss-ViBe project, which analyses the early medieval Celtic and Latin glosses found in the fragmentary manuscript Vienna, Österreichische Nationalbibliothek, Codex 15298 (olim Suppl. 2698) from different angles including digital humanities, philology and linguistics.
Published: 2023
Full Text: View/download PDF

32. Nestroy Corpus Analysis (NestroyCA): NLP for 19th century Austrian Drama

Author: Laszakovits, Sabine, Katsikadeli, Christina, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, Performance Studies: Dance, Nestroy, Linguistics, NLP, electronic literature production and analysis, Short Presentation, Literary studies, Humanities computing, FOS: Languages and literature, drama, natural language processing, Austrian German, Theatre, historical corpus
Abstract: We present the design and implementation of an NLP pipeline for historical Austrian German (Nestroy's comedies, 1827-1862) including transcriptions of dialectal data. Lacking suitable training data, we created a meta-model that compares the predictions of existing models trained on Modern Standard Germany-German corpora, and calculates a compromise between them.
Published: 2023
Full Text: View/download PDF

33. GitMA Poster

Author: Meister, Malte, Gerstorfer, Dominik, Schumacher, Mareike Katharina, Gius, Evelyn, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, Data, and methods, analysis and methods, annotation structures, Linguistics, Git, Literary studies, Humanities computing, software development, FOS: Languages and literature, systems, systems and information architecture and usability, Philology, Poster, CATMA, Python, Visualization
Abstract: Since CATMA version 6.0, project data can be accessed in the form of Git repositories. A Python library which enables easy access to this data was developed at the department of Digital Philology at the Technical University of Darmstadt. It makes it possible to further process annotations using established and popular Python data science tools. The poster will serve as a kind of instruction manual for the use of the CATMA Git Access and the Python library.
Published: 2023
Full Text: View/download PDF

34. Investigating multisemiotic persuasive practices by integrating computational methods and complementary theoretical frameworks. A Data-driven Approach to Digital Tourism Discourse Based on Systemic Functional Linguistics and Empirical Multimodality

Author: Mattei, Elena, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, data-driven analysis, image processing and analysis, and methods, Long Presentation, analysis and methods, systemic functional linguistics, Media studies, annotation structures, Linguistics, software development, FOS: Languages and literature, systems, tourism discourse, digital humanities, empirical multimodality, data modeling, Communication studies
Abstract: This paper offers an understanding of the multilayered methodological framework developed and implemented to carry out a Digital Humanities project. The latter classified systematically visuo-linguistic features in contemporary tourism narratives by means of data-driven tagging models, annotations and statistical measurement of the frequency and variance of strategies across digital channels.
Published: 2023
Full Text: View/download PDF

35. Our Heritage, Our Stories: Democratising the UK national collection

Author: Hannaford, Ewan David, Alexander, Marc, Hughes, Lorna, Lewis, Rhiannon, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, History, community archives, sustainable procedures, Informatics, post-custodial, digital archiving, public humanities collaborations and methods, Linguistics, sustainability, NLP, Computer science, FOS: Languages and literature, systems, Poster, digital humanities, artificial intelligence and machine learning
Abstract: Our Heritage, Our Stories is a collaboration between the University of Glasgow, University of Manchester, and The National Archives, changing how digital content amongst UK communities is collected and curated. This poster outlines project goals, methods, and progress, and showcases new humanities research stories that it will make possible.
Published: 2023
Full Text: View/download PDF

36. Putting (Linguistic) Research Data on a Map – The DiÖ Sprachatlas Tool

Author: Pluschkovits, Markus, Bal, Jakob, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, Dialectology, open access methods, Linguistics, data publishing projects, APIs, Short Presentation, Language Maps, FOS: Languages and literature, systems, Philology, information retrieval and querying algorithms and methods, Visualization
Abstract: The DiÖ Sprachatlas tool is a dynamic map creation tool which uses the corpus of a large-scale variationist linguistics project as its source. By utilizing an API, it creates maps at negligible cost. Additionally, it offers a method of making the total corpus transparent and accessible.
Published: 2023
Full Text: View/download PDF

37. Put Them In to Get Them Out: the ParlaMint Corpora for Digital Humanities and Social Sciences Research

Author: Fišer, Darja, Kryvenko, Anna, Osenova, Petya, Pahor de Maiti, Kristina, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, FOS: Political science, metadata, annotation structures, Linguistics, corpus analysis, concordancing and indexing, FOS: Sociology, Pre-Conference Workshop and Tutorial, parliamentary corpora, linguistic annotation, Sociology, FOS: Languages and literature, systems, parliamentary records, information retrieval and querying algorithms and methods, Political science, Communication studies
Abstract: This hands-on half-day tutorial aims to explore the potential of the ParlaMint corpora – openly available collections of parliamentary records, which are uniformly sampled, annotated and rich in individual speaker and institutional group metadata. We show how the resource facilitates research into specific European parliaments and allows for transnational comparisons.
Published: 2023
Full Text: View/download PDF

38. Transhistorical Resonance: Medieval Chinese Scholarship as Data

Author: Budak, Nicholas Andrew, Rominger, Gian Duri, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, commentary, and methods, ancient, annotation structures, Linguistics, chinese, nlp, phonology, Short Presentation, text mining and analysis, FOS: Languages and literature, systems, Philology, natural language processing, artificial intelligence and machine learning, Asian studies
Abstract: Can we use NLP to extract information about long-dead languages from secondary sources more than a millennium old? This presentation explores the curation of a dataset based on the Jingdian Shiwen (c. 583), a monumental commentary including over 50,000 phonological annotations on ancient Chinese texts.
Published: 2023
Full Text: View/download PDF

39. Few Shot Classification for Labeling of Medieval and Early Modern Charter Texts

Author: Kovács, Tamás, Aoun, Sandy, Vogeler, Georg, Nicolaou, Anguelos, Luger, Daniel, Atzenhofer-Baumgartner, Florian, Lamminger, Florian, Decker, Franziska, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, History, few-shot classification, classification, FOS: Languages and literature, Linguistics, Poster, natural language processing, medieval history, charter, Computer science
Abstract: Our strategy to support filtering the descriptive texts and the transcriptions in Monasterium.net seeks to assign semantic categories for the legal acts they record, such as confiscation, donation, or property sale.
Published: 2023
Full Text: View/download PDF

40. CreoPhonPt: a collaborative database saving Portuguese creoles from digital obliteration

Author: Sousa e Silva, Carlos Rogério, Pimentel Trigo, Luís Manuel, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, Linguistics, Phonology, Education/ pedagogy, Short Presentation, Portuguese Creoles, FOS: Languages and literature, database creation, and analysis, natural language processing, management, linked (open) data, African and African American Studies, Asian studies
Abstract: CreoPhon is a pilot database that, for now, only includes Portuguese-based creoles (CreoPhonPt). Its mission is to collect sound phonological data about these languages systematically, to put together a findable, accessible, interoperable, and reusable dataset, and, by using these data, to produce studies that describe so-called "creole phonology" and spread it among the scientific community and the general public.
Published: 2023
Full Text: View/download PDF

41. The COVID-19 pandemic in two Austrian media corpora: methods, analyses, and examples from a lexical and a morpho-pragmatic perspective

Author: Dorn, Amelie, Korecky-Kröll, Katharina, Ziegler, Theresa, Höll, Jan, Lenz, Alexandra N., Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, Humanities computing, corpus linguistics, FOS: Languages and literature, COVID-19, Linguistics, Philology, Poster, natural language processing, Computer science, morpho-pragmatics, lexis, data modeling
Abstract: The increased use of digital methods and tools has been steadily gaining momentum in all areas of life and work since the beginning of the COVID-19 pandemic. Extended methods of data processing and analysis could also be developed and applied in linguistics. In this paper we report on lexical and morpho-pragmatic analyses of linguistic aspects related to the COVID-19 pandemic in two German corpora.
Published: 2023
Full Text: View/download PDF

42. Beyond the Boundaries of Individual Universities: Allegiance to Digital Humanities Education in Korea

Author: Lee, Jae-Yon, Kim, Yongsoo, Ryu, Su-Rin, Mun, Soo-Hyun, Kim, Hyounghun, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, multimodal learning, History, spatial & spatio-temporal analysis, Linguistics, GIS, network analysis and graphs theory and application, Education/ pedagogy, Literary studies, curricular and pedagogical development and analysis, FOS: Languages and literature, language model, Panel, modeling and visualization, DH education, artificial intelligence and machine learning, network analysis
Abstract: This panel is about a DH education project in Korea. The goal is to collaborate in developing DH classes that cross individual universities and are open to participating institutes. We will present the overall design of this project, outline four subjects to be developed, and seek advice from overseas experts.
Published: 2023
Full Text: View/download PDF

43. Towards Metadata-enriched Literary Corpora in Line with FAIR Principles: 19/20MetaPNC

Author: Rosiński, Cezary, Karlińska, Agnieszka, Kubis, Marek, Hubar, Patryk, Wieczorek, Jan, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, bibliographic analysis, and methods, FAIR principles, Library & information science, metadata enrichment, Linguistics, literary corpora, metadata standards, Short Presentation, Literary studies, Humanities computing, FOS: Languages and literature, systems, database creation, and analysis, Linked Open Data, management, linked (open) data
Abstract: We aim to introduce a comprehensive workflow for enriching and linking the metadata of a literary corpus, including an implementation of FAIR principles, which have been developed in the field of scientific data management. We will present the practical application of the workflow using a corpus of Polish novels.
Published: 2023
Full Text: View/download PDF

44. Characters, names and reference

Author: Johnsen, Lars, Kåsen, Andre, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, Novels, representation, literature, Statistics, Linguistics, Cultural studies, digital libraries creation, cultural analytics, semantic analysis, Literary studies, text mining and analysis, FOS: Languages and literature, FOS: Mathematics, Poster, and analysis, management
Abstract: About representing characters in text. We present an analysis of pronouns, nouns and names in Norwegian novels. The purpose is to arrive at a discourse representation of texts, in terms of characters and their relationships and properties. We use the Norwegian national library and its digital resources for this purpose.
Published: 2023
Full Text: View/download PDF

45. Named Entity Recognition for a Text-Based Catalog of Ancient Greek Authors and Works

Author: Berti, Monica, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, Library & information science, analysis, CITE Architecture, scholarly editing and editions development, Linguistics, literary canon, digital classics, concordancing and indexing, digital philology, Literary studies, FOS: Languages and literature, ancient Greek, Philology, Poster, natural language processing, linked (open) data
Abstract: This poster proposal presents a project whose results are the linguistic annotation of ancient Greek bibliographic references with a focus on Named Entity Recognition related to author names and work titles, in order to produce new dynamic text-based tools that are not available in existing indices and catalogs.
Published: 2023
Full Text: View/download PDF

46. '... ich würde keinen Teufel schonen, möcht' er laborieren oder kollaborieren' – Jean Paul's Letters as Data for Various Research Domains in the Context of the National Research Data Infrastructure Text+

Author: Neuber, Frederike, Hug, Marius, Wiegand, Frank, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, and methods, DH2023, analysis, Text+, DiaCollo, open access methods, scholarly editing and editions development, Linguistics, TEI, BBAW, digital edition, Digital Humanities, NFDI, Wortgeschichte digital, data publishing projects, Literary studies, FOS: Languages and literature, systems, DWDS, digital research infrastructures development and analysis, Letters, Jean Paul, Poster
Abstract: Digital editions, text collections, and lexical resources each have their own tradition in the (digital) humanities and at the same time, the data domains also share common methods and practices of indexing, modeling, and analysis. The Text+ consortium in the German National Research Data Infrastructure (NFDI) is dedicated to the handling of research data in the three aforementioned research domains and is exploring methods and practices of cross-domain use of datasets and the resulting collaboration between research fields. The poster shows the connections between the three data domains in Text+ using a concrete example: the edition of letters by the German writer Jean Paul (Jean Paul Friedrich Richter, 1763–1825) of the Berlin-Brandenburg Academy of Sciences and Humanities (BBAW). The poster was accepted and presented at the DH2023 conference in Graz.
Published: 2023
Full Text: View/download PDF

47. Drafting Standards for Stylometry

Author: Juola, Patrick, Byszuk, Joanna, Scholger, Walter, Vogeler, Georg, Tasovac, Toma, Baillot, Anne, Raunig, Elisabeth, Scholger, Martina, Steiner, Elisabeth, Centre for Information Modelling, and Helling, Patrick
Subjects: Paper, DH applications, legal studies, attribution studies and stylometric analysis, and methods, public humanities collaborations and methods, Linguistics, Computer science, Pre-Conference Workshop and Tutorial, Law and legal studies, metadata standards, stylometry, Humanities computing, standards, FOS: Languages and literature, systems, natural language processing
Abstract: This workshop will discuss the field of stylometry and authorship attribution, and its application to the development of legal evidence. Workshop participants will be invited to participate in a discussion of the development of consensus standards to help improve reliability and to permit evaluation of proposed stylometric evidence.
Published: 2023
Full Text: View/download PDF

48. Pandemic Protagonists. Viral (Re)Actions in Pandemic and Corona Fictions. Conference Report

Author: Irouschek, Laura, Hobisch, Elisabeth, Obermayr, Julia, and Völkl, Yvonne
Subjects: Cultural Studies, Media, Pandemic, Literature, Literary Studies, Culture, Fiction, FOS: Languages and literature, Corona Fictions, Covid-19, FOS: Other humanities, Film
Abstract: The conference on “Pandemic Protagonists. Viral (Re)Actions in Pandemic and Corona Fictions” marked the culmination of a year of joint work on the eponymous edited volume Pandemic Protagonists published with transcript in April 2023 and represented another milestone in the research project Corona Fictions. On Viral Narratives in Times of Pandemics funded by the Austrian Science Fund FWF (P 34571-G). References: [1] Völkl, Yvonne; Obermayr, Julia; Hobisch, Elisabeth (eds.) (2023) Pandemic Protagonists. Viral (Re)Actions in Pandemic and Corona Fictions. Bielefeld: transcript. [2] Hobisch, Elisabeth; Völkl, Yvonne; Obermayr, Julia (2021-) "Corona Fictions Database", in: Zotero Group Library. URL: https://www.zotero.org/groups/4814225/corona_fictions_database/library, 2022-11-15. [3] Hobisch, Elisabeth; Völkl, Yvonne; Obermayr, Julia (2023) "Corona Fictions Database – Documentation of a Bibliographical Structuring System", in: Zenodo. DOI: 10.5281/zenodo.7529753
Published: 2023
Full Text: View/download PDF

49. Statistics for field-based linguistics: processing variation. Test data on Irpinian (Montella, Southern Italy) [Annexe]

Author: Wissner, Inka, and Roy, Alan
Subjects: Italian, Dialectology, Variational linguistics, Statistics, Linguistics, Probability testing, Student's t-test with Welch's extension, Barlow's method, Hypothesis testing, Sociolinguistics, FOS: Mathematics, FOS: Languages and literature, Bernoulli trial, Field studies
Abstract: This project material is to be used as an annexe to a book chapter: Wissner, Inka / Roy, Alan (in progress): “Statistics for field-based linguistics: processing variation”, in: Hummel et al. (eds.), Adverbials with preposition and adjective in Romance: field studies in present-day varieties of French, Italian, Portuguese, Romanian and Spanish, for De Gruyter, 34 pages. It provides detailed tables and graphs illustrating the procedure developed for data gathering and processing within the Third Way project on prepositional adverbials from Latin to Romance, a project led by Martin Hummel at the University of Graz (Austria) (https://adjective-adverb.uni-graz.at), financed by the Austrian Science Fund nr. P 30751-G30, 2018-2022. It has been tested with forms chosen randomly from data retrieved by team members Stefan Koch and Cesarina Vecchia in the Irpinian dialect in Campania, in the South of Italy (Montella): a bbacando ‘in vain’, pe ccerto ‘for sure’, and a llieggio ‘empty, empty-handed, without loading, not stuffed’, respectively numbered 27, 5 and 13 according to Wissner (in progress). Project realized with the collaboration of Third Way Team members (University of Graz), notably S. Koch (https://orcid.org/0000-0002-4486-1552) and C. Vecchia, contractor during the field studies in Southern Italy (Montella).
Published: 2023
Full Text: View/download PDF

50. KXP and MIL Functors: Proofs in Energy Number Synthesis

Author: Emmerson, Parker
Subjects: Transformations, Fake Number Expositions, Linguistics, Homological Algebra, Notational Language, Pre-numeric Energy, Synthesis, Differentiation of Numeric Energy, Game Theory, FOS: Languages and literature, Energy Number, Star Functor, Numeric Intelligence, Semiotics, Congruency Transforms
Abstract: This paper illustrates the KXP and MIL functors operating within the infinity balancing expression. The existence of a oneness at a balancing of infinity meanings is a living reality. I'm not debating that. There is a oneness at intersections, unions, equilibriums and other forms of logical, algebraic, phenomenological spontaneities. However, mathematics is also a language that notates differentiation between identifiable, significant phenomena. This theory of, "energy numbers," is a priori to the so-called, misnomered, "real numbers," and may or may not be philosophically mappable to the real numbers in a legitimate way. This area is an incipient area for linguistic research. These Energy Numbers are entanglements of quasi-quanta notations, which linguistically combine to deliver the expression of Energy Numbers from a semantic notation of oneness geometry, in fact. Furthermore, the oneness is evidentiary in the, "congruency method," for performing the integral to E, or... synonymously, F sub Lambda. See: Semantics in Tensor Calculus Applications to Set Theory: https://zenodo.org/record/7710307
Published: 2023
Full Text: View/download PDF

6,254 results on '"FOS: Languages and literature"'