1. Workshop HTR-United: metadata, quality control and sharing process for HTR training data
- Author
Clérice, Thibault; Chagué, Alix; Scholger, Walter; Vogeler, Georg; Tasovac, Toma; Baillot, Anne; Raunig, Elisabeth; Scholger, Martina; Steiner, Elisabeth; Helling, Patrick
Affiliations: Centre for Information Modelling; Automatic Language Modelling and ANAlysis & Computational Humanities (ALMAnaCH); Inria de Paris; Institut National de Recherche en Informatique et en Automatique (Inria); Centre Jean Mabillon (CJM); École nationale des chartes (ENC); Université Paris sciences et lettres (PSL); École Pratique des Hautes Études (EPHE); Université de Montréal (UdeM); Alliance of Digital Humanities Organizations; University of Graz
- Subjects
Paper, standardization, and methods, History, Handwritten Text Recognition, datasets, optical character recognition and handwriting recognition, Computer science, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [SHS]Humanities and Social Sciences, Pre-Conference Workshop and Tutorial, data publishing projects, metadata standards, Literary studies, systems, Philology, artificial intelligence and machine learning, ground truth
- Abstract
This workshop uses the environment created around the HTR-United catalog to demonstrate and discuss how to build a dataset of ground truth for text recognition and document it, and how to use HTR-United and its suite of tools to control its quality and describe it in a standardized way.
- Published
2023