Design considerations for a large-scale image-based text search engine in historical manuscript collections
- Source :
- Information Technology, 58(2), 80-88. De Gruyter Oldenbourg
- Publication Year :
- 2016
Abstract
- This article gives an overview of the design considerations for “Monk”, a handwriting search engine based on pattern recognition and high-performance computing. To satisfy multiple and often conflicting technological requirements, the architecture relies heavily on high-performance computing, interactivity, and a POSIX file-access model for the scientific programmers. The resulting system can handle billions of image files, on the order of petabytes of storage capacity, under a single mount point. Monk has been operational since 2009.
- Subjects :
- General Computer Science
Database
Point (typography)
Computer science
Full text search
Petabyte
high-performance computing
02 engineering and technology
computer.file_format
computer.software_genre
crowd sourcing
search engines
Search engine
Interactivity
continuous learning
POSIX
020204 information systems
Pattern recognition (psychology)
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Image file formats
handwriting image retrieval
computer
Details
- Language :
- English
- ISSN :
- 2196-7032
- Volume :
- 58
- Issue :
- 2
- Database :
- OpenAIRE
- Journal :
- Information Technology
- Accession number :
- edsair.doi.dedup.....0d18ba1de4231fb36d9d56f4e5766694
- Full Text :
- https://doi.org/10.1515/itit-2015-0049