Construction of language models for an handwritten mail reading system
- Source :
- DRR, Document Recognition and Retrieval XIX, IS&T/SPIE 24th Annual Symposium on Electronic Imaging-Document Recognition and Retrieval XIX, Jan 2012, San Francisco, United States. ⟨10.1117/12.911965⟩
- Publication Year :
- 2012
- Publisher :
- SPIE, 2012.
Abstract
- This paper presents a system for recognizing unconstrained handwritten mail. Its core is an HMM recognizer that uses trigraphs to model contextual information. The recognizer requires no segmentation into words or characters and works directly at the line level. To exploit linguistic information and improve performance, a language model is introduced; it is based on bigrams and built solely from the transcriptions of the training documents. Experiments were conducted with various vocabulary sizes and language models. Word Error Rate and Perplexity values are compared to show the benefit of language models tailored to the handwritten mail recognition task.
- Subjects :
- language modeling
Vocabulary
Perplexity
Computer science
Bigram
Speech recognition
Word error rate
Intelligent word recognition
Offline Handwriting recognition
Rule-based machine translation
Transcription (linguistics)
handwritten mail
Hidden Markov model
Hidden Markov Models
text-line recognition
n-gram
[INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
ComputingMethodologies_PATTERNRECOGNITION
Cache language model
Artificial intelligence
Language model
Natural language processing
Details
- ISSN :
- 0277-786X
- Database :
- OpenAIRE
- Journal :
- SPIE Proceedings
- Accession number :
- edsair.doi.dedup.....f0a3778ba009fc856061d9c4bbdfa474
- Full Text :
- https://doi.org/10.1117/12.911965