Yanıkoğlu, Berrin and Kholmatov, Alisher Anatolyevich (2003) Turkish handwritten text recognition: a case of agglutinative languages. In: Conference on Document Recognition and Retrieval X, Santa Clara, California, USA
PDF
paper.pdf
Download (254kB)
Official URL: http://dx.doi.org/10.1117/12.476045
Abstract
We describe a system for recognizing unconstrained Turkish handwritten text. Turkish has agglutinative morphology, so a theoretically infinite number of words can be generated by adding further suffixes to a word. This makes lexicon-based recognition approaches, where the most likely word is selected from among all the alternatives in a lexicon, unsuitable for Turkish. We describe our approach to the problem using a Turkish prefix recognizer. First results of the system demonstrate the promise of this approach, with a top-10 word recognition rate of about 40% on a small test set of mixed handprint and cursive writing. The lexicon-based approach with a 17,000-word lexicon (with test words added) achieves a 56% top-10 word recognition rate.
Item Type: | Papers in Conference Proceedings |
---|---|
Uncontrolled Keywords: | handwriting recognition, OCR, Turkish, agglutinative |
Divisions: | Faculty of Engineering and Natural Sciences > Academic programs > Computer Science & Eng.; Faculty of Engineering and Natural Sciences |
Depositing User: | Berrin Yanıkoğlu |
Date Deposited: | 17 Feb 2012 10:59 |
Last Modified: | 26 Apr 2022 09:06 |
URI: | https://research.sabanciuniv.edu/id/eprint/18863 |