This is the first comprehensive text on Optical Character Recognition for Indic scripts. It covers both recognition and retrieval of Indic documents and describes OCR systems for eight scripts: Bangla, Devanagari, Gurmukhi, Gujarati, Kannada, Malayalam, Tamil and Urdu.
Part I: Recognition of Indic Scripts
Building Data Sets for Indian Language OCR Research (C. V. Jawahar, Anand Kumar, A. Phaneendra and K. J. Jinesh)
On OCR of major Indian scripts: Bangla and Devanagari (B. B. Chaudhari)
A Complete Machine Printed Gurmukhi OCR System (Gurpreet Singh Lehal)
Progress in Gujarati Document Processing and Character Recognition (Jignesh Dholakia, Atul Negi and S. Rama Mohan)
Design of a bilingual Kannada-English OCR (R. S. Umesh, P. B. Pati and A. G. Ramakrishnan)
Recognition of Malayalam Documents (N. V. Neeba, Anoop Namboodiri, C. V. Jawahar and P. J. Narayanan)
A Complete OCR System for Tamil Magazine Documents (K. H. Aparna and V. S. Chakravarthy)
Experiments on Urdu Text Recognition (Omar Mukhtar, Srirangaraj Setlur and Venu Govindaraju)
The BBN Byblos Hindi OCR System (Prem Natarajan, Ehry MacRostie and Michael Decerbo)
Generalization of Hindi OCR using Adaptive Segmentation and Font Files (Mudit Agrawal, Huanfeng Ma and David Doermann)
Online Handwriting Recognition for Indic Scripts (A. Bharath and Sriganesh Madhvanath)

Part II: Retrieval of Indic Documents
Enhancing Access to Primary Cultural Heritage Materials of India (Peter M. Scharf and Malcolm Hyman)
Digital Image Enhancement of Indic Historical Manuscripts (Zhixin Shi, Srirangaraj Setlur and Venu Govindaraju)
GFG based Compression and Retrieval of Document Images in Indian Scripts (Gaurav Harit, Shantanu Chaudhary and Ritu Garg)
Word spotting for Indic documents to facilitate retrieval (Anurag Bhardwaj, Srirangaraj Setlur and Venu Govindaraju)
Indian Language Information Retrieval (Prasenjit Majumder and Mandar Mitra)
Optical Character Recognition (OCR) is a key enabling technology critical to creating indexed digital library content.