The objective of the OCR system is to develop robust OCRs for printed Indian scripts that deliver the performance needed to convert legacy printed documents into an electronically accessible format. The system has been developed for Bangla, Devanagari, Gurumukhi, Kannada, Malayalam, Tamil, and Telugu, and it will soon be available for the Gujarati, Oriya, Tibetan, Assamese, Manipuri, and Urdu scripts.
Indian Language OCR, being a consortium-based project, takes a hybrid approach and is designed to work with platform- and technology-independent modules. The system has been developed to facilitate the digitization of multilingual text images, and its area of coverage is printed-text OCR. It is the outcome of an effort by consortium members sponsored by DeitY. The preprocessing modules, such as noise cleaning, skew detection, and binarization, have been developed by various consortium institutes. The language vertical tasks and integration have been carried out by various consortium members.
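
To make the binarization step mentioned above more concrete, here is a minimal sketch in Python. The algorithms used by the consortium's modules are not described here, so this example assumes Otsu's global thresholding, a common general-purpose technique, purely for illustration. The function names `otsu_threshold` and `binarize` are hypothetical and are not part of the OCR system.

```python
def otsu_threshold(pixels):
    """Return a grey level (0-255) that maximises between-class variance.

    Illustrative only: Otsu's method is an assumed stand-in, not the
    consortium's documented binarization algorithm.
    """
    # Build a 256-bin histogram of grey levels.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_dark = 0      # number of pixels at or below the candidate threshold
    sum_dark = 0    # sum of grey levels of those pixels
    for t in range(256):
        w_dark += hist[t]
        sum_dark += t * hist[t]
        if w_dark == 0:
            continue            # no dark pixels yet
        w_light = total - w_dark
        if w_light == 0:
            break               # every pixel is already in the dark class
        mean_dark = sum_dark / w_dark
        mean_light = (sum_all - sum_dark) / w_light
        # Between-class variance (up to a constant factor of 1 / total**2).
        var = w_dark * w_light * (mean_dark - mean_light) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(image, threshold):
    """Map a grey-scale image (list of rows) to 1 for ink, 0 for background."""
    return [[1 if p <= threshold else 0 for p in row] for row in image]


# A tiny hypothetical scan: dark text pixels on a light page.
image = [
    [10, 12, 200],
    [11, 210, 205],
]
flat = [p for row in image for p in row]
threshold = otsu_threshold(flat)
binary = binarize(image, threshold)
```

A global threshold like this works best on evenly lit scans. Degraded legacy documents often need locally adaptive thresholding instead, which is one reason noise cleaning and binarization are usually treated as separate, replaceable preprocessing modules.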