About OCR System

A simplified, robust OCR software for printed Indian scripts that delivers reasonable performance for converting legacy printed documents into an electronically accessible format. The system is the outcome of an effort by consortium members sponsored by the Ministry of Electronics and Information Technology. The preprocessing modules, such as noise cleaning, skew detection, and binarization, have been developed by different consortium institutes. The language vertical tasks and integration have been carried out by various consortium members.

OCR Software Overview

List of features


The potential of e-Aksharayan is enormous, as it enables users to harness the power of computers to access printed documents in Indian languages and scripts.


The present version of e-Aksharayan supports the following major Indian languages: Hindi, Bangla, Malayalam, Gurmukhi, Tamil, Kannada, and Assamese.


It converts printed document images to editable text with recognition accuracy of up to 90-95% at the character level and 85-90% at the word level.


The current version of e-Aksharayan takes 45 to 60 seconds to process an A4-size document.
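
The character-level and word-level accuracy figures quoted above differ because a single wrong character makes its whole word count as wrong. The source does not state how accuracy is measured, so the following is only a minimal sketch under one common assumption: accuracy is taken as 1 minus the edit distance divided by the length of the reference text. The function names are hypothetical and are not part of e-Aksharayan.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def accuracy(ref, hyp):
    """Assumed metric: 1 - edit_distance / len(reference), floored at 0."""
    if not ref:
        return 1.0 if not hyp else 0.0
    return max(0.0, 1.0 - edit_distance(ref, hyp) / len(ref))


def char_accuracy(ref_text, hyp_text):
    # Assumption: a "character" is one Unicode code point. For Indian scripts
    # an evaluation might instead count aksharas or grapheme clusters.
    return accuracy(list(ref_text), list(hyp_text))


def word_accuracy(ref_text, hyp_text):
    # Assumption: words are separated by whitespace.
    return accuracy(ref_text.split(), hyp_text.split())


if __name__ == "__main__":
    reference = "the cat sat"
    recognized = "the cot sat"  # one wrong character
    print(f"character accuracy: {char_accuracy(reference, recognized):.3f}")
    print(f"word accuracy:      {word_accuracy(reference, recognized):.3f}")
```

In this toy example one substituted character out of eleven leaves character accuracy near 0.91, while word accuracy drops to about 0.67 because one of three words is now wrong.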


Input and output specifications

  • Works on the Windows operating system.
  • The software supports BMP, TIFF, and PNG image formats.
  • Supported output formats are RTF, TXT, and DOC.
  • Grayscale and black-and-white images can be given as input.
  • Image dimensions of up to 3500 × 3500 pixels.
  • Minimum supported scanning resolution: 300 dpi.
  • Maximum supported input skew: 15 degrees.
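
The input limits listed above can be checked before a scan is submitted. The following is a minimal sketch, not part of e-Aksharayan: the `ScanInfo` record, the `check_scan` function, and the mode names are hypothetical, and the sketch assumes the caller has already obtained the image properties by some other means. It also assumes the skew limit applies in either direction.

```python
from dataclasses import dataclass

# Limits taken from the specification list above.
SUPPORTED_FORMATS = {"bmp", "tiff", "png"}
SUPPORTED_MODES = {"grayscale", "bilevel"}  # hypothetical names for the two image types
MAX_SIDE_PX = 3500
MIN_DPI = 300
MAX_SKEW_DEG = 15.0


@dataclass
class ScanInfo:
    """Properties of a scanned page (hypothetical record for illustration)."""
    file_format: str
    mode: str
    width_px: int
    height_px: int
    dpi: int
    skew_deg: float


def check_scan(info):
    """Return a list of human-readable problems; an empty list means the scan fits the limits."""
    problems = []
    if info.file_format.lower() not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {info.file_format}")
    if info.mode.lower() not in SUPPORTED_MODES:
        problems.append(f"unsupported image type: {info.mode}")
    if info.width_px > MAX_SIDE_PX or info.height_px > MAX_SIDE_PX:
        problems.append(f"image too large: {info.width_px} x {info.height_px} px")
    if info.dpi < MIN_DPI:
        problems.append(f"resolution too low: {info.dpi} dpi")
    # Assumption: the 15-degree limit applies to skew in both directions.
    if abs(info.skew_deg) > MAX_SKEW_DEG:
        problems.append(f"skew too large: {info.skew_deg} degrees")
    return problems


if __name__ == "__main__":
    page = ScanInfo("PNG", "grayscale", 2400, 3300, 300, 2.0)
    print(check_scan(page) or "scan is within the stated limits")
```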

Download and try the free version! The system recognizes up to 5 pages at a time.

Contact us

Technology Development for Indian Languages (Room No 2072),

Ministry of Electronics & Information Technology, Electronics Niketan, 6, CGO Complex, New Delhi - 110 003