Separating printed text blocks from non-text areas, such as signatures, handwritten text, logos and other symbols, is a necessary first step for an OCR system that recognizes printed text. In the present work, we compare the efficacy of several feature-classifier combinations for this separation task. We take the length-normalized horizontal projection profile (HPP) as the starting point, on the assumption that printed text blocks contain lines of text that generate HPPs with some regularity. This assumption is demonstrated to be valid.
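As a rough illustration of the HPP idea, the sketch below computes a horizontal projection profile for a small binary block. It is not the authors' implementation: the function name `horizontal_projection_profile`, the list-of-rows input format, and the toy data are all hypothetical, and dividing each row's ink count by the block width is only one assumed reading of "length-normalized", which the abstract does not define.

```python
def horizontal_projection_profile(block):
    """Return the per-row fraction of foreground pixels of a binary block.

    `block` is a list of equal-length rows, with 1 = ink and 0 = background.
    Dividing by the row width is an ASSUMED reading of "length-normalized";
    the abstract does not specify the normalization.
    """
    if not block:
        return []
    width = len(block[0])
    if width == 0 or any(len(row) != width for row in block):
        raise ValueError("block must be a rectangular grid with non-empty rows")
    return [sum(row) / width for row in block]


# Hypothetical toy block: two "text lines" separated by blank rows.
toy_block = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
profile = horizontal_projection_profile(toy_block)
```

In this toy case the profile alternates between zero rows and high-valued rows, which is the kind of regularity the abstract attributes to printed text lines. A signature or logo would typically not produce such a periodic pattern, and features derived from the profile could then be passed to a classifier.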
Added on September 8, 2017
Contributed by : OCR Consortium
Product Type : Research Paper
Author : K. R. Arvind, Peeta Basa Pati, A. G. Ramakrishnan