Abstract
The paper describes a neural network-based script identification system that can be used in the machine reading of documents written in English, Hindi and Kannada scripts. Script identification is a basic requirement for automated document processing in multi-script, multi-lingual environments. The system includes a feature extractor and a modular neural network. The feature extractor consists of two stages. In the first stage, the document image is dilated using 3 x 3 masks in the horizontal, vertical, right diagonal, and left diagonal directions. In the second stage, the average pixel distribution is computed for each of the resulting images. The modular network combines separately trained feedforward neural network classifiers, one for each script. This system recognizes 64 x 64 pixel document images. At the next level, the system is modified to work on single-word document images in the same three scripts. The modified system includes a pre-processor, a modified feature extractor and a probabilistic neural network classifier. The pre-processor segments the multi-script, multi-lingual document into individual words. The feature extractor receives these word images of variable size and still produces the discriminative features used by the probabilistic neural network classifier. Experiments are conducted on a manually developed database of document images of size 64 x 64 pixels and on a database of individual words in the three scripts. The results are encouraging and support the effectiveness of the approach.
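The two-stage feature extractor and the probabilistic classifier described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract does not give the details: the mask shapes in `MASKS`, the reading of "average pixel distribution" as the fraction of foreground pixels after dilation, the function names, and the smoothing parameter `sigma` are all hypothetical. The classifier is a generic Parzen-window probabilistic neural network, not necessarily the authors' configuration.

```python
import math

# Hypothetical 3 x 3 directional masks, written as (row, col) offsets from
# the centre pixel. The paper's actual mask values are not in the abstract.
MASKS = {
    "horizontal": [(0, -1), (0, 0), (0, 1)],
    "vertical": [(-1, 0), (0, 0), (1, 0)],
    "right_diagonal": [(1, -1), (0, 0), (-1, 1)],
    "left_diagonal": [(-1, -1), (0, 0), (1, 1)],
}


def dilate(image, offsets):
    """Binary dilation: an output pixel is 1 if any pixel under the mask is 1."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and image[rr][cc]:
                    out[r][c] = 1
                    break
    return out


def extract_features(image):
    """One feature per direction: fraction of foreground pixels after dilation.

    Dividing by the image area keeps the feature vector the same length for
    any image size (an assumption about how variable-size words are handled).
    """
    area = len(image) * len(image[0])
    return [sum(map(sum, dilate(image, offs))) / area for offs in MASKS.values()]


def pnn_classify(features, training_set, sigma=0.1):
    """Generic probabilistic neural network (Parzen-window) decision.

    training_set maps a script label to a list of feature vectors. Each class
    score is the mean Gaussian kernel response over that class's examples.
    """
    scores = {}
    for label, examples in training_set.items():
        total = 0.0
        for example in examples:
            dist_sq = sum((a - b) ** 2 for a, b in zip(features, example))
            total += math.exp(-dist_sq / (2 * sigma ** 2))
        scores[label] = total / len(examples)
    return max(scores, key=scores.get)
```

In this sketch a horizontal stroke grows more under the vertical mask than under the horizontal one, so scripts with different dominant stroke directions yield different feature vectors. A call such as `pnn_classify(extract_features(word_image), training_set)` would then return the label whose stored examples lie closest in feature space, where `word_image` is a binary image given as a list of rows.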
| Original language | English |
|---|---|
| Pages (from-to) | 83-97 |
| Number of pages | 15 |
| Journal | Sadhana - Academy Proceedings in Engineering Sciences |
| Volume | 27 |
| Issue number | PART 1 |
| Publication status | Published - 01-12-2002 |
All Science Journal Classification (ASJC) codes
- General
Title: Neural network based system for script identification in Indian documents