Exploring Advancements in Multilingual OCR Systems for Enhanced Document Analysis and Text Recognition

  • Kinjal Patel*
  • Deepak Parashar
  • Nilesh Bahadure
  • Bhoomi Shah
  • Rohit Kumar
  • Jagdish Chandra Patni

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Optical Character Recognition (OCR) systems provide robust software for searching words in scanned multilingual Indian documents, where manual searching is tedious and time-consuming. Such documents suffer from improper layout and low print quality, and they contain intermixed machine-printed and handwritten text. OCR detects text in images; where the text is not clearly visible, the system recovers the actual text and outputs it in readable form. The system improves text recognition accuracy and reduces the time needed to identify the original text. It combines several methods to perform these tasks, including a Convolutional Neural Network (CNN), Byte Pair Encoding (BPE), and a language model (LM). Experimental results demonstrate a significant advance in text recognition performance and scalability, and the system offers a comprehensive solution for multilingual OCR tasks.

Original language: English
Title of host publication: 2025 IEEE International Conference on Computer, Electronics, Electrical Engineering and their Applications, IC2E3 2025
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798331524395
Publication status: Published - 2025
Event: 2025 IEEE International Conference on Computer, Electronics, Electrical Engineering and their Applications, IC2E3 2025 - Srinagar Garhwal, India
Duration: 15-05-2025 → 16-05-2025

Publication series

Name: 2025 IEEE International Conference on Computer, Electronics, Electrical Engineering and their Applications, IC2E3 2025

Conference

Conference: 2025 IEEE International Conference on Computer, Electronics, Electrical Engineering and their Applications, IC2E3 2025
Country/Territory: India
City: Srinagar Garhwal
Period: 15-05-25 → 16-05-25

All Science Journal Classification (ASJC) codes

  • Information Systems and Management
  • Electrical and Electronic Engineering
  • Safety, Risk, Reliability and Quality
  • Hardware and Architecture
  • Artificial Intelligence
  • Computer Science Applications
