AI-Based Handwritten Text Recognition and Smart Document Digitization System

Vedant Chirmade
Shantanu Chavan
Manthan Gaikwad
Rahul Kumar

Abstract

Handwritten text recognition remains a challenging problem in document intelligence because writing styles vary in stroke, spacing, and alignment. This paper presents an AI-Based Handwritten Text Recognition and Smart Document Digitization System that addresses these challenges with an end-to-end OCR pipeline. The system combines image preprocessing, CNN-based feature extraction, Bi-LSTM sequence modeling, and CTC-based decoding to recognize handwritten text without explicit segmentation. The proposed model is trained and evaluated on the IAM dataset using the standard metrics Character Error Rate (CER) and Word Error Rate (WER). Unlike conventional OCR systems, the proposed approach extends beyond recognition to a complete application pipeline, including a Tkinter-based GUI, a Flask API, and export to Word and PDF formats. Experimental results show a CER of 9.94% and a WER of 29.83%, outperforming standard CRNN baselines while maintaining computational efficiency. The study also highlights the significant impact of preprocessing on recognition performance and the importance of statistical evaluation in validating model effectiveness. The system provides a practical and deployable solution for real-world handwritten document digitization.
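
The decoding and evaluation steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`ctc_greedy_collapse`, `edit_distance`, `cer`, `wer`), the choice of greedy (best-path) decoding, and the blank index of 0 are assumptions made here for illustration. The sketch shows how per-frame CTC labels are collapsed into a label sequence, and how CER and WER are commonly computed as Levenshtein distance normalized by reference length.

```python
from typing import Hashable, List, Sequence


def ctc_greedy_collapse(frame_labels: Sequence[int], blank: int = 0) -> List[int]:
    """Collapse per-frame argmax labels: merge repeats, then drop blanks.

    The blank index (0) is an assumption; it depends on how the model was trained.
    """
    out: List[int] = []
    prev = None
    for label in frame_labels:
        # Emit a label only when it differs from the previous frame and is not blank.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


def edit_distance(ref: Sequence[Hashable], hyp: Sequence[Hashable]) -> int:
    """Levenshtein distance; insertion, deletion, and substitution each cost 1."""
    prev_row = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr_row = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr_row.append(min(prev_row[j] + 1,          # deletion
                                curr_row[j - 1] + 1,      # insertion
                                prev_row[j - 1] + cost))  # substitution or match
        prev_row = curr_row
    return prev_row[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edits divided by reference length."""
    return edit_distance(reference, hypothesis) / max(len(reference), 1)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / max(len(ref_words), 1)


if __name__ == "__main__":
    # A blank between two identical labels keeps them as separate characters.
    print(ctc_greedy_collapse([0, 1, 1, 0, 1, 2, 2, 0]))
    print(cer("hello", "hallo"))
    print(wer("the cat sat", "the cat sit"))
```

Because a single wrong character makes the whole word count as an error, WER is typically much higher than CER on the same output, which is consistent with the gap between the two figures reported above.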


 

Article Details

How to Cite
Chirmade, V., Chavan, S., Gaikwad, M., & Kumar, R. (2026). AI-Based Handwritten Text Recognition and Smart Document Digitization System. International Journal on Advanced Computer Theory and Engineering, 15(2S), 69–82. Retrieved from https://journals.mriindia.com/index.php/ijacte/article/view/2974
