AI Automatic Pronunciation Mistake Detector


Pratham Singh
Rishikesh Singh
Siddhi Sankhe
Narendra Prajapati
Manasi Churi

Abstract

Automated pronunciation evaluation plays a major role in computer-assisted language learning, most prominently for English. However, effective multilingual pronunciation assessment systems are not yet fully developed, particularly for Indic languages, which have complex scripts and phonetic systems. Most existing systems rely on word-level scoring or limited acoustic models, which restricts phoneme-level assessment and makes it hard to accommodate linguistic diversity. In addition, errors introduced by ASR systems reduce the overall accuracy of the scoring process. This paper proposes a framework for phoneme-level pronunciation assessment in English, Hindi, and Marathi. The system integrates the Whisper ASR model, word-level timestamp extraction, grapheme-to-phoneme conversion, Dynamic Time Warping (DTW) for robust word alignment, and phoneme-level Levenshtein distance scoring. A schwa deletion module is also included for the languages written in Devanagari (Hindi and Marathi), so that schwas do not distort pronunciation scores.
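
As an illustration of phoneme-level Levenshtein distance scoring, the following minimal sketch compares a reference phoneme sequence with a recognized one. The function names and the normalization of the distance into a 0 to 1 score are assumptions made for this example and are not taken from the paper.

```python
def levenshtein(ref, hyp):
    """Edit distance between two sequences of phoneme symbols."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference phonemes to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phoneme_score(ref, hyp):
    """Similarity in [0, 1]; the normalization is an assumption, not the paper's formula."""
    if not ref and not hyp:
        return 1.0
    return 1.0 - levenshtein(ref, hyp) / max(len(ref), len(hyp))
```

Under this assumed normalization, a single substituted vowel in a three-phoneme word such as /k æ t/ pronounced as /k ʌ t/ gives a score of about 0.67, while an exact match gives 1.0.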


The framework follows a modular three-tier architecture comprising a browser-based audio capture interface, a Flask-based REST API backend, and an extensible AI processing core built on interface-driven model abstractions. Audio signals are normalized and resampled before ASR inference to improve consistency across recording conditions, while DTW-based word alignment over a word distance matrix improves robustness to ASR variability. The experimental results show that word alignment remains stable under recognition noise and that phoneme-level scores discriminate consistently between words pronounced at different quality levels. Word-level categorization and IPA visualization provide actionable pronunciation feedback for multilingual learning scenarios.
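
The DTW-based word alignment over a word distance matrix can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the distance used here (character-level edit distance normalized by the longer word) and the function names are assumptions for the example.

```python
def word_distance(a, b):
    """Assumed distance: character edit distance divided by the longer word's length."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = curr
    longest = max(len(a), len(b))
    return prev[-1] / longest if longest else 0.0


def dtw_align(ref_words, hyp_words):
    """Return the DTW warping path as (ref_index, hyp_index) pairs."""
    n, m = len(ref_words), len(hyp_words)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning the first i and j words
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = word_distance(ref_words[i - 1], hyp_words[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path
```

For example, if the recognizer repeats a word, `dtw_align(["the", "quick"], ["the", "the", "quick"])` maps both recognized copies of "the" to the single reference word instead of shifting every later word out of position, which is the kind of robustness to recognition noise the alignment step is meant to provide.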

Article Details

How to Cite
Singh, P., Singh, R., Sankhe, S., Prajapati, N., & Churi, M. (2026). AI Automatic Pronunciation Mistake Detector. International Journal on Advanced Computer Theory and Engineering, 15(1), 110–124. Retrieved from https://journals.mriindia.com/index.php/ijacte/article/view/2626
Section
Articles
