Stylometric Author Identification via CNN-BiLSTM Architecture on Syntactic Text Patterns

Shaik Mulli Shabeer; Koppula Lakshmi Keerthi; Dhulipala Prem Aditya; Karumanchi Priyanka; Lambu Abhinay

doi:10.65521/ijacect.v14i1.165


Published: Apr 14, 2025


Keywords:

Text Style Analysis, Reuters-50-50 Dataset, Syntactic Features, Text Classification, CNN-BiLSTM, Deep Learning, Authorship Attribution

Shaik Mulli Shabeer

Associate Professor, Department of Computer Science & Engineering, Chalapathi Institute of Engineering and Technology, LAM, Guntur, AP, India

Koppula Lakshmi Keerthi

Department of Computer Science and Engineering, Chalapathi Institute of Engineering and Technology, LAM, Guntur, AP, India

Dhulipala Prem Aditya

Department of Computer Science and Engineering, Chalapathi Institute of Engineering and Technology, LAM, Guntur, AP, India

Karumanchi Priyanka

Department of Computer Science and Engineering, Chalapathi Institute of Engineering and Technology, LAM, Guntur, AP, India

Lambu Abhinay

Department of Computer Science and Engineering, Chalapathi Institute of Engineering and Technology, LAM, Guntur, AP, India

Abstract

Authorship attribution is a natural language processing task that identifies the author of a given text from its writing style, linguistic patterns, and structural features. This research presents a deep learning approach that combines Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (BiLSTM) networks to attribute authorship. Using the Reuters-50-50 dataset, we extract syntactic and structural information such as part-of-speech tags, punctuation frequency, and average sentence length, which help capture the stylistic traits that distinguish individual authors. The text is cleaned, transformed into numerical vectors, and used to train the proposed model. Experimental results show that the hybrid CNN-BiLSTM architecture achieves 96% accuracy in identifying authors from unseen text samples. The model also performs well on precision, recall, and F1-score, indicating that it is robust and effective at capturing deep textual patterns. This work contributes to authorship verification, plagiarism detection, and digital forensics by offering a scalable and reliable approach to text-based author identification.
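
To make the structural features named in the abstract concrete, the following is a minimal illustrative sketch of computing punctuation frequency and average sentence length for one document. It is not the authors' implementation: the function name `stylometric_features`, the regex-based sentence splitting, the choice of punctuation marks, and the per-character normalization are all assumptions made for illustration. Part-of-speech tags are left out because they require an external tagger.

```python
import re
import string
from collections import Counter


def stylometric_features(text: str) -> dict:
    """Compute simple surface-level style features for one document.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Naive sentence split: break on whitespace that follows '.', '!' or '?'.
    # (Assumption: a real pipeline would likely use a proper sentence tokenizer.)
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    # Words are runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)

    # Count every ASCII punctuation character in the text.
    punct_counts = Counter(ch for ch in text if ch in string.punctuation)
    n_chars = max(len(text), 1)  # guard against empty input

    features = {
        # Average number of words per sentence.
        "avg_sentence_length": len(words) / max(len(sentences), 1),
    }
    # Frequency of a few common marks, normalized by document length in characters.
    for mark in ",.;:!?":
        features[f"punct_freq_{mark}"] = punct_counts[mark] / n_chars
    return features


if __name__ == "__main__":
    sample = "Stocks rose today. Analysts, however, were cautious!"
    for name, value in stylometric_features(sample).items():
        print(f"{name}: {value:.4f}")
```

Feature vectors of this kind could be concatenated with, or fed alongside, the token-level input to a classifier; how the paper actually combines them with the CNN-BiLSTM model is not specified in the abstract.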

How to Cite

Shabeer, S. M., Keerthi, K. L., Aditya, D. P., Priyanka, K., & Abhinay, L. (2025). Stylometric Author Identification via CNN-BiLSTM Architecture on Syntactic Text Patterns. International Journal on Advanced Computer Engineering and Communication Technology, 14(1), 1–8. https://doi.org/10.65521/ijacect.v14i1.165

Issue

Vol. 14 No. 1 (2025)

Section

Articles
