Spatio-Temporal Deepfake Detection: A CNN-RNN Hybrid Approach for Image and Video Forgery Identification

Vidhina Bansod
Rudra Lagde
Devesh Khachane

Abstract

The proliferation of AI-generated synthetic media, commonly known as deepfakes, poses a grave threat to digital security, public trust, and information integrity. Although numerous detection frameworks exist, many suffer from limited generalizability, a lack of interpretability, and poor performance against high-quality forgeries. This paper presents a spatio-temporal deepfake detection system that integrates Convolutional Neural Networks (CNNs) for spatial artifact extraction with Recurrent Neural Networks (RNNs) using Long Short-Term Memory (LSTM) units for temporal inconsistency modeling. The system employs MTCNN-based face extraction followed by VGG16/VGG19 backbone networks for per-frame analysis, and uses the Deepfake Detection Challenge (DFDC) dataset with careful class balancing to overcome dataset bias and fake-accuracy pitfalls. Experimental results demonstrate that the hybrid spatio-temporal approach significantly outperforms single-modality baselines, achieving robust detection across varying lighting conditions, compression levels, and face resolutions. The proposed framework lays the foundation for a real-time multimodal deepfake authentication system that integrates audio-visual synchronization analysis.
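
The class-balancing point can be illustrated with a small, standard-library-only sketch. It is not the authors' pipeline: the 80/20 label split, the `undersample` helper, and the "always predict fake" baseline are all hypothetical, chosen only to show one plausible reading of the fake-accuracy pitfall, namely that a trivial classifier can look accurate on imbalanced data and drops to chance once the classes are balanced.

```python
import random
from collections import Counter


def undersample(labels, seed=0):
    """Return sorted indices of a class-balanced subset (random undersampling)."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    # Every class is cut down to the size of the smallest class.
    n_keep = min(len(indices) for indices in by_class.values())
    kept = []
    for indices in by_class.values():
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels (1 = fake, 0 = real); not taken from DFDC.
labels = [1] * 80 + [0] * 20

# A degenerate "detector" that labels every clip as fake.
raw_acc = accuracy(labels, [1] * len(labels))

# The same degenerate detector evaluated on a balanced subset.
balanced_idx = undersample(labels)
balanced_labels = [labels[i] for i in balanced_idx]
balanced_acc = accuracy(balanced_labels, [1] * len(balanced_labels))

print(raw_acc, balanced_acc, Counter(balanced_labels))
```

On the imbalanced toy set the degenerate detector scores 0.8, while on the balanced subset it scores 0.5, which is why accuracy figures are only meaningful once the class ratio is controlled or a balance-aware metric is reported.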

Article Details

How to Cite
Bansod, V., Lagde, R., & Khachane, D. (2026). Spatio-Temporal Deepfake Detection: A CNN-RNN Hybrid Approach for Image and Video Forgery Identification. International Journal on Advanced Computer Theory and Engineering, 15(2S), 145–151. Retrieved from https://journals.mriindia.com/index.php/ijacte/article/view/2984
Section
Articles
