A Dual-Stage Approach to Deepfake Video Detection Employing ResNet and LSTM Networks

Nithish Kumar S.; Akhila S.

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 187 - Issue 42

Published: September 2025

10.5120/ijca2025925739

Nithish Kumar S., Akhila S. A Dual-Stage Approach to Deepfake Video Detection Employing ResNet and LSTM Networks. International Journal of Computer Applications. 187, 42 (September 2025), 46-53. DOI=10.5120/ijca2025925739

@article{ 10.5120/ijca2025925739,
  author    = { Nithish Kumar S. and Akhila S. },
  title     = { A Dual-Stage Approach to Deepfake Video Detection Employing ResNet and LSTM Networks },
  journal   = { International Journal of Computer Applications },
  year      = { 2025 },
  volume    = { 187 },
  number    = { 42 },
  pages     = { 46-53 },
  doi       = { 10.5120/ijca2025925739 },
  publisher = { Foundation of Computer Science (FCS), NY, USA }
}

%0 Journal Article
%D 2025
%A Nithish Kumar S.
%A Akhila S.
%T A Dual-Stage Approach to Deepfake Video Detection Employing ResNet and LSTM Networks
%J International Journal of Computer Applications
%V 187
%N 42
%P 46-53
%R 10.5120/ijca2025925739
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The rapid spread of deepfakes has emerged as one of the greatest threats to the authenticity of multimedia information. To address this issue, we propose an end-to-end detection framework that combines Residual Networks (ResNet) for spatial feature learning with a combination of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) for temporal sequence modeling. The ResNet module extracts rich facial and contextual features from each frame, and the LSTM-CNN module tracks temporal dynamics to capture unusual facial movements and expressions between frames. To enhance the model's ability to generalize, we apply transfer learning: pre-training on large datasets followed by fine-tuning on deepfake-specific datasets. Experiments on selected deepfake datasets show that the proposed framework outperforms other state-of-the-art methods in accuracy, precision, and recall. The results support the robustness of the framework and its applicability in real-world scenarios, contributing to multimedia forensics and to the fight against false digital propaganda.
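
The two-stage data flow described in the abstract can be illustrated with a toy sketch in plain Python. It is not the authors' implementation. A single fully connected residual block stands in for the ResNet, a plain LSTM cell stands in for the LSTM-CNN temporal module (the abstract does not specify its CNN part, so it is omitted), and bias terms are left out for brevity. All names, dimensions, and the random untrained weights are assumptions chosen for illustration, so the printed probability carries no meaning beyond showing that the pipeline runs.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def residual_block(x, w):
    """Stage 1 stand-in: y = x + relu(W x).

    The skip connection (adding x back) is what makes the block residual;
    a real ResNet uses stacked convolutions instead of one dense layer.
    """
    fx = [max(0.0, v) for v in matvec(w, x)]
    return [xi + fi for xi, fi in zip(x, fx)]


def lstm_step(x, h, c, params):
    """Stage 2: one standard LSTM cell update on the concatenation [x, h]."""
    xh = x + h  # list concatenation
    i = [sigmoid(v) for v in matvec(params["i"], xh)]    # input gate
    f = [sigmoid(v) for v in matvec(params["f"], xh)]    # forget gate
    o = [sigmoid(v) for v in matvec(params["o"], xh)]    # output gate
    g = [math.tanh(v) for v in matvec(params["g"], xh)]  # candidate state
    c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
    h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
    return h_new, c_new


def predict_fake_probability(frames, res_w, lstm_params, out_w):
    """Per-frame spatial features -> LSTM over the sequence -> sigmoid score."""
    hidden = len(out_w)
    h, c = [0.0] * hidden, [0.0] * hidden
    for frame in frames:
        feat = residual_block(frame, res_w)        # spatial stage
        h, c = lstm_step(feat, h, c, lstm_params)  # temporal stage
    return sigmoid(sum(w * hk for w, hk in zip(out_w, h)))


# Hypothetical sizes: 8-dim frame vectors, 4 hidden units, 5 frames per clip.
rng = random.Random(0)
FEAT, HIDDEN, NUM_FRAMES = 8, 4, 5
res_w = rand_matrix(FEAT, FEAT, rng)
lstm_params = {gate: rand_matrix(HIDDEN, FEAT + HIDDEN, rng) for gate in "ifog"}
out_w = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)]
frames = [[rng.random() for _ in range(FEAT)] for _ in range(NUM_FRAMES)]

prob = predict_fake_probability(frames, res_w, lstm_params, out_w)
print(f"P(fake) = {prob:.3f}")
```

In a practical setting the spatial stage would more plausibly be a pretrained ResNet backbone that is then fine-tuned, in line with the transfer learning the abstract mentions, and the weights would be learned from labeled real and fake clips rather than drawn at random.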

Index Terms

Computer Science

Information Sciences

Keywords

Deepfake, Residual Networks, Long Short-Term Memory, Convolutional Neural Network