A Comparative Study of Deepfake Detection Using ResNeXt50_32x4d + LSTM, EfficientNet + GRU, and Xception + Transformer Encoder
T S Harikrishnan, Sebin Thomas, Sanjo Johny, Neha Simon, Jintu Ann John
Deepfake Detection, EfficientNet, GRU, LSTM, ResNeXt50_32x4d, Transformer Encoder, Xception
Deepfake videos have become a significant threat to digital media authentication: they alter video content convincingly enough to spread misinformation and enable manipulation. This paper presents a comparative study of three deepfake detection models: ResNeXt50_32x4d + LSTM, EfficientNet + GRU, and Xception + Transformer Encoder. The ResNeXt50_32x4d + LSTM model takes a hybrid spatial-temporal approach, using ResNeXt50_32x4d for spatial feature extraction and an LSTM for temporal feature modeling, which improves its ability to detect subtle manipulations across video frames. In contrast, EfficientNet + GRU emphasizes computational efficiency through a streamlined architecture, while Xception + Transformer Encoder uses attention mechanisms to analyze long-range dependencies in video sequences. In the experiments, ResNeXt50_32x4d + LSTM outperforms the other two models in accuracy, precision, recall, and computational efficiency. With transfer learning and a well-structured preprocessing pipeline, it captures intricate spatial patterns and subtle temporal inconsistencies across frames, which makes it robust in classifying both real and fake videos. It achieves an accuracy of 91.88%, a precision of 90.66%, and a recall of 85.60%, surpassing both EfficientNet + GRU and Xception + Transformer Encoder. The paper concludes by analyzing the advantages, limitations, and trade-offs of the three models, and suggests that ResNeXt50_32x4d + LSTM is the preferred choice for real-time deepfake detection because it balances accuracy against computational cost.
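For readers less familiar with the three reported metrics, the following is a minimal sketch of how accuracy, precision, and recall are conventionally computed for a binary real/fake classifier. It is not the authors' evaluation code. The function name `detection_metrics`, the label convention (1 = fake, 0 = real, with "fake" as the positive class), and the toy labels are all assumptions made for illustration; the toy labels are not data from the paper.

```python
def detection_metrics(y_true, y_pred):
    """Return accuracy, precision, and recall for binary labels.

    Assumed convention (not stated in the paper): 1 = fake (positive), 0 = real.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fakes flagged as fake
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # reals passed as real
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # reals flagged as fake
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fakes missed

    total = tp + tn + fp + fn
    return {
        # Share of all videos classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Of the videos flagged as fake, the share that really are fake.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Of the videos that are fake, the share that were flagged.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy, hand-made labels for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
metrics = detection_metrics(y_true, y_pred)
print(metrics)
```

Under this convention, a precision higher than recall, as in the figures reported above, would mean the model raises few false alarms on real videos but misses a larger share of fakes.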
"A Comparative Study of Deepfake Detection Using ResNeXt50_32x4d + LSTM, EfficientNet + GRU, and Xception + Transformer Encoder", IJSDR - International Journal of Scientific Development and Research (www.IJSDR.org), ISSN: 2455-2631, Vol. 10, Issue 3, page no. b306-b314, March-2025, Available: https://ijsdr.org/papers/IJSDR2503135.pdf
Volume 10
Issue 3
March-2025
Pages : b306-b314
Paper Reg. ID: IJSDR_300886
Published Paper Id: IJSDR2503135
Research Area: Science and Technology
Country: Kottayam, Kerala, India
ISSN: 2455-2631 | Impact Factor: 9.15 (calculated by Google Scholar) | Established: 2016
Publisher: IJSDR(IJ Publication) Janvi Wave