Abstract

As the growing volume of deepfake content poses an increasing threat to multimedia integrity, this paper proposes a robust deepfake detection approach based on a hybrid architecture. The proposed framework combines Residual Networks (ResNet) for spatial feature extraction with a Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) module for temporal dependency modeling. The ResNet component captures complex patterns in facial and contextual information, while the LSTM-CNN module tracks dynamic facial expressions and movements across frames. Transfer learning improves generalization: the model is pre-trained on a large dataset and then fine-tuned on deepfake data. Experimental evaluations on several deepfake datasets show superior accuracy, precision, and recall, demonstrating the hybrid architecture's effectiveness against increasingly advanced deepfake generation techniques. In conclusion, the proposed framework, which combines ResNet with an LSTM-CNN module, offers a promising solution that advances the state of the art in multimedia forensics while remaining resistant to adversarial attacks. By effectively fusing spatial and temporal information, this hybrid model has the potential to significantly improve the accuracy and reliability of deepfake detection systems in the face of emerging digital threats.
