1. DFCon: Attention-Driven Supervised Contrastive Learning for Robust Deepfake Detection
- Author
- Shanto, MD Sadik Hossain; Dihan, Mahir Labib; Ghosh, Souvik; Anonto, Riad Ahmed; Chowdhury, Hafijul Hoque; Muhtasim, Abir; Ahsan, Rakib; Hassan, MD Tanvir; Sojib, MD Roqunuzzaman; Hakim, Sheikh Azizul; and Rahman, M. Saifur
- Subjects
Computer Science - Computer Vision and Pattern Recognition, Computer Science - Cryptography and Security, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Image and Video Processing, Electrical Engineering and Systems Science - Signal Processing
- Abstract
This report presents our approach for the IEEE SP Cup 2025: Deepfake Face Detection in the Wild (DFWild-Cup), which focuses on detecting deepfakes across diverse datasets. Our methodology employs three backbone models, MaxViT, CoAtNet, and EVA-02, each fine-tuned with a supervised contrastive loss to enhance feature separation. These models were chosen for their complementary strengths. The integration of convolution layers and strided attention in MaxViT is well suited to detecting local features. In contrast, the hybrid use of convolution and attention mechanisms in CoAtNet effectively captures multi-scale features, and the robust pretraining of EVA-02 with masked image modeling excels at capturing global features. After this contrastive training, we freeze the backbone parameters and train the classification heads. Finally, a majority voting ensemble combines the predictions of the three models, improving robustness and generalization to unseen scenarios. The proposed system addresses the challenges of detecting deepfakes in real-world conditions and achieves an accuracy of 95.83% on the validation dataset.
Comment: Technical report for IEEE Signal Processing Cup 2025, 7 pages
- Published
- 2025
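
The majority voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the details of the ensemble, so this assumes hard voting over per-model binary labels; the label encoding (0 for real, 1 for fake), the function name `majority_vote`, and the sample predictions are all hypothetical and are not taken from the report.

```python
from collections import Counter
from typing import Sequence

# Hypothetical label encoding, assumed for illustration only.
REAL, FAKE = 0, 1


def majority_vote(predictions: Sequence[Sequence[int]]) -> list[int]:
    """Combine per-model hard labels by majority vote.

    predictions[m][i] is the label that model m assigns to image i.
    With an odd number of models and binary labels, ties cannot occur.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")
    n_images = len(predictions[0])
    if any(len(p) != n_images for p in predictions):
        raise ValueError("all models must predict the same number of images")

    voted = []
    # zip(*predictions) yields, for each image, the tuple of all models' votes.
    for votes in zip(*predictions):
        label, _count = Counter(votes).most_common(1)[0]
        voted.append(label)
    return voted


# Made-up predictions from three models on four images.
maxvit_preds = [FAKE, REAL, FAKE, FAKE]
coatnet_preds = [FAKE, REAL, REAL, FAKE]
eva02_preds = [REAL, REAL, FAKE, FAKE]

ensemble = majority_vote([maxvit_preds, coatnet_preds, eva02_preds])
print(ensemble)  # [1, 0, 1, 1]
```

With three voters, a single model's mistake on an image (as on the first and third images above) is outvoted by the other two, which is the usual motivation for ensembling models with complementary strengths.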