1. Anomaly Detection of Deepfake Audio Based on Real Audio Using Generative Adversarial Network Model
- Author
Daeun Song, Nayoung Lee, Jiwon Kim, and Eunjung Choi
- Subjects
Anomaly detection, Deepfake, Deepfake audio, deep learning, f-AnoGAN, feature extraction, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Deepfake audio causes damage not only to individuals and companies but also to nations; therefore, research on deepfake audio detection technology is crucial. Most existing deepfake audio detection research relies on supervised learning; however, when a new type of synthetic deepfake audio emerges, real-time detection becomes difficult because of the limitations of supervised learning. This paper therefore proposes a new anomaly detection technique for identifying deepfake audio using unsupervised learning. The method learns the feature distribution of numerous real human voices and then calculates an anomaly score for each voice to determine whether it is a deepfake. In this study, we converted speech into images using the mel-spectrogram and mel-frequency cepstral coefficients (MFCC), which are speech preprocessing methods. Subsequently, the parameters of the GANomaly and f-AnoGAN models, which are effective in detecting abnormalities in speech, were tuned, and the models were trained in an unsupervised manner. The best result, an F1-score of 0.93, was obtained by converting speech to images with the mel-spectrogram and training with the GANomaly model.
- Published
- 2024
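The anomaly-scoring idea in the abstract (learn from real voices only, score each input, flag high scores as deepfake) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain mean-squared reconstruction error as a stand-in for the GANomaly or f-AnoGAN anomaly score, and a percentile-based threshold fitted on real-audio scores. The function names `anomaly_score`, `fit_threshold`, and `f1_score`, and the default percentile, are hypothetical choices made for illustration.

```python
import numpy as np


def anomaly_score(x, x_hat):
    """Per-sample mean squared error between inputs and reconstructions.

    Assumption: a simple stand-in for a GAN-based anomaly score.
    x and x_hat have shape (n_samples, ...), e.g. mel-spectrogram images.
    """
    x = np.asarray(x, dtype=float)
    x_hat = np.asarray(x_hat, dtype=float)
    # Average over every axis except the first (sample) axis.
    return np.mean((x - x_hat) ** 2, axis=tuple(range(1, x.ndim)))


def fit_threshold(real_scores, percentile=95.0):
    """Choose a cutoff from scores of real audio only (assumed rule)."""
    return float(np.percentile(real_scores, percentile))


def f1_score(y_true, y_pred):
    """F1 for the positive (deepfake) class: 2*TP / (2*TP + FP + FN)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0
```

In use, the threshold would be fitted on scores from held-out real voices, and any test sample scoring above it would be labeled deepfake before computing the F1-score against ground-truth labels.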