1. Tandem Assessment of Spoofing Countermeasures and Automatic Speaker Verification: Fundamentals
- Author
Kong Aik Lee, Massimiliano Todisco, Héctor Delgado, Xin Wang, Tomi Kinnunen, Ville Vestman, Nicholas Evans, Douglas A. Reynolds, Sahidullah, Junichi Yamagishi, Andreas Nautsch
- Affiliations
University of Eastern Finland; Eurecom [Sophia Antipolis]; NEC Corporation; National Institute of Informatics (NII); Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est, Institut National de Recherche en Informatique et en Automatique (Inria); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Université de Lorraine (UL), Centre National de la Recherche Scientifique (CNRS); MIT Lincoln Laboratory, Massachusetts Institute of Technology (MIT)
- Funding
This work has been sponsored by the Academy of Finland (proj. no. 309629), Japan Science and Technology (JST), and the Department of Defense under Air Force Contract FA8721-05-C-0002. Opinions, interpretations, conclusions and recommendations are those of the authors and are not necessarily endorsed by the United States Government. The work has also been partially funded by the EU H2020 research and innovation programme under the MSCA grant agreement No. 860813 (TReSPAsS-ETN), the ANR-DFG French-German joint project ANR-18-CE92-0024 (RESPECT), the ANR project ANR-19-DATA-0008 (HARPOCRATES), the ANR project ExTENSoR and the JST-ANR Japanese-French project VoicePersonae.
Grants: ANR-18-CE92-0024, RESPECT, reliable, secure and privacy-preserving multi-biometric person authentication (2018); ANR-19-DATA-0008, Harpocrates, open data, tools and challenges for voice anonymisation (2019)
- Subjects
Signal Processing (eess.SP); FOS: Computer and information sciences; Sound (cs.SD); Computer Science - Machine Learning; Spoofing attack; spoofing countermeasures; Acoustics and Ultrasonics; Computer science; Word error rate; Machine learning; Computer Science - Sound; Electronic mail; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; Machine Learning (cs.LG); 030507 speech-language pathology & audiology; 03 medical and health sciences; presentation attack detection; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [STAT.ML]Statistics [stat]/Machine Learning [stat.ML]; [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]; Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Computer Science (miscellaneous); Electrical Engineering and Systems Science - Signal Processing; Electrical and Electronic Engineering; Special case; Function (engineering); Reliability (statistics); [SPI.ACOU]Engineering Sciences [physics]/Acoustics [physics.class-ph]; [STAT.AP]Statistics [stat]/Applications [stat.AP]; [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; detection cost function; Speech processing; Computational Mathematics; Metric (unit); Artificial intelligence; 0305 other medical science; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; automatic speaker verification; Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Recent years have seen growing efforts to develop spoofing countermeasures (CMs) to protect automatic speaker verification (ASV) systems from being deceived by manipulated or artificial inputs. The reliability of spoofing CMs is typically gauged using the equal error rate (EER) metric. The primitive EER fails to reflect application requirements and the impact of spoofing and CMs upon ASV, and its use as a primary metric in traditional ASV research has long been abandoned in favour of risk-based approaches to assessment. This paper presents several new extensions to the tandem detection cost function (t-DCF), a recent risk-based approach to assess the reliability of spoofing CMs deployed in tandem with an ASV system. Extensions include a simplified version of the t-DCF with fewer parameters, an analysis of a special case for a fixed ASV system, simulations which give original insights into its interpretation, and new analyses using the ASVspoof 2019 database. It is hoped that adoption of the t-DCF for CM assessment will help to foster closer collaboration between the anti-spoofing and ASV research communities.
Published in IEEE/ACM Transactions on Audio, Speech, and Language Processing (doi updated)
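For readers unfamiliar with the EER mentioned in the abstract, the following is a minimal sketch of how it can be estimated from a set of countermeasure scores. It is not taken from the paper: the function names, the score values and the convention that higher scores mean "more likely bona fide" are all assumptions made for illustration. The sketch sweeps a decision threshold over the observed scores and reports the operating point where the miss rate and the false alarm rate are closest.

```python
def error_rates(bonafide, spoof, threshold):
    """Miss and false-alarm rates when scores >= threshold are accepted."""
    p_miss = sum(s < threshold for s in bonafide) / len(bonafide)
    p_fa = sum(s >= threshold for s in spoof) / len(spoof)
    return p_miss, p_fa


def equal_error_rate(bonafide, spoof):
    """Approximate the EER by sweeping the threshold over all observed scores."""
    candidates = sorted(set(bonafide) | set(spoof))
    candidates.append(candidates[-1] + 1.0)  # extra threshold that rejects everything
    best = None
    for t in candidates:
        p_miss, p_fa = error_rates(bonafide, spoof, t)
        gap = abs(p_miss - p_fa)
        if best is None or gap < best[0]:
            # Keep the point where the two error rates are closest;
            # report their average as the EER estimate.
            best = (gap, (p_miss + p_fa) / 2, t)
    return best[1], best[2]


# Hypothetical CM scores (assumed: higher = more likely bona fide).
bonafide_scores = [2.1, 1.7, 0.9, 0.4, 1.2]
spoof_scores = [-1.3, 0.1, 0.6, -0.4, 1.0]

eer, threshold = equal_error_rate(bonafide_scores, spoof_scores)
print(f"EER = {eer:.2f} at threshold {threshold}")
```

Note that the computation uses only the two score lists. No class priors, error costs or ASV behaviour enter it, which is the limitation the abstract points to when it argues for a risk-based metric such as the t-DCF.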
- Published
2020