1. Visualizing Classifier Adjacency Relations: A Case Study in Speaker Verification and Voice Anti-Spoofing
- Author
Xin Wang, Kong Aik Lee, Massimiliano Todisco, Junichi Yamagishi, Andreas Nautsch, Md. Sahidullah, Héctor Delgado, Tomi Kinnunen, Nicholas Evans
Affiliations: University of Eastern Finland; Eurecom [Sophia Antipolis]; Speech Modeling for Facilitating Oral-Based Communication (MULTISPEECH), Inria Nancy - Grand Est; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD), Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA), Institut National de Recherche en Informatique et en Automatique (Inria), Centre National de la Recherche Scientifique (CNRS), Université de Lorraine (UL); National Institute of Informatics (NII); Nuance Communications [Spain]; Institute for Infocomm Research (I2R)
Funding: This work was supported by a number of projects and funding sources: VoicePersonae, supported by the French Agence Nationale de la Recherche (ANR) and the Japan Science and Technology Agency (JST) with grant No. JPMJCR18A6; Academy of Finland (proj. 309629); Region Grand Est, France.
- Subjects
FOS: Computer and information sciences, Computer Science - Machine Learning, Sound (cs.SD), Computer science, multi-dimensional scaling, 02 engineering and technology, Classifier, Statistics - Applications, Computer Science - Sound, Machine Learning (cs.LG), Set (abstract data type), 030507 speech-language pathology & audiology, 03 medical and health sciences, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing, Audio and Speech Processing (eess.AS), Classifier (linguistics), 0202 electrical engineering, electronic engineering, information engineering, FOS: Electrical engineering, electronic engineering, information engineering, Applications (stat.AP), Representation (mathematics), [STAT.AP]Statistics [stat]/Applications [stat.AP], Receiver operating characteristic, business.industry, Visual comparison, 020206 networking & telecommunications, Pattern recognition, Mixture model, Automatic summarization, Adjacency list, Artificial intelligence, 0305 other medical science, business, Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Whether for summarizing results or analyzing classifier fusion, a means of comparing different classifiers can provide insight into their behaviour, (dis)similarity, or complementarity. We propose a simple method to derive a 2D representation from detection scores produced by an arbitrary set of binary classifiers in response to a common dataset. Based upon rank correlations, our method facilitates a visual comparison of classifiers with arbitrary scores and is closely related to receiver operating characteristic (ROC) and detection error trade-off (DET) analyses. While the approach is fully versatile and can be applied to any detection task, we demonstrate the method using scores produced by automatic speaker verification and voice anti-spoofing systems. The former are produced by a Gaussian mixture model system trained with VoxCeleb data, whereas the latter stem from submissions to the ASVspoof 2019 challenge.
Comment: Accepted to Interspeech 2021. Example code available at https://github.com/asvspoof-challenge/classifier-adjacency
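The general idea in the abstract (rank correlations between classifier scores, mapped to a 2D layout) can be illustrated with a minimal sketch. This is not the authors' implementation, which is available at the repository above. It assumes Kendall's tau as the rank correlation, a distance of (1 - tau) / 2, and classical multi-dimensional scaling for the 2D embedding. The function names `kendall_tau`, `rank_distance_matrix`, and `classical_mds` are hypothetical, and the scores are synthetic.

```python
import numpy as np

def kendall_tau(x, y):
    """Kendall rank correlation (tau-a) between two score vectors."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Sign of every pairwise difference; concordant pairs multiply to +1,
    # discordant pairs to -1, and the diagonal contributes 0.
    sx = np.sign(x[:, None] - x[None, :])
    sy = np.sign(y[:, None] - y[None, :])
    return float((sx * sy).sum() / (n * (n - 1)))

def rank_distance_matrix(scores):
    """Assumed distance (1 - tau) / 2 between every pair of classifiers."""
    k = len(scores)
    d = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            d[i, j] = d[j, i] = (1.0 - kendall_tau(scores[i], scores[j])) / 2.0
    return d

def classical_mds(d, dims=2):
    """Classical (Torgerson) MDS: embed a distance matrix in `dims` dimensions."""
    k = d.shape[0]
    centering = np.eye(k) - np.ones((k, k)) / k
    b = -0.5 * centering @ (d ** 2) @ centering  # double-centred Gram matrix
    eigvals, eigvecs = np.linalg.eigh(b)
    order = np.argsort(eigvals)[::-1][:dims]     # largest eigenvalues first
    return eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))

# Synthetic scores from four hypothetical classifiers on 20 common trials.
rng = np.random.default_rng(0)
base = rng.permutation(20).astype(float)
scores = np.stack([
    base,                                # classifier A
    base ** 3,                           # same ranking as A, different scale
    -base,                               # ranking opposite to A
    rng.permutation(20).astype(float),   # unrelated ranking
])

dist = rank_distance_matrix(scores)
coords = classical_mds(dist, dims=2)     # one 2D point per classifier
```

Because the distance depends only on score ranks, the first two classifiers land on the same point even though their score scales differ, while the classifier with the reversed ranking sits at the maximum distance of 1.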
- Published
- 2021