Topic: supervised learning - Searchworks@Jio Institute Digital Library Search Results

1. CountNet: Estimating the Number of Concurrent Speakers Using Supervised Learning

Author: Bernd Edler, Emanuel A. P. Habets, Fabian-Robert Stöter, Soumitro Chakrabarty
Affiliations: Scientific Data Management (ZENITH), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS), Université de Montpellier (UM), Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria); International Audio Laboratories Erlangen (AUDIO LABS), Friedrich-Alexander Universität Erlangen-Nürnberg (FAU), Fraunhofer Institute for Integrated Circuits (Fraunhofer IIS), Fraunhofer-Gesellschaft
Acknowledgments: The authors gratefully acknowledge the compute resources and support provided by the Erlangen Regional Computing Center (RRZE). They would like to thank A. Liutkus for his constructive criticism of the paper.
Subjects: Speaker count estimation, Number of concurrent speakers, Reverberation, Cocktail-party, Overlap detection, Acoustics and Ultrasonics, Artificial neural network, Recurrent neural network, Computer science, Computer Science (miscellaneous), Speech recognition, Supervised learning, Probabilistic logic, Blind signal separation, Speaker diarisation, Point estimation, Speech-language pathology & audiology, Medical and health sciences, Other medical science, Computational Mathematics, Mathematics/Logic, Electrical and Electronic Engineering
Abstract: Estimating the maximum number of concurrent speakers from single-channel mixtures is a challenging problem and an essential first step to address various audio-based tasks such as blind source separation, speaker diarization, and audio surveillance. We propose a unifying probabilistic paradigm, where deep neural network architectures are used to infer output posterior distributions. These probabilities are in turn processed to yield discrete point estimates. Designing such architectures often involves two important and complementary aspects that we investigate and discuss. First, we study how recent advances in deep architectures may be exploited for the task of speaker count estimation. In particular, we show that convolutional recurrent neural networks outperform recurrent networks used in a previous study when adequate input features are used. Even for short segments of speech mixtures, we can estimate up to five speakers, with a significantly lower error than other methods. Second, through comprehensive evaluation, we compare the best-performing method to several baselines and examine the influence of gain variations, different data sets, and reverberation. The output of our proposed method is compared to human performance. Finally, we give insights into the strategy used by our proposed method.
Published: 2019
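
Note: The abstract says that the network's output posterior probabilities are processed to yield discrete point estimates, but it does not state which decision rule is used. As a minimal sketch, assuming the posterior is given as a list of probabilities indexed by speaker count (0 to 5), one common choice is the maximum a posteriori (MAP) estimate. The function names and the example probabilities below are hypothetical and are not taken from the paper.

```python
def map_count(posterior):
    """Return the count k with the highest posterior probability (MAP rule).

    `posterior[k]` is assumed to hold P(count = k | audio segment).
    Ties are resolved in favour of the smaller count.
    """
    if not posterior:
        raise ValueError("posterior must contain at least one probability")
    return max(range(len(posterior)), key=lambda k: posterior[k])


def expected_count(posterior):
    """Return the posterior mean, an alternative (non-integer) summary."""
    return sum(k * p for k, p in enumerate(posterior))


# Illustrative posterior over 0..5 concurrent speakers (made-up values).
posterior = [0.02, 0.08, 0.55, 0.25, 0.07, 0.03]

estimate = map_count(posterior)
mean = expected_count(posterior)
print(estimate, round(mean, 2))
```

The MAP rule always returns a valid integer count, whereas the posterior mean would need rounding before it could be reported as a number of speakers.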
