Author: "Diez, Mireia" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Diez, Mireia"' showing total 89 results


1. Joint Training of Speaker Embedding Extractor, Speech and Overlap Detection for Diarization

Author: Pálka, Petr, Landini, Federico, Klement, Dominik, Diez, Mireia, Silnova, Anna, Delcroix, Marc, and Burget, Lukáš
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: In spite of the popularity of end-to-end diarization systems nowadays, modular systems comprised of voice activity detection (VAD), speaker embedding extraction plus clustering, and overlapped speech detection (OSD) plus handling still attain competitive performance in many conditions. However, one of the main drawbacks of modular systems is the need to run (and train) different modules independently. In this work, we propose an approach to jointly train a model to produce speaker embeddings, VAD and OSD simultaneously and reach competitive performance at a fraction of the inference time of a standard approach. Furthermore, the joint inference leads to a simplified overall pipeline which brings us one step closer to a unified clustering-based method that can be trained end-to-end towards a diarization-specific objective.
Published: 2024

2. Leveraging Self-Supervised Learning for Speaker Diarization

Author: Han, Jiangyu, Landini, Federico, Rohdin, Johan, Silnova, Anna, Diez, Mireia, and Burget, Lukas
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: End-to-end neural diarization has evolved considerably over the past few years, but data scarcity is still a major obstacle for further improvements. Self-supervised learning methods such as WavLM have shown promising performance on several downstream tasks, but their application to speaker diarization is somewhat limited. In this work, we explore using WavLM to alleviate the problem of data scarcity for neural diarization training. We use the same pipeline as Pyannote and improve the local end-to-end neural diarization with WavLM and Conformer. Experiments on far-field AMI, AISHELL-4, and AliMeeting datasets show that our method substantially outperforms the Pyannote baseline and achieves new state-of-the-art results on AMI and AISHELL-4, respectively. In addition, by analyzing the system performance under different data quantity scenarios, we show that WavLM representations are much more robust against data scarcity than filterbank features, enabling less data-hungry training strategies. Furthermore, we found that simulated data, usually used to train end-to-end diarization models, does not help when using WavLM in our experiments. Additionally, we also evaluate our model on the recent CHiME8 NOTSOFAR-1 task where it achieves better performance than the Pyannote baseline. Our source code is publicly available at https://github.com/BUTSpeechFIT/DiariZen.
Comment: Submitted to ICASSP 2025; New results are updated but conclusions are exactly the same as the original one
Published: 2024

3. Spoof Diarization: 'What Spoofed When' in Partially Spoofed Audio

Author: Zhang, Lin, Wang, Xin, Cooper, Erica, Diez, Mireia, Landini, Federico, Evans, Nicholas, and Yamagishi, Junichi
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Computation and Language, Computer Science - Sound
Abstract: This paper defines Spoof Diarization as a novel task in the Partial Spoof (PS) scenario. It aims to determine what spoofed when, which includes not only locating spoof regions but also clustering them according to different spoofing methods. As a pioneering study in spoof diarization, we focus on defining the task, establishing evaluation metrics, and proposing a benchmark model, namely the Countermeasure-Condition Clustering (3C) model. Utilizing this model, we first explore how to effectively train countermeasures to support spoof diarization using three labeling schemes. We then utilize spoof localization predictions to enhance the diarization performance. This first study reveals the high complexity of the task, even in restricted scenarios where only a single speaker per audio file and an oracle number of spoofing methods are considered. Our code is available at https://github.com/nii-yamagishilab/PartialSpoof.
Comment: Accepted to Interspeech 2024
Published: 2024

4. Do End-to-End Neural Diarization Attractors Need to Encode Speaker Characteristic Information?

Author: Zhang, Lin, Stafylakis, Themos, Landini, Federico, Diez, Mireia, Silnova, Anna, and Burget, Lukáš
Subjects: Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: In this paper, we apply the variational information bottleneck approach to end-to-end neural diarization with encoder-decoder attractors (EEND-EDA). This allows us to investigate what information is essential for the model. EEND-EDA utilizes attractors, vector representations of speakers in a conversation. Our analysis shows that attractors do not necessarily have to contain speaker characteristic information. On the other hand, giving the attractors more freedom to allow them to encode some extra (possibly speaker-specific) information leads to small but consistent diarization performance improvements. Despite architectural differences in EEND systems, the notion of attractors and frame embeddings is common to most of them and not specific to EEND-EDA. We believe that the main conclusions of this work can apply to other variants of EEND. Thus, we hope this paper will be a valuable contribution to guide the community to make more informed decisions when designing new systems.
Comment: Accepted to Odyssey 2024. This arXiv version includes an appendix for more visualizations. Code: https://github.com/BUTSpeechFIT/EENDEDA_VIB
Published: 2024

5. DiaPer: End-to-End Neural Diarization with Perceiver-Based Attractors

Author: Landini, Federico, Diez, Mireia, Stafylakis, Themos, and Burget, Lukáš
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: Until recently, the field of speaker diarization was dominated by cascaded systems. Due to their limitations, mainly regarding overlapped speech and cumbersome pipelines, end-to-end models have gained great popularity lately. One of the most successful models is end-to-end neural diarization with encoder-decoder based attractors (EEND-EDA). In this work, we replace the EDA module with a Perceiver-based one and show its advantages over EEND-EDA; namely obtaining better performance on the largely studied Callhome dataset, finding the quantity of speakers in a conversation more accurately, and faster inference time. Furthermore, when exhaustively compared with other methods, our model, DiaPer, reaches remarkable performance with a very lightweight design. Besides, we perform comparisons with other works and a cascaded baseline across more than ten public wide-band datasets. Together with this publication, we release the code of DiaPer as well as models trained on public and free data.
Comment: Accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing
Published: 2023

6. Discriminative Training of VBx Diarization

Author: Klement, Dominik, Diez, Mireia, Landini, Federico, Burget, Lukáš, Silnova, Anna, Delcroix, Marc, and Tawara, Naohiro
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: Bayesian HMM clustering of x-vector sequences (VBx) has become a widely adopted diarization baseline model in publications and challenges. It uses an HMM to model speaker turns, a generatively trained probabilistic linear discriminant analysis (PLDA) for speaker distribution modeling, and Bayesian inference to estimate the assignment of x-vectors to speakers. This paper presents a new framework for updating the VBx parameters using discriminative training, which directly optimizes a predefined loss. We also propose a new loss that better correlates with the diarization error rate compared to binary cross-entropy, the default choice for diarization end-to-end systems. Proof-of-concept results across three datasets (AMI, CALLHOME, and DIHARD II) demonstrate the method's capability of automatically finding hyperparameters, achieving comparable performance to those found by extensive grid search, which typically requires additional hyperparameter behavior knowledge. Moreover, we show that discriminative fine-tuning of PLDA can further improve the model's performance. We release the source code with this publication.
Comment: Submitted to ICASSP 2024
Published: 2023

7. DiaCorrect: Error Correction Back-end For Speaker Diarization

Author: Han, Jiangyu, Landini, Federico, Rohdin, Johan, Diez, Mireia, Burget, Lukas, Cao, Yuhang, Lu, Heng, and Cernocky, Jan
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Computation and Language, Computer Science - Sound
Abstract: In this work, we propose an error correction framework, named DiaCorrect, to refine the output of a diarization system in a simple yet effective way. This method is inspired by error correction techniques in automatic speech recognition. Our model consists of two parallel convolutional encoders and a transform-based decoder. By exploiting the interactions between the input recording and the initial system's outputs, DiaCorrect can automatically correct the initial speaker activities to minimize the diarization errors. Experiments on 2-speaker telephony data show that the proposed DiaCorrect can effectively improve the initial model's results. Our source code is publicly available at https://github.com/BUTSpeechFIT/diacorrect.
Comment: Submitted to ICASSP 2024
Published: 2023

8. Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

Author: Delcroix, Marc, Tawara, Naohiro, Diez, Mireia, Landini, Federico, Silnova, Anna, Ogawa, Atsunori, Nakatani, Tomohiro, Burget, Lukas, and Araki, Shoko
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: Combining end-to-end neural speaker diarization (EEND) with vector clustering (VC), known as EEND-VC, has gained interest for leveraging the strengths of both methods. EEND-VC estimates activities and speaker embeddings for all speakers within an audio chunk and uses VC to associate these activities with speaker identities across different chunks. EEND-VC thus generates multiple streams of embeddings, one for each speaker in a chunk. We can cluster these embeddings using constrained agglomerative hierarchical clustering (cAHC), ensuring embeddings from the same chunk belong to different clusters. This paper introduces an alternative clustering approach, a multi-stream extension of the successful Bayesian HMM clustering of x-vectors (VBx), called MS-VBx. Experiments on three datasets demonstrate that MS-VBx outperforms cAHC in diarization and speaker counting performance.
Comment: Accepted at Interspeech 2023
Published: 2023

9. Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization

Author: Landini, Federico, Diez, Mireia, Lozano-Diez, Alicia, and Burget, Lukáš
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: End-to-end diarization presents an attractive alternative to standard cascaded diarization systems because a single system can handle all aspects of the task at once. Many flavors of end-to-end models have been proposed but all of them require (so far non-existing) large amounts of annotated data for training. The compromise solution consists in generating synthetic data and the recently proposed simulated conversations (SC) have shown remarkable improvements over the original simulated mixtures (SM). In this work, we create SC with multiple speakers per conversation and show that they allow for substantially better performance than SM, also reducing the dependence on a fine-tuning stage. We also create SC with wide-band public audio sources and present an analysis on several evaluation sets. Together with this publication, we release the recipes for generating such data and models trained on public sets as well as the implementation to efficiently handle multiple speakers per conversation and an auxiliary voice activity detection loss.
Comment: Accepted by ICASSP 2023
Published: 2022

10. From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization

Author: Landini, Federico, Lozano-Diez, Alicia, Diez, Mireia, and Burget, Lukáš
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: End-to-end neural diarization (EEND) is nowadays one of the most prominent research topics in speaker diarization. EEND presents an attractive alternative to standard cascaded diarization systems since a single system is trained at once to deal with the whole diarization problem. Several EEND variants and approaches are being proposed, however, all these models require large amounts of annotated data for training but available annotated data are scarce. Thus, EEND works have used mostly simulated mixtures for training. However, simulated mixtures do not resemble real conversations in many aspects. In this work we present an alternative method for creating synthetic conversations that resemble real ones by using statistics about distributions of pauses and overlaps estimated on genuine conversations. Furthermore, we analyze the effect of the source of the statistics, different augmentations and amounts of data. We demonstrate that our approach performs substantially better than the original one, while reducing the dependence on the fine-tuning stage. Experiments are carried out on 2-speaker telephone conversations of Callhome and DIHARD 3. Together with this publication, we release our implementations of EEND and the method for creating simulated conversations.
Comment: Accepted at Interspeech 2022
Published: 2022

11. Speaker adaptation for Wav2vec2 based dysarthric ASR

Author: Baskar, Murali Karthick, Herzig, Tim, Nguyen, Diana, Diez, Mireia, Polzehl, Tim, Burget, Lukáš, and Černocký, Jan "Honza"
Subjects: Computer Science - Sound, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Dysarthric speech recognition has posed major challenges due to lack of training data and heavy mismatch in speaker characteristics. Recent ASR systems have benefited from readily available pretrained models such as wav2vec2 to improve the recognition performance. Speaker adaptation using fMLLR and xvectors has provided major gains for dysarthric speech with very little adaptation data. However, integration of wav2vec2 with fMLLR features or xvectors during wav2vec2 finetuning is yet to be explored. In this work, we propose a simple adaptation network for fine-tuning wav2vec2 using fMLLR features. The adaptation network is also flexible to handle other speaker adaptive features such as xvectors. Experimental analysis shows steady improvements using our proposed approach across all impairment severity levels and attains 57.72% WER for high severity in the UASpeech dataset. We also performed experiments on a German dataset to substantiate the consistency of our proposed approach across diverse domains.
Comment: Submitted to INTERSPEECH 2022
Published: 2022

12. Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: theory, implementation and analysis on standard tasks

Author: Landini, Federico, Profant, Ján, Diez, Mireia, and Burget, Lukáš
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: The recently proposed VBx diarization method uses a Bayesian hidden Markov model to find speaker clusters in a sequence of x-vectors. In this work we perform an extensive comparison of performance of the VBx diarization with other approaches in the literature and we show that VBx achieves superior performance on three of the most popular datasets for evaluating diarization: CALLHOME, AMI and DIHARD II datasets. Further, we present for the first time the derivation and update formulae for the VBx model, focusing on the efficiency and simplicity of this model as compared to the previous and more complex BHMM model working on frame-by-frame standard Cepstral features. Together with this publication, we release the recipe for training the x-vector extractors used in our experiments on both wide and narrowband data, and the VBx recipes that attain state-of-the-art performance on all three datasets. Besides, we point out the lack of a standardized evaluation protocol for AMI dataset and we propose a new protocol for both Beamformed and Mix-Headset audios based on the official AMI partitions and transcriptions.
Comment: Submitted to Computer Speech and Language, Special Issue on Separation, Recognition, and Diarization of Conversational Speech
Published: 2020

13. Analysis of the BUT Diarization System for VoxConverse Challenge

Author: Landini, Federico, Glembek, Ondřej, Matějka, Pavel, Rohdin, Johan, Burget, Lukáš, Diez, Mireia, and Silnova, Anna
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: This paper describes the system developed by the BUT team for the fourth track of the VoxCeleb Speaker Recognition Challenge, focusing on diarization on the VoxConverse dataset. The system consists of signal pre-processing, voice activity detection, speaker embedding extraction, an initial agglomerative hierarchical clustering followed by diarization using a Bayesian hidden Markov model, a reclustering step based on per-speaker global embeddings and overlapped speech detection and handling. We provide comparisons for each of the steps and share the implementation of the most relevant modules of our system. Our system scored second in the challenge in terms of the primary metric (diarization error rate) and first according to the secondary metric (Jaccard error rate).
Comment: Accepted to ICASSP 2021
Published: 2020

14. BUT System for the Second DIHARD Speech Diarization Challenge

Author: Landini, Federico, Wang, Shuai, Diez, Mireia, Burget, Lukáš, Matějka, Pavel, Žmolíková, Kateřina, Mošner, Ladislav, Silnova, Anna, Plchot, Oldřich, Novotný, Ondřej, Zeinali, Hossein, and Rohdin, Johan
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This paper describes the winning systems developed by the BUT team for the four tracks of the Second DIHARD Speech Diarization Challenge. For tracks 1 and 2 the systems were mainly based on performing agglomerative hierarchical clustering (AHC) of x-vectors, followed by another x-vector clustering based on Bayes hidden Markov model and variational Bayes inference. We provide a comparison of the improvement given by each step and share the implementation of the core of the system. For tracks 3 and 4 with recordings from the Fifth CHiME Challenge, we explored different approaches for doing multi-channel diarization and our best performance was obtained when applying AHC on the fusion of per channel probabilistic linear discriminant analysis scores.
Published: 2020

15. BUT System Description for DIHARD Speech Diarization Challenge 2019

Author: Landini, Federico, Wang, Shuai, Diez, Mireia, Burget, Lukáš, Matějka, Pavel, Žmolíková, Kateřina, Mošner, Ladislav, Plchot, Oldřich, Novotný, Ondřej, Zeinali, Hossein, and Rohdin, Johan
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: This paper describes the systems developed by the BUT team for the four tracks of the second DIHARD speech diarization challenge. For tracks 1 and 2 the systems were based on performing agglomerative hierarchical clustering (AHC) over x-vectors, followed by the Bayesian Hidden Markov Model (HMM) with eigenvoice priors applied at x-vector level followed by the same approach applied at frame level. For tracks 3 and 4, the systems were based on performing AHC using x-vectors extracted on all channels.
Published: 2019

16. End-to-end DNN Based Speaker Recognition Inspired by i-vector and PLDA

Author: Rohdin, Johan, Silnova, Anna, Diez, Mireia, Plchot, Oldrich, Matejka, Pavel, and Burget, Lukas
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Sound
Abstract: Recently several end-to-end speaker verification systems based on deep neural networks (DNNs) have been proposed. These systems have been proven to be competitive for text-dependent tasks as well as for text-independent tasks with short utterances. However, for text-independent tasks with longer utterances, end-to-end systems are still outperformed by standard i-vector + PLDA systems. In this work, we develop an end-to-end speaker verification system that is initialized to mimic an i-vector + PLDA baseline. The system is then further trained in an end-to-end manner but regularized so that it does not deviate too far from the initial system. In this way we mitigate overfitting which normally limits the performance of end-to-end systems. The proposed system outperforms the i-vector + PLDA baseline on both long and short duration utterances.
Published: 2017

17. Bayesian HMM clustering of x-vector sequences (VBx) in speaker diarization: Theory, implementation and analysis on standard tasks

Author: Landini, Federico, Profant, Ján, Diez, Mireia, and Burget, Lukáš
Published: 2022
Full Text: View/download PDF

18. Discriminative Training of VBx Diarization

Author: Klement, Dominik, primary, Diez, Mireia, additional, Landini, Federico, additional, Burget, Lukáš, additional, Silnova, Anna, additional, Delcroix, Marc, additional, and Tawara, Naohiro, additional
Published: 2024
Full Text: View/download PDF

19. DiaCorrect: Error Correction Back-End for Speaker Diarization

Author: Han, Jiangyu, primary, Landini, Federico, additional, Rohdin, Johan, additional, Diez, Mireia, additional, Burget, Lukáš, additional, Cao, Yuhang, additional, Lu, Heng, additional, and Černocký, Jan, additional
Published: 2024
Full Text: View/download PDF

20. 13 years of speaker recognition research at BUT, with longitudinal analysis of NIST SRE

Author: Matějka, Pavel, Plchot, Oldřich, Glembek, Ondřej, Burget, Lukáš, Rohdin, Johan, Zeinali, Hossein, Mošner, Ladislav, Silnova, Anna, Novotný, Ondřej, Diez, Mireia, and Černocký, Jan “Honza”
Published: 2020
Full Text: View/download PDF

21. End-to-end DNN based text-independent speaker recognition for long and short utterances

Author: Rohdin, Johan, Silnova, Anna, Diez, Mireia, Plchot, Oldřich, Matějka, Pavel, Burget, Lukáš, and Glembek, Ondřej
Published: 2020
Full Text: View/download PDF

22. Multi-Stream Extension of Variational Bayesian HMM Clustering (MS-VBx) for Combined End-to-End and Vector Clustering-based Diarization

Author: Delcroix, Marc, primary, Tawara, Naohiro, additional, Diez, Mireia, additional, Landini, Federico, additional, Silnova, Anna, additional, Ogawa, Atsunori, additional, Nakatani, Tomohiro, additional, Burget, Lukáš, additional, and Araki, Shoko, additional
Published: 2023
Full Text: View/download PDF

23. Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization

Author: Landini, Federico, primary, Diez, Mireia, additional, Lozano-Diez, Alicia, additional, and Burget, Lukáš, additional
Published: 2023
Full Text: View/download PDF

24. KALAKA-3: a database for the assessment of spoken language recognition technology on YouTube audios

Author: Rodríguez-Fuentes, Luis Javier, Penagarikano, Mikel, Varona, Amparo, Diez, Mireia, and Bordel, Germán
Published: 2016

25. DiaPer: End-to-End Neural Diarization With Perceiver-Based Attractors

Author: Landini, Federico, Diez, Mireia, Stafylakis, Themos, and Burget, Lukas
Abstract: Until recently, the field of speaker diarization was dominated by cascaded systems. Due to their limitations, mainly regarding overlapped speech and cumbersome pipelines, end-to-end models have gained great popularity lately. One of the most successful models is end-to-end neural diarization with encoder-decoder based attractors (EEND-EDA). In this work, we replace the EDA module with a Perceiver-based one and show its advantages over EEND-EDA; namely obtaining better performance on the largely studied Callhome dataset, finding the quantity of speakers in a conversation more accurately, and faster inference time. Furthermore, when exhaustively compared with other methods, our model, DiaPer, reaches remarkable performance with a very lightweight design. Besides, we perform comparisons with other works and a cascaded baseline across more than ten public wide-band datasets. Together with this publication, we release the code of DiaPer as well as models trained on public and free data.
Published: 2024
Full Text: View/download PDF

26. On the Use of Dot Scoring for Speaker Diarization

Author: Diez, Mireia, Penagarikano, Mikel, Varona, Amparo, Rodriguez-Fuentes, Luis Javier, Bordel, German, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Nierstrasz, Oscar, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Sudan, Madhu, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Vardi, Moshe Y., Series editor, Weikum, Gerhard, Series editor, Vitrià, Jordi, editor, Sanches, João Miguel, editor, and Hernández, Mario, editor
Published: 2011
Full Text: View/download PDF

27. BCN2BRNO: ASR System Fusion for Albayzin 2022 Speech to Text Challenge

Author: Kocour, Martin, primary, Umesh, Jahnavi, additional, Karafiat, Martin, additional, Švec, Ján, additional, López, Fernando, additional, Luque, Jordi, additional, Beneš, Karel, additional, Diez, Mireia, additional, Szoke, Igor, additional, Veselý, Karel, additional, Burget, Lukáš, additional, and Černocký, Jan, additional
Published: 2022
Full Text: View/download PDF

28. From Simulated Mixtures to Simulated Conversations as Training Data for End-to-End Neural Diarization

Author: Landini, Federico, primary, Lozano-Diez, Alicia, additional, Diez, Mireia, additional, and Burget, Lukáš, additional
Published: 2022
Full Text: View/download PDF

29. Speaker adaptation for Wav2vec2 based dysarthric ASR

Author: Baskar, Murali Karthick, primary, Herzig, Tim, additional, Nguyen, Diana, additional, Diez, Mireia, additional, Polzehl, Tim, additional, Burget, Lukas, additional, and Černocký, Jan, additional
Published: 2022
Full Text: View/download PDF

30. Analysis of the BUT Diarization System for VoxConverse Challenge

Author: Landini, Federico, primary, Glembek, Ondrej, additional, Matejka, Pavel, additional, Rohdin, Johan, additional, Burget, Lukas, additional, Diez, Mireia, additional, and Silnova, Anna, additional
Published: 2021
Full Text: View/download PDF

31. On the Use of Dot Scoring for Speaker Diarization

Author: Diez, Mireia, primary, Penagarikano, Mikel, additional, Varona, Amparo, additional, Rodriguez-Fuentes, Luis Javier, additional, and Bordel, German, additional
Published: 2011
Full Text: View/download PDF

32. BUT System for the Second DIHARD Speech Diarization Challenge

Author: Landini, Federico, primary, Wang, Shuai, additional, Diez, Mireia, additional, Burget, Lukas, additional, Matejka, Pavel, additional, Zmolikova, Katerina, additional, Mosner, Ladislav, additional, Silnova, Anna, additional, Plchot, Oldrich, additional, Novotny, Ondrej, additional, Zeinali, Hossein, additional, and Rohdin, Johan, additional
Published: 2020
Full Text: View/download PDF

33. Optimizing Bayesian HMM Based X-Vector Clustering for the Second DIHARD Speech Diarization Challenge

Author: Diez, Mireia, primary, Burget, Lukas, additional, Landini, Federico, additional, Wang, Shuai, additional, and Cernocky, Honza, additional
Published: 2020
Full Text: View/download PDF

34. Analysis of Speaker Diarization Based on Bayesian HMM With Eigenvoice Priors

Author: Diez, Mireia, primary, Burget, Lukas, additional, Landini, Federico, additional, and Cernocky, Jan, additional
Published: 2020
Full Text: View/download PDF

35. Bayesian HMM Based x-Vector Clustering for Speaker Diarization

Author: Diez, Mireia, primary, Burget, Lukáš, additional, Wang, Shuai, additional, Rohdin, Johan, additional, and Černocký, Jan, additional
Published: 2019
Full Text: View/download PDF

36. BUT System for DIHARD Speech Diarization Challenge 2018

Author: Diez, Mireia, primary, Landini, Federico, additional, Burget, Lukáš, additional, Rohdin, Johan, additional, Silnova, Anna, additional, Žmolíková, Kateřina, additional, Novotný, Ondřej, additional, Veselý, Karel, additional, Glembek, Ondřej, additional, Plchot, Oldřich, additional, Mošner, Ladislav, additional, and Matějka, Pavel, additional
Published: 2018
Full Text: View/download PDF

37. Speaker Diarization based on Bayesian HMM with Eigenvoice Priors

Author: Diez, Mireia, primary, Burget, Lukas, additional, and Matejka, Pavel, additional
Published: 2018
Full Text: View/download PDF

38. Analysis of BUT-PT Submission for NIST LRE 2017

Author: Plchot, Oldřich, primary, Matějka, Pavel, additional, Novotný, Ondřej, additional, Cumani, Sandro, additional, Lozano-Diez, Alicia, additional, Slavíček, Josef, additional, Diez, Mireia, additional, Grézl, František, additional, Glembek, Ondřej, additional, Kamsali, Mounika, additional, Silnova, Anna, additional, Burget, Lukáš, additional, Ondel, Lucas, additional, Kesiraju, Santosh, additional, and Rohdin, Johan, additional
Published: 2018
Full Text: View/download PDF

39. End-to-End DNN Based Speaker Recognition Inspired by I-Vector and PLDA

Author: Rohdin, Johan, primary, Silnova, Anna, additional, Diez, Mireia, additional, Plchot, Oldrich, additional, Matejka, Pavel, additional, and Burget, Lukas, additional
Published: 2018
Full Text: View/download PDF

40. MGB-3 BUT system: Low-resource ASR on Egyptian YouTube data

Author: Vesely, Karel, primary, Murali, Baskar Karthick, additional, Diez, Mireia, additional, and Benes, Karel, additional
Published: 2017
Full Text: View/download PDF

41. KALAKA-3: a database for the assessment of spoken language recognition technology on YouTube audios

Author: Rodríguez-Fuentes, Luis Javier, primary, Penagarikano, Mikel, additional, Varona, Amparo, additional, Diez, Mireia, additional, and Bordel, Germán, additional
Published: 2015
Full Text: View/download PDF

42. New insight into the use of phone log-likelihood ratios as features for language recognition

Author: Diez, Mireia, primary, Varona, Amparo, additional, Penagarikano, Mikel, additional, Rodriguez-Fuentes, Luis Javier, additional, and Bordel, German, additional
Published: 2014
Full Text: View/download PDF

43. PLLR features in language recognition system for RATS

Author: Plchot, Oldřich, primary, Diez, Mireia, additional, Soufifar, Mehdi, additional, and Burget, Lukáš, additional
Published: 2014
Full Text: View/download PDF

44. On the complementarity of short-time Fourier analysis windows of different lengths for improved language recognition

Author: Diez, Mireia, primary, Penagarikano, Mikel, additional, Bordel, German, additional, Varona, Amparo, additional, and Rodriguez-Fuentes, Luis Javier, additional
Published: 2014
Full Text: View/download PDF

45. Optimizing PLLR Features for Spoken Language Recognition

Author: Diez, Mireia, primary, Varona, Amparo, additional, Penagarikano, Mikel, additional, Rodriguez-Fuentes, Luis Javier, additional, and Bordel, German, additional
Published: 2014
Full Text: View/download PDF

46. On the Complementarity of Phone Posterior Probabilities for Improved Speaker Recognition

Author: Diez, Mireia, primary, Varona, Amparo, additional, Penagarikano, Mikel, additional, Rodriguez-Fuentes, Luis Javier, additional, and Bordel, German, additional
Published: 2014
Full Text: View/download PDF

47. High-performance Query-by-Example Spoken Term Detection on the SWS 2013 evaluation

Author: Rodriguez-Fuentes, Luis J., primary, Varona, Amparo, additional, Penagarikano, Mikel, additional, Bordel, German, additional, and Diez, Mireia, additional
Published: 2014
Full Text: View/download PDF

48. Using phone log-likelihood ratios as features for speaker recognition

Author: Diez, Mireia, primary, Varona, Amparo, additional, Penagarikano, Mikel, additional, Rodríguez-Fuentes, Luis Javier, additional, and Bordel, Germán, additional
Published: 2013
Full Text: View/download PDF

49. Handling recordings acquired simultaneously over multiple channels with PLDA

Author: Villalba, Jesús, primary, Diez, Mireia, additional, Varona, Amparo, additional, and Lleida, Eduardo, additional
Published: 2013
Full Text: View/download PDF

50. Dimensionality reduction of phone log-likelihood ratio features for spoken language recognition

Author: Diez, Mireia, primary, Varona, Amparo, additional, Penagarikano, Mikel, additional, Rodríguez-Fuentes, Luis Javier, additional, and Bordel, Germán, additional
Published: 2013
Full Text: View/download PDF
