Author: "Alessio Brutti" - Searchworks@Jio Institute Digital Library Search Results

Search for author "Alessio Brutti": 155 results in total

1. Speaker Anonymization: Disentangling Speaker Features from Pre-Trained Speech Embeddings for Voice Conversion

Author: Marco Matassoni, Seraphina Fong, and Alessio Brutti
Subjects: privacy protection, anonymization, voice conversion, voice cloning, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Abstract: Speech is a crucial source of personal information, and the risk of attackers using such information increases day by day. Speaker privacy protection is crucial, and various approaches have been proposed to hide the speaker’s identity. One approach is voice anonymization, which aims to safeguard speaker identity while maintaining speech content through techniques such as voice conversion or spectral feature alteration. The significance of voice anonymization has grown due to the necessity to protect personal information in applications such as voice assistants, authentication, and customer support. Building upon the S3PRL-VC toolkit and on pre-trained speech and speaker representation models, this paper introduces a feature disentanglement approach to improve the de-identification performance of the state-of-the-art anonymization approaches based on voice conversion. The proposed approach achieves state-of-the-art speaker de-identification and causes minimal impact on the intelligibility of the signal after conversion.
Published: 2024

2. Speaker front‐back disambiguity using multi‐channel speech signals

Author: Xinyuan Qian, Jichen Yang, and Alessio Brutti
Subjects: Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Abstract: This paper tackles the front‐back disambiguity problem in speaker localization when the audio signals are captured by a symmetric microphone array. To this end, a deep neural network is proposed with an attention‐based mechanism designed to assign different weights to features obtained from individual microphones. For support, a real dataset with synchronized multichannel audio signals captured by a large linear microphone array is introduced, along with manual annotations. The experimental results demonstrate the effectiveness of the proposed method over the other approaches. In particular, more than 50% reduction in Equal Error Rate (EER) is achieved when comparing with the single‐channel case. The designed multi‐channel self‐attention mechanism also brings further improvements. The dataset and source code will be released.
Published: 2022

3. Speech Enhancement Using Dilated Wave-U-Net: an Experimental Analysis

Author: Mohamed Nabih Ali Mohamed Nawar, Alessio Brutti, and Daniele Falavigna
Subjects: speech enhancement, wave-u-net, dilated convolutional neural network, Telecommunication, TK5101-6720
Abstract: Speech enhancement is a relevant component in many real-world applications such as hearing aids, mobile telecommunications and healthcare applications. In this paper, we investigate the Dilated Wave-U-Net model: a recently proposed end-to-end neural speech enhancement approach based on the Wave-U-Net architecture. We evaluate the performance of the model on two datasets: the public VCTK dataset, and a contaminated version of Librispeech. In particular, we experiment with alternative losses based on the L1 norm and on a combination of L1 and MSE losses. Results show that the Dilated Wave-U-Net architecture outperforms other state-of-the-art methods in terms of intelligibility and quality metrics on both datasets and that the MSE loss performs best.
Published: 2020

4. Time-Domain Joint Training Strategies of Speech Enhancement and Intent Classification Neural Models

Author: Mohamed Nabih Ali, Daniele Falavigna, and Alessio Brutti
Subjects: joint training, speech enhancement, intent classification, Chemical technology, TP1-1185
Abstract: Robustness against background noise and reverberation is essential for many real-world speech-based applications. One way to achieve this robustness is to employ a speech enhancement front-end that, independently of the back-end, removes the environmental perturbations from the target speech signal. However, although the enhancement front-end typically increases the speech quality from an intelligibility perspective, it tends to introduce distortions which deteriorate the performance of subsequent processing modules. In this paper, we investigate strategies for jointly training neural models for both speech enhancement and the back-end, which optimize a combined loss function. In this way, the enhancement front-end is guided by the back-end to provide more effective enhancement. Differently from typical state-of-the-art approaches employing spectral features or neural embeddings, we operate in the time domain, processing raw waveforms in both components. As an application scenario, we consider intent classification in noisy environments. In particular, the front-end speech enhancement module is based on Wave-U-Net while the intent classifier is implemented as a temporal convolutional network. Exhaustive experiments are reported on versions of the Fluent Speech Commands corpus contaminated with noises from the Microsoft Scalable Noisy Speech Dataset, shedding light on the most promising training approaches.
Published: 2022

5. MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages.

Author: Marco Gaido, Sara Papi, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih, and Matteo Negri
Published: 2024

6. Detection and Classification of Cardiovascular Diseases Using Neural Networks.

Author: Bastián Estay Zamorano, Ali Dehghan Firoozabadi, Alessio Brutti, Pablo Adasme, David Zabala-Blanco, Pablo Palacios Játiva, and Cesar A. Azurdia-Meza
Published: 2024

7. LDASR: An Experimental Study on Layer Drop Using Conformer-Based Architecture.

Author: Abdul Hannan, Alessio Brutti, and Daniele Falavigna
Published: 2024

8. Training Early-Exit Architectures for Automatic Speech Recognition: Fine-Tuning Pre-Trained Models or Training from Scratch.

Author: George August Wright, Umberto Cappellazzo, Salah Zaiem, Desh Raj, Lucas Ondel Yang, Daniele Falavigna, Mohamed Nabih Ali, and Alessio Brutti
Published: 2024

9. Continual Contrastive Spoken Language Understanding.

Author: Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, and Bhiksha Raj
Published: 2024

10. Multiple Source Localization Based on Acoustic Map De-Emphasis

Author: Alessio Brutti, Maurizio Omologo, and Piergiorgio Svaizer
Subjects: Acoustics. Sound, QC221-246, Electronic computers. Computer science, QA75.5-76.95
Published: 2010

11. Large Language Models Are Strong Audio-Visual Speech Recognition Learners.

Author: Umberto Cappellazzo, Minsu Kim, Honglie Chen, Pingchuan Ma, Stavros Petridis, Daniele Falavigna, Alessio Brutti, and Maja Pantic
Published: 2024

12. MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages.

Author: Marco Gaido, Sara Papi, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih, and Matteo Negri
Published: 2024

13. Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters.

Author: Umberto Cappellazzo, Daniele Falavigna, and Alessio Brutti
Published: 2024

14. Federating Dynamic Models using Early-Exit Architectures for Automatic Speech Recognition on Heterogeneous Clients.

Author: Mohamed Nabih Ali, Alessio Brutti, and Daniele Falavigna
Published: 2024

15. Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding.

Author: Umberto Cappellazzo, Muqiao Yang, Daniele Falavigna, and Alessio Brutti
Published: 2023

16. An Investigation of the Combination of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding.

Author: Umberto Cappellazzo, Daniele Falavigna, and Alessio Brutti
Published: 2023

17. End-to-end integration of speech separation and voice activity detection for low-latency diarization of telephone conversations.

Author: Giovanni Morrone, Samuele Cornell, Luca Serafini, Enrico Zovato, Alessio Brutti, and Stefano Squartini
Published: 2024

18. Enhancing Embeddings for Speech Classification in Noisy Conditions.

Author: Mohamed Nabih Ali, Alessio Brutti, and Daniele Falavigna
Published: 2022

19. Using Seq2seq voice conversion with pre-trained representations for audio anonymization: experimental insights.

Author: Marco Costante, Marco Matassoni, and Alessio Brutti
Published: 2022

20. Optimizing PhiNet architectures for the detection of urban sounds on low-end devices.

Author: Alessio Brutti, Francesco Paissan, Alberto Ancilotto, and Elisabetta Farella
Published: 2022

21. Is Cross-Attention Preferable to Self-Attention for Multi-Modal Emotion Recognition?

Author: Vandana Rajan, Alessio Brutti, and Andrea Cavallaro
Published: 2022

22. End-to-End Low Resource Keyword Spotting Through Character Recognition and Beam-Search Re-Scoring.

Author: Ephrem Tibebe Mekonnen, Alessio Brutti, and Daniele Falavigna
Published: 2022

23. Scalable Neural Architectures for End-to-End Environmental Sound Classification.

Author: Francesco Paissan, Alberto Ancilotto, Alessio Brutti, and Elisabetta Farella
Published: 2022

24. Low-Latency Speech Separation Guided Diarization for Telephone Conversations.

Author: Giovanni Morrone, Samuele Cornell, Desh Raj, Luca Serafini, Enrico Zovato, Alessio Brutti, and Stefano Squartini
Published: 2022

25. Towards Speaker-Independent Voice Conversion for Improving Dysarthric Speech Intelligibility.

Author: Seraphina Fong, Marco Matassoni, Gianluca Esposito, and Alessio Brutti
Published: 2023

26. Audio-Visual Tracking of Concurrent Speakers.

Author: Xinyuan Qian, Alessio Brutti, Oswald Lanz, Maurizio Omologo, and Andrea Cavallaro
Published: 2022

27. An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings.

Author: Luca Serafini, Samuele Cornell, Giovanni Morrone, Enrico Zovato, Alessio Brutti, and Stefano Squartini
Published: 2023

28. End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations.

Author: Giovanni Morrone, Samuele Cornell, Luca Serafini, Enrico Zovato, Alessio Brutti, and Stefano Squartini
Published: 2023

29. Improving the Intent Classification accuracy in Noisy Environment.

Author: Mohamed Nabih Ali, Alessio Brutti, and Daniele Falavigna
Published: 2023

30. Scaling strategies for on-device low-complexity source separation with Conv-Tasnet.

Author: Mohamed Nabih Ali, Francesco Paissan, Daniele Falavigna, and Alessio Brutti
Published: 2023

31. Continual Contrastive Spoken Language Understanding.

Author: Umberto Cappellazzo, Enrico Fini, Muqiao Yang, Daniele Falavigna, Alessio Brutti, and Bhiksha Raj
Published: 2023

32. Training dynamic models using early exits for automatic speech recognition on resource-constrained devices.

Author: George August Wright, Umberto Cappellazzo, Salah Zaiem, Desh Raj, Lucas Ondel Yang, Daniele Falavigna, and Alessio Brutti
Published: 2023

33. Learning to Rank Microphones for Distant Speech Recognition.

Author: Samuele Cornell, Alessio Brutti, Marco Matassoni, and Stefano Squartini
Published: 2021

34. A Speech Enhancement Front-End for Intent Classification in Noisy Environments.

Author: Mohamed Nabih Ali, Veronica Juliana Schmalz, Alessio Brutti, and Daniele Falavigna
Published: 2021

35. Robust Latent Representations Via Cross-Modal Translation and Alignment.

Author: Vandana Rajan, Alessio Brutti, and Andrea Cavallaro
Published: 2021

36. Speech Enhancement Using Dilated Wave-U-Net: an Experimental Analysis.

Author: Mohamed Nabih Ali, Alessio Brutti, and Daniele Falavigna
Published: 2020

37. Supervised Online Diarization with Sample Mean Loss for Multi-Domain Data.

Author: Enrico Fini and Alessio Brutti
Published: 2020

38. Low-Complexity Acoustic Scene Classification in DCASE 2022 Challenge.

Author: Irene Martín-Morató, Francesco Paissan, Alberto Ancilotto, Toni Heittola, Annamaria Mesaros, Elisabetta Farella, Alessio Brutti, and Tuomas Virtanen
Published: 2022

39. Neural Network Distillation on IoT Platforms for Sound Event Detection.

Author: Gianmarco Cerutti, Rahul Prasad, Alessio Brutti, and Elisabetta Farella
Published: 2019

40. Accurate Target Annotation in 3D from Multimodal Streams.

Author: Oswald Lanz, Alessio Brutti, Alessio Xompero, Xinyuan Qian, Maurizio Omologo, and Andrea Cavallaro
Published: 2019

41. Compact Recurrent Neural Networks for Acoustic Event Detection on Low-Energy Low-Complexity Platforms.

Author: Gianmarco Cerutti, Rahul Prasad, Alessio Brutti, and Elisabetta Farella
Published: 2020

42. Exploring the Joint Use of Rehearsal and Knowledge Distillation in Continual Learning for Spoken Language Understanding.

Author: Umberto Cappellazzo, Daniele Falavigna, and Alessio Brutti
Published: 2022

43. Direct enhancement of pre-trained speech embeddings for speech processing in noisy conditions.

Author: Mohamed Nabih Ali, Alessio Brutti, and Daniele Falavigna
Published: 2023

44. Automatic Assessment of English CEFR Levels Using BERT Embeddings.

Author: Veronica Juliana Schmalz and Alessio Brutti
Published: 2021

45. 3D Mouth Tracking from a Compact Microphone Array Co-Located with a camera.

Author: Xinyuan Qian, Alessio Xompero, Andrea Cavallaro, Alessio Brutti, Oswald Lanz, and Maurizio Omologo
Published: 2018