Author: "Tirry, Wouter" - Searchworks@Jio Institute Digital Library Search Results

Your search for author "Tirry, Wouter" returned 22 results


1. Non-Causal to Causal SSL-Supported Transfer Learning: Towards a High-Performance Low-Latency Speech Vocoder

Author: Shi, Renzheng, Bär, Andreas, Sach, Marvin, Tirry, Wouter, and Fingscheidt, Tim
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Recently, BigVGAN has emerged as a high-performance speech vocoder. Its sequence-to-sequence-based synthesis, however, prohibits usage in low-latency conversational applications. Our work addresses this shortcoming in three steps. First, we introduce low latency into BigVGAN via implementing causal convolutions, yielding decreased performance. Second, to regain performance, we propose a teacher-student transfer learning scheme to distill the high-delay non-causal BigVGAN into our low-latency causal vocoder. Third, taking advantage of a self-supervised learning (SSL) model, in our case wav2vec 2.0, we align its encoder speech representations extracted from our low-latency causal vocoder to the ground truth ones. In speaker-independent settings, both proposed training schemes notably elevate the performance of our low-latency vocoder, closing up to the original high-delay BigVGAN. At only 21% higher complexity, our best small causal vocoder achieves 3.96 PESQ and 1.25 MCD, excelling even the original small non-causal BigVGAN (3.64 PESQ) by 0.32 PESQ and 0.1 MCD points, respectively.
Comment: Accepted at IWAENC 2024
Published: 2024

2. EffCRN: An Efficient Convolutional Recurrent Network for High-Performance Speech Enhancement

Author: Sach, Marvin, Franzen, Jan, Defraene, Bruno, Fluyt, Kristoff, Strake, Maximilian, Tirry, Wouter, and Fingscheidt, Tim
Subjects: Electrical Engineering and Systems Science - Audio and Speech Processing
Abstract: Fully convolutional recurrent neural networks (FCRNs) have shown state-of-the-art performance in single-channel speech enhancement. However, the number of parameters and the FLOPs/second of the original FCRN are restrictively high. A further important class of efficient networks is the CRUSE topology, serving as reference in our work. By applying a number of topological changes at once, we propose both an efficient FCRN (FCRN15), and a new family of efficient convolutional recurrent neural networks (EffCRN23, EffCRN23lite). We show that our FCRN15 (875K parameters) and EffCRN23lite (396K) outperform the already efficient CRUSE5 (85M) and CRUSE4 (7.2M) networks, respectively, w.r.t. PESQ, DNSMOS and DeltaSNR, while requiring about 94% fewer parameters and about 20% fewer #FLOPs/frame. Thereby, according to these metrics, the FCRN/EffCRN class of networks provides new best-in-class network topologies for speech enhancement.
Comment: 5 pages, 5 figures, accepted for Interspeech 2023
Published: 2023

3. Speech enhancement by LSTM-based noise suppression followed by CNN-based speech restoration

Author: Strake, Maximilian, Defraene, Bruno, Fluyt, Kristoff, Tirry, Wouter, and Fingscheidt, Tim
Published: 2020
Full Text: View/download PDF

4. EffCRN: An Efficient Convolutional Recurrent Network for High-Performance Speech Enhancement

Author: Sach, Marvin, primary, Franzen, Jan, additional, Defraene, Bruno, additional, Fluyt, Kristoff, additional, Strake, Maximilian, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2023
Full Text: View/download PDF

5. Spatially Selective Speaker Separation Using a DNN With a Location Dependent Feature Extraction

Author: Bohlender, Alexander, Spriet, Ann, Tirry, Wouter, and Madhu, Nilesh
Abstract: Deep neural networks (DNNs) have proven themselves as an effective means to separate clean speech from noisy mixtures. When there are multiple concurrent talkers, however, unambiguously defining the target output is not trivial, especially if the mixture is single-channel and the talkers are not known in advance. Although this problem can be addressed with permutation invariant training or deep clustering, the performance still suffers in this case. Approaches for compact arrays of multiple microphones can exploit spatial diversity to resolve the ambiguity: a separate output may be generated for each direction of arrival (DOA), or the speaker assignment can be controlled with a location-based training (LBT). Alternatively, we can narrow down the target definition at the input, to perform a spatially selective speaker separation instead of separating all speakers simultaneously. This is achieved by specifying freely adjustable target DOAs. On the one hand, these can be integrated as location-based input features (LBI). On the other hand, the main contribution of this work is a location dependent feature extraction (LDE): we implicitly introduce a DOA dependence in a small part of the DNN by optimizing its parameters for each DOA separately. Experiments demonstrate that LDE outperforms LBT and LBI in terms of instrumental metrics and speech recognition results. A representative audio example is presented for a qualitative impression. An analysis of the spatial selectivity reveals that target and nontarget directions can be distinguished quite well with LDE, which is also verified by recordings of real moving talkers.
Published: 2024
Full Text: View/download PDF

6. Neural Networks Using Full-Band and Subband Spatial Features for Mask Based Source Separation

Author: Bohlender, Alexander, primary, Spriet, Ann, additional, Tirry, Wouter, additional, and Madhu, Nilesh, additional
Published: 2021
Full Text: View/download PDF

7. Exploiting Temporal Context in CNN Based Multisource DOA Estimation

Author: Bohlender, Alexander, primary, Spriet, Ann, additional, Tirry, Wouter, additional, and Madhu, Nilesh, additional
Published: 2021
Full Text: View/download PDF

8. INTERSPEECH 2020 Deep Noise Suppression Challenge: A Fully Convolutional Recurrent Network (FCRN) for Joint Dereverberation and Denoising

Author: Strake, Maximilian, primary, Defraene, Bruno, additional, Fluyt, Kristoff, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2020
Full Text: View/download PDF

9. Fully Convolutional Recurrent Networks for Speech Enhancement

Author: Strake, Maximilian, primary, Defraene, Bruno, additional, Fluyt, Kristoff, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2020
Full Text: View/download PDF

10. Least-Squares DOA Estimation with an Informed Phase Unwrapping and Full Bandwidth Robustness

Author: Bohlender, Alexander, primary, Spriet, Ann, additional, Tirry, Wouter, additional, and Madhu, Nilesh, additional
Published: 2020
Full Text: View/download PDF

11. Separated Noise Suppression and Speech Restoration: LSTM-Based Speech Enhancement in Two Stages

Author: Strake, Maximilian, primary, Defraene, Bruno, additional, Fluyt, Kristoff, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2019
Full Text: View/download PDF

12. Comparative Analysis of Generalized Sidelobe Cancellation and Multi-Channel Linear Prediction for Speech Dereverberation and Noise Reduction

Author: Dietzen, Thomas, primary, Spriet, Ann, additional, Tirry, Wouter, additional, Doclo, Simon, additional, Moonen, Marc, additional, and van Waterschoot, Toon, additional
Published: 2019
Full Text: View/download PDF

13. DNN-Supported Speech Enhancement With Cepstral Estimation of Both Excitation and Envelope

Author: Elshamy, Samy, primary, Madhu, Nilesh, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2018
Full Text: View/download PDF

14. A Priori SNR Computation for Speech Enhancement Based on Cepstral Envelope Estimation

Author: Elshamy, Samy, primary, Madhu, Nilesh, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2018
Full Text: View/download PDF

15. Low-Complexity Kalman filter for multi-channel linear-prediction-based blind speech dereverberation

Author: Dietzen, Thomas, primary, Doclo, Simon, additional, Spriet, Ann, additional, Tirry, Wouter, additional, Moonen, Marc, additional, and van Waterschoot, Toon, additional
Published: 2017
Full Text: View/download PDF

16. Instantaneous A Priori SNR Estimation by Cepstral Excitation Manipulation

Author: Elshamy, Samy, primary, Madhu, Nilesh, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2017
Full Text: View/download PDF

17. An Instrumental Quality Measure for Artificially Bandwidth-Extended Speech Signals

Author: Abel, Johannes, primary, Kaniewska, Magdalena, additional, Guillaume, Cyril, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2017
Full Text: View/download PDF

18. Two-stage speech enhancement with manipulation of the cepstral excitation

Author: Elshamy, Samy, primary, Madhu, Nilesh, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2017
Full Text: View/download PDF

19. A subjective listening test of six different artificial bandwidth extension approaches in English, Chinese, German, and Korean

Author: Abel, Johannes, primary, Kaniewska, Magdalena, additional, Guillaume, Cyril, additional, Tirry, Wouter, additional, Pulakka, Hannu, additional, Myllyla, Ville, additional, Sjoberg, Jari, additional, Alku, Paavo, additional, Katsir, Itai, additional, Malah, David, additional, Cohen, Israel, additional, Tugtekin Turan, M. A., additional, Erzin, Engin, additional, Schlien, Thomas, additional, Vary, Peter, additional, Nour-Eldin, Amr H., additional, Kabal, Peter, additional, and Fingscheidt, Tim, additional
Published: 2016
Full Text: View/download PDF

20. An iterative speech model-based a priori SNR estimator

Author: Elshamy, Samy, primary, Madhu, Nilesh, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2015
Full Text: View/download PDF

21. On speech quality assessment of artificial bandwidth extension

Author: Bauer, Patrick, primary, Guillaume, Cyril, additional, Tirry, Wouter, additional, and Fingscheidt, Tim, additional
Published: 2014
Full Text: View/download PDF

22. Acoustic Zooming by Multimicrophone Sound Scene Manipulation.

Author: van Waterschoot, Toon, Tirry, Wouter Joos, and Moonen, Marc
Subjects: ACOUSTICAL engineering, DIGITAL cameras, DIGITAL video recording, BRAIN imaging, NOISE control, MICROPHONES
Abstract: Zoom control is a key feature of audiovisual capture in both professional and consumer cameras. While a video zoom operation is often not complemented with a corresponding acoustic zoom, psychophysical as well as neuroimaging results suggest that a cross-modal approach to zooming may facilitate multisensory integration. As auditory distance perception is primarily determined by sound intensity, an audiovisual zoom effect may be obtained by matching the levels of different sources in a sound scene with their visually perceived motion during video zooming. In this paper, we propose a general theory for independent sound source level control which can be exploited to attain an acoustic zoom effect. An essential feature of the proposed theory is that it does not consist in an explicit sound source separation, which relieves its potential computational requirements. An efficient implementation using fixed and adaptive spatial and spectral noise reduction algorithms is proposed and evaluated. Experimental results using an array with a small number of low-cost microphones confirm that the proposed approach is particularly suited for consumer audiovisual capture applications.
Published: 2013
