Your search for "Wang, Yujun" returned 38 results

Search Constraints

Author: "Wang, Yujun"
Topic: electrical engineering and systems science - audio and speech processing
Search Results

1. Efficient Extraction of Noise-Robust Discrete Units from Self-Supervised Speech Models

2. Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding

3. Bridging Language Gaps in Audio-Text Retrieval

4. Scaling up masked audio encoder learning for general audio classification

5. Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling

6. CED: Consistent ensemble distillation for audio tagging

7. Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction

8. Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information

9. AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction

10. Understanding temporally weakly supervised training: A case study for keyword spotting

11. Streaming Audio Transformers for Online Audio Tagging

12. Exploring Representation Learning for Small-Footprint Keyword Spotting

13. Relate auditory speech to EEG by shallow-deep attention-based network

14. Improving Weakly Supervised Sound Event Detection with Causal Intervention

15. Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers

16. Improve Bilingual TTS Using Dynamic Language and Phonology Embedding

17. An empirical study of weakly supervised audio tagging embeddings for general audio representations

18. UniKW-AT: Unified Keyword Spotting and Audio Tagging

19. Pseudo strong labels for large scale weakly supervised audio tagging

20. Learning Decoupling Features Through Orthogonality Regularization

21. Detect what you want: Target Sound Detection

22. Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation

23. A Separable Temporal Convolution Neural Network with Attention for Small-Footprint Keyword Spotting

24. Separable Temporal Convolution plus Temporally Pooled Attention for Lightweight High-performance Keyword Spotting

25. Multi-channel Speech Enhancement with 2-D Convolutional Time-frequency Domain Features and a Pre-trained Acoustic Model

26. Msdtron: a high-capability multi-speaker speech synthesis system for diverse data using characteristic information

27. GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio

28. speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment

29. Multi-Channel Automatic Speech Recognition Using Deep Complex Unet

30. Data Augmentation For Children's Speech Recognition -- The 'Ethiopian' System For The SLT 2021 Children Speech Recognition Challenge

31. AutoKWS: Keyword Spotting with Differentiable Architecture Search

32. Exploiting Deep Sentential Context for Expressive End-to-End Speech Synthesis

33. RawNet: Fast End-to-End Neural Vocoder

34. End-to-end Models with auditory attention in Multi-channel Keyword Spotting

35. Sequence-to-sequence Models for Small-Footprint Keyword Spotting

36. Attention-based End-to-End Models for Small-Footprint Keyword Spotting

37. Empirical Evaluation of Speaker Adaptation on DNN based Acoustic Model

38. Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition
