Search

Your search for "Wang, Yujun" returned 801 results.

Search Constraints

Author: "Wang, Yujun"
Search Limiters: Full Text

Search Results

1. Efficient Extraction of Noise-Robust Discrete Units from Self-Supervised Speech Models

2. Enhancing Automated Audio Captioning via Large Language Models with Optimized Audio Encoding

3. Bridging Language Gaps in Audio-Text Retrieval

4. Scaling up masked audio encoder learning for general audio classification

5. Towards Expressive Zero-Shot Speech Synthesis with Hierarchical Prosody Modeling

6. CED: Consistent ensemble distillation for audio tagging

7. Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction

8. Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information

9. AV-SepFormer: Cross-Attention SepFormer for Audio-Visual Target Speaker Extraction

10. Understanding temporally weakly supervised training: A case study for keyword spotting

11. Streaming Audio Transformers for Online Audio Tagging

15. Continuous and low-carbon production of biomass flash graphene

19. Exploring Representation Learning for Small-Footprint Keyword Spotting

20. Relate auditory speech to EEG by shallow-deep attention-based network

21. Improving Weakly Supervised Sound Event Detection with Causal Intervention

22. Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers

23. Improve Bilingual TTS Using Dynamic Language and Phonology Embedding

24. An empirical study of weakly supervised audio tagging embeddings for general audio representations

25. UniKW-AT: Unified Keyword Spotting and Audio Tagging

26. Pseudo strong labels for large scale weakly supervised audio tagging

27. Learning Decoupling Features Through Orthogonality Regularization

31. Detect what you want: Target Sound Detection

32. Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation

33. PAMA-TTS: Progression-Aware Monotonic Attention for Stable Seq2Seq TTS With Accurate Phoneme Duration Control

38. A Separable Temporal Convolution Neural Network with Attention for Small-Footprint Keyword Spotting

39. Separable Temporal Convolution plus Temporally Pooled Attention for Lightweight High-performance Keyword Spotting

40. Multi-channel Speech Enhancement with 2-D Convolutional Time-frequency Domain Features and a Pre-trained Acoustic Model

41. Msdtron: a high-capability multi-speaker speech synthesis system for diverse data using characteristic information
