Your search for author "Yang, Dongchao" returned 185 results in total.

Search Results

1. SoCodec: A Semantic-Ordered Multi-Stream Speech Codec for Efficient Language Model Based Text-to-Speech Synthesis

2. SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models

3. UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner

4. CoLM-DSR: Leveraging Neural Codec Language Modeling for Multi-Modal Dysarthric Speech Reconstruction

5. Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder

6. SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models

7. RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis

8. NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models

12. Consistent and Relevant: Rethink the Query Embedding in General Sound Separation

13. DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction

14. UniAudio: An Audio Foundation Model Toward Universal Audio Generation

15. PromptTTS 2: Describing and Generating Voices with Text Prompt

16. NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

17. Make-A-Voice: Unified Voice Synthesis With Discrete Representation

18. Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation

19. HiFi-Codec: Group-residual Vector quantization for High Fidelity Audio Codec

20. AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head

22. Improving Text-Audio Retrieval by Text-aware Attention Pooling and Prior Matrix Revised Loss

23. Improving Weakly Supervised Sound Event Detection with Causal Intervention

24. InstructTTS: Modelling Expressive TTS in Discrete Latent Space with Natural Language Style Prompt

25. Make-An-Audio: Text-To-Audio Generation with Prompt-Enhanced Diffusion Models

26. NoreSpeech: Knowledge Distillation based Conditional Diffusion Model for Noise-robust Expressive TTS

27. Diffsound: Discrete Diffusion Model for Text-to-sound Generation

30. Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction

31. A Mixed supervised Learning Framework for Target Sound Detection

32. RaDur: A Reference-aware and Duration-robust Network for Target Sound Detection

33. Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches

34. Improving Target Sound Extraction with Timestamp Information

35. Physical Nature of Magnon Spin Seebeck Effect in Ferrimagnetic Insulators

37. Detect what you want: Target Sound Detection

38. Improving the Performance of Automated Audio Captioning via Integrating the Acoustic and Semantic Information

39. A Mutual learning framework for Few-shot Sound Event Detection

41. Unsupervised Multi-Target Domain Adaptation for Acoustic Scene Classification

42. Omnidirectional Motion Control Method of Quadruped Robot Based on 3D-CPG Oscillator Group

43. Towards Data Distillation for End-to-end Spoken Conversational Question Answering

48. AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head
