Search

Your search for "Yan, Zhijie" returned 22 results

Search Constraints

Author: "Yan, Zhijie"
Topic: computer science - sound
Search Results

1. Are Transformers in Pre-trained LM A Good ASR Encoder? An Empirical Study

2. CosyVoice: A Scalable Multilingual Zero-shot Text-to-speech Synthesizer based on Supervised Semantic Tokens

3. FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs

4. Advancing VAD Systems Based on Multi-Task Learning with Improved Model Structures

5. LauraGPT: Listen, Attend, Understand, and Regenerate Audio with GPT

6. The second multi-channel multi-party meeting transcription challenge (M2MeT 2.0): A benchmark for speaker-attributed ASR

7. Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System

8. Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG)

9. Achieving Timestamp Prediction While Recognizing with Non-Autoregressive End-to-End ASR Model

10. MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition

11. Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis

12. Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition

13. Speaker Embedding-aware Neural Diarization: an Efficient Framework for Overlapping Speech Diarization in Meeting Scenarios

14. ProsoSpeech: Enhancing Prosody With Quantized Vector Pre-training in Text-to-Speech

15. Summary On The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge

16. M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge

17. BeamTransformer: Microphone Array-based Overlapping Speech Detection

18. A Real-time Speaker Diarization System Based on Spatial Spectrum

19. Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition

20. Neural Zero-Inflated Quality Estimation Model For Automatic Speech Recognition System

21. Automatic Spelling Correction with Transformer for CTC-based End-to-End Speech Recognition

22. Linear networks based speaker adaptation for speech synthesis
