367 results on '"Daniel Povey"'
Search Results
2. Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS.
3. Less Peaky and More Accurate CTC Forced Alignment by Label Priors.
4. PromptASR for Contextualized ASR with Controllable Style.
5. Libriheavy: A 50,000 Hours ASR Corpus with Punctuation Casing and Context.
6. On Speaker Attribution with SURT.
7. Zipformer: A faster and better encoder for automatic speech recognition.
8. LibriheavyMix: A 20,000-Hour Dataset for Single-Channel Reverberant Multi-Talker Speech Separation, ASR and Speaker Diarization.
9. Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation.
10. GPU-accelerated Guided Source Separation for Meeting Transcription.
11. Blank-regularized CTC for Frame Skipping in Neural Transducer.
12. Delay-penalized CTC Implemented Based on Finite State Transducer.
13. Bypass Temporal Classification: Weakly Supervised Automatic Speech Recognition with Imperfect Transcripts.
14. Learning From Flawed Data: Weakly Supervised Automatic Speech Recognition.
15. Fast and Parallel Decoding for Transducer.
16. Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation.
17. Delay-Penalized Transducer for Low-Latency Streaming ASR.
18. Building Keyword Search System from End-To-End ASR Systems.
19. Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition.
20. SURT 2.0: Advances in Transducer-Based Multi-Talker Speech Recognition.
21. Pruned RNN-T for fast, memory-efficient ASR training.
22. Zipformer: A faster and better encoder for automatic speech recognition.
23. PromptASR for contextualized ASR with controllable style.
24. Libriheavy: a 50,000 hours ASR corpus with punctuation casing and context.
25. Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS.
26. speechocean762: An Open-Source Non-Native English Speech Corpus for Pronunciation Assessment.
27. GigaSpeech: An Evolving, Multi-Domain ASR Corpus with 10,000 Hours of Transcribed Audio.
28. Wake Word Detection with Streaming Transformers.
29. An Asynchronous WFST-Based Decoder for Automatic Speech Recognition.
30. A Parallelizable Lattice Rescoring Strategy with Neural Language Models.
31. DOVER-Lap: A Method for Combining Overlap-Aware Diarization Outputs.
32. Neural Language Modeling with Implicit Cache Pointers.
33. PyChain: A Fully Parallelized PyTorch Implementation of LF-MMI for End-to-End ASR.
34. An Alternative to MFCCs for ASR.
35. Lattice-Free Maximum Mutual Information Training of Multilingual Speech Recognition Systems.
36. Wake Word Detection with Alignment-Free Lattice-Free MMI.
37. Efficient MDI Adaptation for n-Gram Language Models.
38. OOV Recovery with Efficient 2nd Pass Decoding and Open-vocabulary Word-level RNNLM Rescoring for Hybrid ASR.
39. Speaker Diarization with Region Proposal Network.
40. GPU-Accelerated Viterbi Exact Lattice Decoder for Batched Online and Offline Speech Recognition.
41. An Empirical Study of Transformer-Based Neural Language Model Adaptation.
42. LET-Decoder: A WFST-Based Lazy-Evaluation Token-Group Decoder With Exact Lattice Generation.
43. Improving Emotion Identification Using Phone Posteriors in Raw Speech Waveform Based DNN.
44. State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18.
45. The JHU ASR System for VOiCES from a Distance Challenge 2019.
46. The JHU Speaker Recognition System for the VOiCES 2019 Challenge.
47. Advances in Automatic Speech Recognition for Child Speech Using Factored Time Delay Neural Network.
48. Multi-PLDA Diarization on Children's Speech.
49. x-Vector DNN Refinement with Full-Length Recordings for Speaker Recognition.
50. Speaker Recognition Benchmark Using the CHiME-5 Corpus.