468 results for "Shiliang Zhang"
Search Results
52. MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for speech recognition.
53. Accurate and Reliable Confidence Estimation Based on Non-Autoregressive End-to-End Speech Recognition System.
54. BAT: Boundary aware transducer for memory-efficient and low-latency ASR.
55. Evolved Part Masking for Self-Supervised Learning.
56. SA-Paraformer: Non-Autoregressive End-To-End Speaker-Attributed ASR.
57. The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR.
58. Recognizing High-Speed Moving Objects with Spike Camera.
59. TOLD: A Novel Two-Stage Overlap-Aware Framework for Speaker Diarization.
60. Speech and Noise Dual-Stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition.
61. A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings.
62. HumVis: Human-Centric Visual Analysis System.
63. Speaker Overlap-aware Neural Diarization for Multi-party Meeting Analysis.
64. A Comparative Study on Speaker-attributed Automatic Speech Recognition in Multi-party Meetings.
65. Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition.
66. Contextual Instance Decoupling for Robust Multi-Person Pose Estimation.
67. SpikingSIM: A Bio-Inspired Spiking Simulator.
68. MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction.
69. M2MeT: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge.
70. Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
71. Modeling The Detection Capability Of High-Speed Spiking Cameras.
72. ProsoSpeech: Enhancing Prosody with Quantized Vector Pre-Training in Text-To-Speech.
73. Transformer-Based Domain Adaptation for Event Data Classification.
74. MFCCA: Multi-Frame Cross-Channel Attention for Multi-Speaker ASR in Multi-Party Meeting Scenario.
75. Asymmetric Label Propagation for Video Object Segmentation.
76. Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR.
77. Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation.
78. Unleashing the Full Potential of Product Quantization for Large-Scale Image Retrieval.
79. Robust Pose Estimation in Crowded Scenes with Direct Pose-Level Inference.
80. Extremely Low Footprint End-to-End ASR System for Smart Device.
81. Investigation of Spatial-Acoustic Features for Overlapping Speech Detection in Multiparty Meetings.
82. Intra-Inter Camera Similarity for Unsupervised Person Re-Identification.
83. An Energy Consumption Model for Electrical Vehicle Networks via Extended Federated-learning.
84. Graph Consistency Based Mean-Teaching for Unsupervised Domain Adaptive Person Re-Identification.
85. Hybrid Network Compression via Meta-Learning.
86. Simplified Self-Attention for Transformer-Based end-to-end Speech Recognition.
87. Self-Supervised Adversarial Multi-Task Learning for Vocoder-Based Monaural Speech Enhancement.
88. SAN-M: Memory Equipped Self-Attention for End-to-End Speech Recognition.
89. Neural Zero-Inflated Quality Estimation Model for Automatic Speech Recognition System.
90. Streaming Chunk-Aware Multihead Attention for Online End-to-End Speech Recognition.
91. Unsupervised Person Re-Identification via Multi-Label Classification.
92. Robust Partial Matching for Person Search in the Wild.
93. Domain Adaptive Person Re-Identification via Coupling Optimization.
94. PAN: Phoneme-Aware Network for Monaural Speech Enhancement.
95. Joint Visual and Temporal Consistency for Unsupervised Domain Adaptive Person Re-identification.
96. Global-Local Temporal Representations for Video Person Re-Identification.
97. Self-Guided Hash Coding for Large-Scale Person Re-identification.
98. EAGER: Edge-Aided imaGe undERstanding System.
99. Towards Language-Universal Mandarin-English Speech Recognition.
100. Investigation of Transformer Based Spelling Correction Model for CTC-Based End-to-End Mandarin Speech Recognition.
Discovery Service for Jio Institute Digital Library