Search

Your search for "Wang, Hsin-Min" returned 146 results

Search Constraints

Author: "Wang, Hsin-Min"; Publication Year Range: Last 3 years

Search Results

2. Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing

3. Channel-Aware Domain-Adaptive Generative Adversarial Network for Robust Speech Recognition

4. Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement

5. A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models

6. Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages

7. The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction

8. Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation

9. SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models

10. Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes

11. SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data

12. HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids

13. Multi-objective Non-intrusive Hearing-aid Speech Assessment Model

14. AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection

15. AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection

16. The VoiceMOS Challenge 2023: Zero-shot Subjective Speech Quality Prediction for Multiple Domains

17. A Study on Incorporating Whisper for Robust Speech Assessment

18. Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement

19. Non-Intrusive Speech Intelligibility Prediction for Hearing Aids using Whisper and Metadata

20. Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model

21. Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features

22. Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion

23. BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm

24. A Training and Inference Strategy Using Noisy and Enhanced Speech as Target for Speech Enhancement without Clean Speech

25. CasNet: Investigating Channel Robustness for Speech Separation

26. Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN

27. NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling

28. A Study of Using Cepstrogram for Countermeasure Against Replay Attacks

29. MTI-Net: A Multi-Target Speech Intelligibility Prediction Model

30. MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids

31. Filter-based Discriminative Autoencoders for Children Speech Recognition

32. Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks

33. Generation of Speaker Representations Using Heterogeneous Training Batch Assembly

34. Multi-target Extractor and Detector for Unknown-number Speaker Diarization

35. Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition

36. Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

37. Chain-based Discriminative Autoencoders for Speech Recognition

38. The VoiceMOS Challenge 2022

39. Partially Fake Audio Detection by Self-attention-based Fake Span Discovery

40. EMGSE: Acoustic/EMG Fusion for Multimodal Speech Enhancement

41. HASA-net: A non-intrusive hearing-aid speech assessment network

42. Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features

44. Multi-Modal Pedestrian Crossing Intention Prediction with Transformer-Based Model

45. End-to-End Singing Transcription Based on CTC and HSMM Decoding with a Refined Score Representation

46. Estimating 3D Hand Poses and Shapes from Silhouettes

47. Meta Soft Prompting and Learning

48. A Lightweight Enhancement Approach for Real-Time Semantic Segmentation by Distilling Rich Knowledge from Pre-Trained Vision-Language Model
