Search keyword "Wang, Hsin-Min": 90 results in total

Search Constraints

You searched for: Author "Wang, Hsin-Min"; Topic: computer science - sound
Search Results

1. Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments with Advanced Post-Processing

2. Channel-Aware Domain-Adaptive Generative Adversarial Network for Robust Speech Recognition

3. Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement

4. A Study on Zero-shot Non-intrusive Speech Assessment using Large Language Models

5. Exploring the Impact of Data Quantity on ASR in Extremely Low-resource Languages

6. The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction

7. Effective Noise-aware Data Simulation for Domain-adaptive Speech Enhancement Leveraging Dynamic Stochastic Perturbation

8. SVSNet+: Enhancing Speaker Voice Similarity Assessment Models with Representations from Speech Foundation Models

9. SpeechCLIP+: Self-supervised multi-task representation learning for speech via CLIP and speech-image data

10. HAAQI-Net: A Non-intrusive Neural Music Audio Quality Assessment Model for Hearing Aids

11. Multi-objective Non-intrusive Hearing-aid Speech Assessment Model

12. AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection

13. AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection

14. A Study on Incorporating Whisper for Robust Speech Assessment

15. Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement

16. Non-Intrusive Speech Intelligibility Prediction for Hearing Aids using Whisper and Metadata

17. Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model

18. Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features

19. Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion

20. A Training and Inference Strategy Using Noisy and Enhanced Speech as Target for Speech Enhancement without Clean Speech

21. CasNet: Investigating Channel Robustness for Speech Separation

22. Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN

23. A Study of Using Cepstrogram for Countermeasure Against Replay Attacks

24. MTI-Net: A Multi-Target Speech Intelligibility Prediction Model

25. MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids

26. Filter-based Discriminative Autoencoders for Children Speech Recognition

27. Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks

28. Generation of Speaker Representations Using Heterogeneous Training Batch Assembly

29. Multi-target Extractor and Detector for Unknown-number Speaker Diarization

30. Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition

31. Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

32. Chain-based Discriminative Autoencoders for Speech Recognition

33. The VoiceMOS Challenge 2022

34. Partially Fake Audio Detection by Self-attention-based Fake Span Discovery

35. EMGSE: Acoustic/EMG Fusion for Multimodal Speech Enhancement

36. HASA-net: A non-intrusive hearing-aid speech assessment network

37. Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features

38. Speech Enhancement-assisted Voice Conversion in Noisy Environments

39. Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion

40. SurpriseNet: Melody Harmonization Conditioning on User-controlled Surprise Contours

41. SVSNet: An End-to-end Speaker Voice Similarity Assessment Model

42. A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion

43. The AS-NU System for the M2VoC Challenge

44. Speech Recognition by Simply Fine-tuning BERT

45. Speech Enhancement with Zero-Shot Model Selection

46. STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model

47. Melody Harmonization Using Orderless NADE, Chord Balancing, and Blocked Gibbs Sampling

48. The Academia Sinica Systems of Voice Conversion for VCC2020

49. Lite Audio-Visual Speech Enhancement

50. WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-end Speech Enhancement
