Search

Author search for "Chung, Joon Son": 265 results in total


Search Results

1. AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models

2. Accelerating Codec-based Speech Synthesis with Multi-Token Prediction and Speculative Decoding

3. Let Me Finish My Sentence: Video Temporal Grounding with Holistic Text Understanding

4. SpoofCeleb: Speech Deepfake Detection and SASV In The Wild

5. Text-To-Speech Synthesis In The Wild

6. The VoxCeleb Speaker Recognition Challenge: A Retrospective

7. Bridging the Gap between Audio and Text using Parallel-attention for User-defined Keyword Spotting

8. VoxSim: A perceptual voice similarity dataset

9. Aligning Sight and Sound: Advanced Sound Source Localization Through Audio-Visual Alignment

10. ElasticAST: An Audio Spectrogram Transformer for All Length and Resolutions

11. Disentangled Representation Learning for Environment-agnostic Speaker Recognition

12. Lightweight Audio Segmentation for Long-form Speech Translation

13. FlowAVSE: Efficient Audio-Visual Speech Enhancement with Conditional Flow Matching

14. To what extent can ASV systems naturally defend against spoofing attacks?

15. Audio Mamba: Bidirectional State Space Model for Audio Representation Learning

16. Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

17. Towards Automated Movie Trailer Generation

18. Scaling Up Video Summarization Pretraining with Large Language Models

19. EquiAV: Leveraging Equivariance for Audio-Visual Contrastive Learning

20. Can CLIP Help Sound Source Localization?

21. Seeing Through the Conversation: Audio-Visual Speech Separation based on Diffusion Model

22. Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification

23. VoiceLDM: Text-to-Speech with Environmental Context

24. TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning

25. SlowFast Network for Continuous Sign Language Recognition

26. Sound Source Localization is All about Cross-Modal Alignment

27. Let There Be Sound: Reconstructing High Quality Speech from Silent Videos

28. FlexiAST: Flexibility is What AST Needs

29. That's What I Said: Fully-Controllable Talking Face Generation

30. Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples

31. Self-Sufficient Framework for Continuous Sign Language Recognition

32. Imaginary Voice: Face-styled Diffusion Model for Text-to-Speech

33. VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge

34. MarginNCE: Robust Sound Localization with a Negative Margin

35. Signing Outside the Studio: Benchmarking Background Robustness for Continuous Sign Language Recognition

36. Metric Learning for User-defined Keyword Spotting

37. Disentangled representation learning for multilingual speaker recognition

38. In search of strong embedding extractors for speaker diarisation

39. Large-scale learning of generalised representations for speaker recognition

40. Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion

41. Curriculum learning for self-supervised speaker verification

42. Pushing the limits of raw waveform speaker recognition

43. VoxSRC 2021: The Third VoxCeleb Speaker Recognition Challenge

44. Advancing the dimensionality reduction of speaker embeddings for speaker diarisation: disentangling noise and informing speech activity

45. Multi-scale speaker embedding-based graph attention networks for speaker diarisation

46. Spell my name: keyword boosted speech recognition

47. AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks

48. Look Who's Talking: Active Speaker Detection in the Wild

49. Adapting Speaker Embeddings for Speaker Diarisation

50. Three-class Overlapped Speech Detection using a Convolutional Recurrent Neural Network
