Search

Author search for "Wu, Xixin": 252 results in total

Search Results

1. Leveraging Chain of Thought towards Empathetic Spoken Dialogue without Corresponding Question-Answering Data

2. Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives

3. DrawSpeech: Expressive Speech Synthesis Using Prosodic Sketches as Control Conditions

4. Disambiguation of Chinese Polyphones in an End-to-End Framework with Semantic Features Extracted by Pre-trained BERT

5. Learning Discriminative Features from Spectrograms Using Center Loss for Speech Emotion Recognition

6. Ontology-grounded Automatic Knowledge Graph Construction by LLM under Wikidata schema

7. Not All Errors Are Equal: Investigation of Speech Recognition Errors in Alzheimer's Disease Detection

8. Devising a Set of Compact and Explainable Spoken Language Feature for Screening Alzheimer's Disease

9. Improving Grapheme-to-Phoneme Conversion through In-Context Knowledge Retrieval with Large Language Models

10. Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains

11. A Survey on the Honesty of Large Language Models

12. Towards Within-Class Variation in Alzheimer's Disease Detection from Spontaneous Speech

13. AudioComposer: Towards Fine-grained Audio Generation with Natural Language Descriptions

14. Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC

15. Speaking from Coarse to Fine: Improving Neural Codec Language Model via Multi-Scale Speech Coding and Generation

16. Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions

17. SoCodec: A Semantic-Ordered Multi-Stream Speech Codec for Efficient Language Model Based Text-to-Speech Synthesis

18. SimpleSpeech 2: Towards Simple and Efficient Text-to-Speech with Flow-based Scalar Latent Transformer Diffusion Models

19. Spontaneous Style Text-to-Speech Synthesis with Controllable Spontaneous Behaviors Based on Language Models

20. Large Language Model-based FMRI Encoding of Language Functions for Subjects with Neurocognitive Disorder

21. Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System

22. Autoregressive Speech Synthesis without Vector Quantization

23. Purple-teaming LLMs with Adversarial Defender Training

24. Seamless Language Expansion: Enhancing Multilingual Mastery in Self-Supervised Models

25. Adaptive Query Rewriting: Aligning Rewriters through Marginal Probability of Conversational Answers

26. UniAudio 1.5: Large Language Model-driven Audio Codec is A Few-shot Audio Task Learner

27. CoLM-DSR: Leveraging Neural Codec Language Modeling for Multi-Modal Dysarthric Speech Reconstruction

28. Addressing Index Collapse of Large-Codebook Speech Tokenizer with Dual-Decoding Product-Quantized Variational Auto-Encoder

29. SimpleSpeech: Towards Simple and Efficient Text-to-Speech with Scalar Latent Transformer Diffusion Models

30. Target Speech Extraction with Pre-trained AV-HuBERT and Mask-And-Recover Strategy

31. CLAPSep: Leveraging Contrastive Pre-trained Model for Multi-Modal Query-Conditioned Target Sound Extraction

32. Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction

33. UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization

34. Cross-Speaker Encoding Network for Multi-Talker Speech Recognition

35. StyleSpeech: Self-supervised Style Enhancing with VQ-VAE-based Pre-training for Expressive Audiobook Speech Synthesis

36. SimCalib: Graph Neural Network Calibration based on Similarity between Nodes

37. Injecting linguistic knowledge into BERT for Dialogue State Tracking

38. UniAudio: An Audio Foundation Model Toward Universal Audio Generation

39. Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts

40. Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning

41. QS-TTS: Towards Semi-Supervised Text-to-Speech Synthesis via Vector-Quantized Self-Supervised Speech Representation Learning

42. Improving Mandarin Prosodic Structure Prediction with Multi-level Contextual Information

43. Rethinking Machine Ethics -- Can LLMs Perform Moral Reasoning through the Lens of Moral Theories?

44. MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis

45. Unified Modeling of Multi-Talker Overlapped Speech Recognition and Diarization with a Sidecar Separator

46. SAIL: Search-Augmented Instruction Learning

47. Interpretable Unified Language Checking

48. A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition

49. Leveraging Pretrained Representations with Task-related Keywords for Alzheimer's Disease Detection

50. A Sidecar Separator Can Convert a Single-Talker Speech Recognition System to a Multi-Talker One
