401. Focus on the Sound around You: Monaural Target Speaker Extraction via Distance and Speaker Information
- Author
Lin, Jiuxin; Wang, Peng; Dinkel, Heinrich; Chen, Jun; Wu, Zhiyong; Yan, Zhiyong; Wang, Yongqing; Zhang, Junbo; Wang, Yujun
- Subjects
FOS: Computer and information sciences; Sound (cs.SD); Audio and Speech Processing (eess.AS); FOS: Electrical engineering, electronic engineering, information engineering; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing
- Abstract
Target Speaker Extraction (TSE) has previously yielded outstanding performance in certain application scenarios for speech enhancement and source separation. However, obtaining auxiliary speaker-related information remains challenging in noisy environments with significant reverberation. Inspired by the recently proposed distance-based sound separation, we propose the near sound (NS) extractor, which leverages distance information for TSE to reliably extract speaker information without requiring prior speaker enrollment, an approach we call speaker embedding self-enrollment (SESE). Full- and sub-band modeling is introduced to improve the NS-Extractor's adaptability to environments with significant reverberation. Experimental results on several cross-datasets demonstrate the effectiveness of our improvements and the excellent performance of the proposed NS-Extractor in different application scenarios.
Accepted by Interspeech 2023.
- Published
- 2023