Back to Search
Start Over
DIFFBAS: An Advanced Binaural Audio Synthesis Model Focusing on Binaural Differences Recovery.
- Source :
- Applied Sciences (2076-3417); Apr2024, Vol. 14 Issue 8, p3385, 15p
- Publication Year :
- 2024
-
Abstract
- Binaural audio synthesis (BAS) aims to restore binaural audio from mono signals obtained from the environment to enhance users' immersive experiences. It plays an essential role in building Augmented Reality and Virtual Reality environments. Existing deep neural network (DNN)-based BAS systems synthesize binaural audio by modeling the overall sound propagation processes from the source to the left and right ears, which encompass early decay, room reverberation, and head/ear-related filtering. However, this end-to-end modeling approach brings in the overfitting problem for BAS models when they are trained using a small and homogeneous data set. Moreover, existing losses cannot well supervise the training process. As a consequence, the accuracy of synthesized binaural audio is far from satisfactory on binaural differences. In this work, we propose a novel DNN-based BAS method, namely DIFFBAS, to improve the accuracy of synthesized binaural audio from the perspective of the interaural phase difference. Specifically, DIFFBAS is trained using the average signals of the left and right channels. To make the model learn the binaural differences, we propose a new loss named Interaural Phase Difference (IPD) loss to supervise the model training. Extensive experiments have been performed and the results demonstrate the effectiveness of the DIFFBAS model and the IPD loss. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 20763417
- Volume :
- 14
- Issue :
- 8
- Database :
- Complementary Index
- Journal :
- Applied Sciences (2076-3417)
- Publication Type :
- Academic Journal
- Accession number :
- 176881196
- Full Text :
- https://doi.org/10.3390/app14083385