
Automatic Lyrics-to-audio Alignment on Polyphonic Music Using Singing-adapted Acoustic Models

Authors :
Ye Wang
Bidisha Sharma
Haizhou Li
Chitralekha Gupta
Source :
ICASSP
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

Lyrics-to-audio alignment is the task of automatically aligning lyrical words with mixed singing audio (singing voice + musical accompaniment). Such alignment can be achieved with an automatic speech recognition (ASR) system. We propose to adapt the acoustic model of a speech recognizer towards solo singing voice. This avoids the hurdle of annotating a large polyphonic music training dataset. Moreover, lexicon-modification-based duration modelling is incorporated to account for the long-duration vowels in singing. As practical applications demand alignment on polyphonic music, we study the effect of different singing vocal separation methods on lyrics-to-audio alignment in polyphonic music. The extracted vocals are force-aligned with the singing-adapted models. Through experiments, we demonstrate that the choice of audio source separation method and effective end-pointing of the songs have a high impact on alignment performance. We report a mean average absolute error of 3.87 seconds, which is comparable with the state-of-the-art lyrics-to-audio alignment system that is trained on a large polyphonic music database.
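
The reported error metric can be illustrated with a minimal sketch. The abstract does not define the metric, so this assumes one common reading: for each song, take the average absolute deviation between reference and predicted word onset times (in seconds), then average that value over all songs. The function names and the input layout are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_absolute_error(ref_onsets, hyp_onsets):
    """Average absolute deviation (seconds) between reference and
    predicted word onsets for one song. Assumes both lists are
    word-aligned, i.e. entry i refers to the same word in each."""
    if len(ref_onsets) != len(hyp_onsets):
        raise ValueError("reference and hypothesis must have the same length")
    if not ref_onsets:
        raise ValueError("at least one word onset is required")
    return mean(abs(r - h) for r, h in zip(ref_onsets, hyp_onsets))


def mean_average_absolute_error(songs):
    """Mean of the per-song average absolute errors.
    `songs` is an iterable of (ref_onsets, hyp_onsets) pairs
    (a hypothetical layout chosen for this sketch)."""
    return mean(average_absolute_error(ref, hyp) for ref, hyp in songs)


if __name__ == "__main__":
    # Toy onset times in seconds, invented purely for illustration.
    toy_songs = [
        ([0.0, 1.0], [0.5, 1.5]),
        ([2.0], [3.0]),
    ]
    print(mean_average_absolute_error(toy_songs))
```

Averaging per song first gives every song equal weight regardless of how many words it contains; pooling all words before averaging would be a different, also plausible, convention.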

Details

Database :
OpenAIRE
Journal :
ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Accession number :
edsair.doi...........1fc6c7c6d8a97b93e22060e5f10c47de
Full Text :
https://doi.org/10.1109/icassp.2019.8682582