Elena Cabrio, Fabien Gandon, Yaroslav Nechaev, Michael Fell, Geoffroy Peeters, and Gabriel Meseguer-Brocal
Affiliations: Web-Instrumented Man-Machine Interactions, Communities and Semantics (WIMMICS); Inria Sophia Antipolis - Méditerranée (CRISAM); Institut National de Recherche en Informatique et en Automatique (Inria); Scalable and Pervasive softwARe and Knowledge Systems (Laboratoire I3S - SPARKS); Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S); Université Nice Sophia Antipolis (1965 - 2019) (UNS); COMUE Université Côte d'Azur (2015-2019) (COMUE UCA); Centre National de la Recherche Scientifique (CNRS); Université Côte d'Azur (UCA); Amazon, Cambridge, MA, USA; Institut de Recherche et Coordination Acoustique/Musique (IRCAM); Sorbonne Université (SU); Laboratoire Traitement et Communication de l'Information (LTCI); Institut Mines-Télécom [Paris] (IMT)-Télécom Paris; Fondazione Bruno Kessler [Trento, Italy] (FBK); Département Images, Données, Signal (IDS); Télécom ParisTech; Signal, Statistique et Apprentissage (S2A); Institut Polytechnique de Paris (IP Paris)
Song lyrics contain repeated patterns that have been shown to facilitate automated lyrics segmentation, whose final goal is to detect the building blocks (e.g., chorus, verse) of a song text. Our contribution in this article is twofold. First, we introduce a convolutional neural network (CNN)-based model that learns to segment lyrics based on their repetitive text structure. We experiment with novel features that reveal different kinds of repetition in the lyrics, for instance based on phonetic and syntactic properties. Second, using a novel corpus in which the song text is synchronized to the audio of the song, we show that the text and audio modalities capture complementary structure of the lyrics and that combining the two improves lyrics segmentation performance. For purely text-based lyrics segmentation on a dataset of 103k lyrics, we achieve an F-score of 67.4%, improving on the state of the art (59.2% F-score). On the synchronized text-audio dataset of 4.8k songs, we show that the additional audio features improve segmentation performance to 75.3% F-score, significantly outperforming the purely text-based approaches.
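To make the idea of repetitive text structure concrete, the sketch below builds a line-by-line self-similarity matrix for a short lyric. This is only an illustration under stated assumptions: the abstract does not specify how repetition is encoded, so the character-level similarity from Python's `difflib`, the function names `line_similarity` and `self_similarity_matrix`, and the toy lyrics are all hypothetical choices rather than the features or data used in the article.

```python
from difflib import SequenceMatcher


def line_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two lyric lines.

    Assumption: difflib's ratio stands in for whatever similarity
    measure a real system would use.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def self_similarity_matrix(lines: list[str]) -> list[list[float]]:
    """Return an n x n matrix whose (i, j) entry compares line i with line j."""
    n = len(lines)
    ssm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Compute each pair once and mirror it so the matrix is symmetric.
            sim = line_similarity(lines[i], lines[j])
            ssm[i][j] = sim
            ssm[j][i] = sim
    return ssm


# Hypothetical toy lyrics: a two-line chorus repeated around a two-line verse.
lyrics = [
    "la la la sing along",
    "we sing it all night long",
    "walking down an empty street",
    "nobody here I want to meet",
    "la la la sing along",
    "we sing it all night long",
]

ssm = self_similarity_matrix(lyrics)
for row in ssm:
    print(" ".join(f"{value:.2f}" for value in row))
```

In this toy case the repeated chorus lines produce entries of 1.00 at positions (0, 4) and (1, 5), forming a short stripe parallel to the main diagonal. Patterns of this kind are what a convolutional model could, in principle, learn to associate with segment boundaries.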