Music auto-tagging using scattering transform and convolutional neural network with self-attention

Authors :
Zhijie Wang
Guangxiao Song
Fang Han
Xiaochun Gu
Shenyi Ding
Source :
Applied Soft Computing. 96:106702
Publication Year :
2020
Publisher :
Elsevier BV, 2020.

Abstract

As a branch of machine learning, deep learning has been used to tackle the music auto-tagging problem. Deep learning methods, especially those based on convolutional neural network (CNN) architectures, have performed well on this multi-label classification task. However, the preprocessing and feature-extraction stages of this architecture still need improvement. In this paper, we propose a deep-learning model for music auto-tagging that combines a CNN with the scattering transform and a self-attention mechanism. To balance information integrity against feature extraction in the preprocessing phase, we employ the scattering transform. A multi-layer CNN then extracts higher-level features from the scattering coefficients. To select better receptive fields of the CNN, a self-attention sub-network is appended to its last layer. Experimental results on the MagnaTagATune dataset and the Million Song Dataset (MSD) show that the proposed model is a good choice for the music auto-tagging task: its scores for the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision–recall curve (PR-AUC) surpass those of state-of-the-art models. Furthermore, we visualize the distributions of attention weights, the activations of the CNN, and the per-tag ROC-AUC scores to better understand the model.
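
The abstract does not specify the form of the self-attention sub-network, so the following is only a minimal sketch of one common choice, scaled dot-product self-attention applied to frame-level CNN features, followed by mean pooling and per-tag sigmoid outputs for multi-label prediction. All names (`self_attention`, `tag_probabilities`, `params`), the layer sizes, the random weights, and the pooling step are assumptions made for illustration and are not taken from the paper; the random input stands in for features that a CNN would compute from scattering coefficients.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def self_attention(features, w_q, w_k, w_v):
    """Scaled dot-product self-attention over time frames.

    features: array of shape (T, D), one D-dimensional vector per frame.
    Returns the attended features (T, H) and the attention weights (T, T).
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ v, weights


def tag_probabilities(features, params):
    """Map frame-level features to independent per-tag probabilities."""
    attended, weights = self_attention(
        features, params["w_q"], params["w_k"], params["w_v"]
    )
    pooled = attended.mean(axis=0)  # assumed pooling: average over frames
    logits = pooled @ params["w_out"] + params["b_out"]
    probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid: tags are not exclusive
    return probs, weights


# Hypothetical sizes: 8 frames, 16 feature channels, 50 tags.
rng = np.random.default_rng(0)
T, D, H, N_TAGS = 8, 16, 16, 50
params = {
    "w_q": rng.standard_normal((D, H)) * 0.1,
    "w_k": rng.standard_normal((D, H)) * 0.1,
    "w_v": rng.standard_normal((D, H)) * 0.1,
    "w_out": rng.standard_normal((H, N_TAGS)) * 0.1,
    "b_out": np.zeros(N_TAGS),
}

# Stand-in for the output of the last CNN layer.
cnn_features = rng.standard_normal((T, D))
probs, attn = tag_probabilities(cnn_features, params)
```

Here `attn[i, j]` is the weight that frame `i` places on frame `j`; a matrix of this kind is the sort of quantity one could plot when visualizing attention-weight distributions.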

Details

ISSN :
1568-4946
Volume :
96
Database :
OpenAIRE
Journal :
Applied Soft Computing
Accession number :
edsair.doi...........4059d8c2832bb8ed60480d884924a8ab