Transfer Learning for Music Genre Classification
Abstract
Modern music information retrieval systems provide high-level features (genre, instrument, mood, and so on) for convenient searching and recommendation. Among these music tags, genre is the most widely used in practice. Machine learning techniques can catalogue different genres from raw music, but their final performance depends heavily on the features used. As powerful learning algorithms, deep neural networks can extract useful features automatically and effectively, replacing time-consuming feature engineering. However, deeper architectures require larger amounts of data to train, and in many cases we may not have enough data to train a deep network. Transfer learning addresses this problem by pre-training the network on a similar task that has enough data and then fine-tuning the parameters of the pre-trained network on the target dataset. The MagnaTagATune dataset is used to pre-train the proposed five-layer Recurrent Neural Network (RNN) with Gated Recurrent Units (GRU), and the scattering transform is used to reduce the dimensionality of the network input. The GTZAN dataset is then used as the target dataset for genre classification. Experimental results show that the transfer learning approach achieves a higher average classification accuracy (95.8%) than the same deep RNN with randomly initialized parameters (93.5%). In addition, the deep RNN using transfer learning converges to its final accuracy faster than with random initialization.
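The overall recipe can be illustrated with a minimal PyTorch sketch: pre-train a stacked GRU on the tagging task, then reuse its recurrent layers and attach a fresh classification head for the genre task. All layer sizes, tag/genre counts, and hyperparameters below are illustrative assumptions, not the paper's exact configuration.

```python
# Minimal sketch of the transfer-learning setup described above (assumed PyTorch).
import torch
import torch.nn as nn

class GenreGRU(nn.Module):
    """Five-layer stacked GRU over a sequence of scattering-transform frames."""
    def __init__(self, input_dim: int, hidden_dim: int, num_outputs: int):
        super().__init__()
        # Stacked GRU: 5 recurrent layers, batch-first input of shape (B, T, input_dim).
        self.gru = nn.GRU(input_dim, hidden_dim, num_layers=5, batch_first=True)
        # Task-specific head: tag logits for pre-training, genre logits for fine-tuning.
        self.head = nn.Linear(hidden_dim, num_outputs)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        _, h_n = self.gru(x)       # h_n: (num_layers, B, hidden_dim)
        return self.head(h_n[-1])  # classify from the last layer's final hidden state

# 1) Pre-train on MagnaTagATune (multi-label tagging; 50 tags is an assumption).
pretrain_model = GenreGRU(input_dim=430, hidden_dim=256, num_outputs=50)
# ... train pretrain_model with nn.BCEWithLogitsLoss() on MagnaTagATune here ...

# 2) Transfer: copy the recurrent layers, attach a fresh 10-genre head for GTZAN.
finetune_model = GenreGRU(input_dim=430, hidden_dim=256, num_outputs=10)
finetune_model.gru.load_state_dict(pretrain_model.gru.state_dict())
# ... fine-tune finetune_model with nn.CrossEntropyLoss() on GTZAN here ...

# The random-initialization baseline simply skips the load_state_dict step.
```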
Domains
Computer Science [cs]

Origin: Files produced by the author(s)