MTL Music Representation, data underlying the publication: One deep music representation to rule them all? A comparative analysis of different representation learning strategies

posted on 13.03.2019 by Jaehun Kim, J. (Julián) Urbano, C.C.S. (Cynthia) Liem, A. (Alan) Hanjalic
The MTL Music Representation dataset is a collection of 384 neural networks trained on 8 learning tasks and datasets (learning sources) from the music domain. The training data consists of a subset of the Million Song Dataset (MSD). The neural network architecture is based on the VGG architecture. To accommodate multiple learning sources, we adopted a multi-task architecture in which the task-specific layers branch out from the shared layers. The main dataset file contains multiple directories, in which the model checkpoint and the learning curve data are saved as two separate files. Each model's parameters are saved in a compressed binary file serialized by the `PyTorch` Python package. Each learning curve is saved in a `.csv` file named with the model identifier, where each row is an individual record of the loss function for either training or validation. We plan to provide a number of utilities, for instance for 1) extracting features from a given audio file and 2) visualizing and saving the learning curves. For more information, please visit our GitHub page.
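
As a rough illustration of how a learning curve `.csv` file of this kind might be read, the sketch below separates training and validation loss records using only the Python standard library. The column names (`split`, `iteration`, `loss`), the split labels (`train`, `valid`), and the sample values are assumptions made for this example, not the documented layout of the dataset files; check the header of an actual file and adjust the arguments accordingly.

```python
import csv
import io
from collections import defaultdict


def split_learning_curve(csv_file, split_col="split", step_col="iteration", loss_col="loss"):
    """Group loss records by split (e.g. training vs. validation).

    The default column names are hypothetical; pass the names that the
    actual file uses.
    """
    curves = defaultdict(list)
    for row in csv.DictReader(csv_file):
        curves[row[split_col]].append((int(row[step_col]), float(row[loss_col])))
    # Sort each curve by step so it can be plotted or scanned in order.
    for records in curves.values():
        records.sort()
    return dict(curves)


# Made-up sample rows standing in for one model's learning curve file.
SAMPLE = """split,iteration,loss
train,0,2.31
train,100,1.87
valid,100,1.95
train,200,1.52
valid,200,1.71
"""

curves = split_learning_curve(io.StringIO(SAMPLE))

# Find the step with the lowest validation loss in the sample data.
best_step, best_loss = min(curves["valid"], key=lambda record: record[1])
print(best_step, best_loss)
```

For a real file, the same function can be applied to a handle opened with `open(path, newline="")`, as recommended by the `csv` module documentation.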



TU Delft, Faculty of Electrical Engineering, Mathematics and Computer Science, Department of Intelligent Systems, Multimedia Computing Group


4TU.Centre for Research Data


media types: application/octet-stream, application/zip, text/csv, text/markdown, text/plain