fma

FMA: A Dataset For Music Analysis

  • Owner: oppa3109/fma
  • License: MIT License

Kirell Benzi, Michaël Defferrard,
Pierre Vandergheynst,
Xavier Bresson,
EPFL LTS2.

Note that this is a beta release and that this repository as well as the
paper and data are subject to change. Stay tuned!

Data

The dataset is a dump of the Free Music Archive.
It is available in several sizes:

  1. Small: 4,000 clips of 30 seconds, 10 balanced genres (GTZAN-like)
    (~3.4 GiB)
  2. Medium: 14,511 clips of 30 seconds, 20 unbalanced genres (~12.2 GiB)
  3. Large (available soon): 77,643 clips of 30 seconds, 68 unbalanced genres
    (~90 GiB)
  4. Huge (subject to distribution constraints): 77,643 untrimmed clips, 68
    unbalanced genres (~900 GiB)

Notes:

  • All datasets come with MP3 audio (128 kbps, 44.1 kHz, stereo) of all clips.
  • All datasets come with the following metadata about each clip: artist,
    title, list of genres (and top genre), play count.
  • Metadata about all clips is stored in a JSON file to be loaded as a
    pandas dataframe.
  • As additional audio metadata, each clip of datasets 1 and 2 comes with all
    Echonest features.
  • Please see the paper for a description of how the data was collected and
    cleaned.
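
As a rough illustration of the JSON-to-pandas step mentioned in the notes, the sketch below loads a small in-memory sample shaped like the documented fields. The record layout (clip IDs as keys) and the column names `artist`, `title`, `genres`, `top_genre`, and `play_count` are assumptions made for this example, as are the sample values; the actual file in the dataset may be organized differently.

```python
import io
import json

import pandas as pd

# Hypothetical sample: the keys, column names, and values below are
# assumptions for illustration, not the dataset's actual schema or content.
sample = {
    "2": {"artist": "Artist A", "title": "Track 1",
          "genres": ["Rock", "Punk"], "top_genre": "Rock", "play_count": 120},
    "5": {"artist": "Artist B", "title": "Track 2",
          "genres": ["Rock"], "top_genre": "Rock", "play_count": 45},
    "10": {"artist": "Artist C", "title": "Track 3",
           "genres": ["Folk"], "top_genre": "Folk", "play_count": 300},
}


def load_tracks(source):
    """Load clip metadata from a JSON path or file-like object.

    Assumes one record per clip, keyed by clip ID (orient="index").
    """
    return pd.read_json(source, orient="index")


# Load from an in-memory buffer; a path to the real JSON file would also work
# if the file follows the assumed layout.
tracks = load_tracks(io.StringIO(json.dumps(sample)))

# Count clips per top genre, e.g. to check class balance.
genre_counts = tracks["top_genre"].value_counts()
print(genre_counts)
```

If the real file uses a different orientation (for example a list of records), the `orient` argument of `pd.read_json` would need to change accordingly.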

Code

This repository features the following notebooks:

  1. Generation: generation of the datasets.
  2. Analysis: loading and basic analysis of the data.
  3. Baselines: baseline models for various tasks.
  4. Usage: how to load the datasets and train your own models.

Installation

# Install Python 3.6 and create a virtual environment.
pyenv install 3.6.0
pyenv virtualenv 3.6.0 fma
pyenv activate fma

# Clone the repository.
git clone https://github.com/mdeff/fma.git
cd fma

# Install the dependencies.
make install

# Fill in the configuration.
cat .env
DATA_DIR=/path/to/fma_small

# Open the Jupyter notebook.
jupyter-notebook

# Or run a notebook.
make fma_baselines.ipynb
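
For readers who want to reuse the `.env` configuration from their own scripts, here is a minimal sketch of reading `DATA_DIR` with the standard library only. The `read_env` helper is hypothetical and written for this example; it is not the repository's own loading code, which may rely on a dedicated package instead.

```python
import tempfile
from pathlib import Path


def read_env(path):
    """Parse KEY=VALUE lines from a .env-style file.

    Hypothetical helper: blank lines, comments, and lines without '=' are skipped.
    """
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


# Demonstrate on a temporary file with the same content as the .env above.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("# FMA configuration\nDATA_DIR=/path/to/fma_small\n")
    config = read_env(env_file)

# The dataset directory can then be used to build paths to clips or metadata.
data_dir = Path(config["DATA_DIR"])
print(data_dir)
```

In a real checkout, `read_env(".env")` would be called on the file in the repository root rather than on a temporary copy.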

License

  • Please cite our paper if you use our code or data.
  • The code is released under the terms of the MIT license.
  • The dataset is meant for research only.
  • We are grateful to SWITCH and EPFL for hosting the dataset within the context
    of the SCALE-UP project, funded in part by the swissuniversities SUC P-2
    program.

Main metrics

Overview
Name with owner: oppa3109/fma
Primary language: Jupyter Notebook
Programming languages: Jupyter Notebook (language count: 3)
License: MIT License
Owner activity
Created at: 2017-03-13 23:55:48
Pushed at: 2017-03-11 13:40:23
Last commit at: 2017-03-11 13:38:41
Release count: 0
User engagement
Stargazers count: 0
Watchers count: 0
Fork count: 1
Commits count: 32
Issues count: 0
Open issues count: 0
Pull requests count: 0
Open pull requests count: 0
Closed pull requests count: 0