Description
This dataset contains the aligned scores for around 85% of the songs in SynthSOD, which we used to train models for score-informed music source separation [1]. We obtained the score information from the original MIDIs used to synthesize SynthSOD and aligned them using the system described in [2].
The score information is encoded in text files that give, for every note, its start and end times, MIDI pitch, and MIDI instrument. The correspondence between MIDI instrument numbers and SynthSOD instrument names is based on the General MIDI standard (see below). We also provide metadata files in the same format as those provided with SynthSOD, excluding the songs whose scores we weren't able to align.
If you find this dataset useful for your research, please consider citing the paper where we introduced it [1] and the original paper of SynthSOD [3].
About version 2
After publishing the initial version of the dataset, we found that some notes in the scores were missing from the synthesized audio in SynthSOD. These were notes whose MIDI pitch fell outside the range accepted by the synthesizer for that instrument, which especially affected percussion instruments. In the second version of the dataset, we have removed these notes from the scores so that they match the audio files from SynthSOD. The results reported in the paper were obtained by training and evaluating the models with this new version.
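The kind of filtering described above can be sketched as follows. This is only an illustration, not the procedure actually used to build version 2: the `Note` type, the `drop_out_of_range` helper, and the pitch range in `HYPOTHETICAL_RANGES` are all assumptions made for the example, since the real ranges accepted by the synthesizer are not listed here.

```python
from typing import Dict, List, NamedTuple, Tuple


class Note(NamedTuple):
    """One score note (hypothetical in-memory representation)."""
    start: float      # start time of the note
    end: float        # end time of the note
    pitch: int        # MIDI pitch, 0-127
    instrument: int   # MIDI instrument number


# Placeholder values for illustration only: the actual ranges accepted by
# the synthesizer for each instrument are NOT given in this description.
HYPOTHETICAL_RANGES: Dict[int, Tuple[int, int]] = {
    47: (40, 57),  # Timpani (made-up inclusive range)
}


def drop_out_of_range(notes: List[Note],
                      ranges: Dict[int, Tuple[int, int]]) -> List[Note]:
    """Keep only notes whose pitch lies inside their instrument's range.

    Instruments without an entry in `ranges` are left unfiltered.
    """
    kept = []
    for note in notes:
        low, high = ranges.get(note.instrument, (0, 127))
        if low <= note.pitch <= high:
            kept.append(note)
    return kept
```

Users of version 2 do not need to apply such a filter themselves, because the out-of-range notes have already been removed from the scores.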
| SynthSOD instrument name | MIDI instrument number |
|---|---|
| Violin, Violin_1, Violin_2 | 40 |
| Viola | 41 |
| Cello | 42 |
| Bass | 43 |
| Oboe | 68 |
| coranglais | 69 |
| Bassoon | 70 |
| Clarinet | 71 |
| Piccolo | 72 |
| Flute | 73 |
| Trumpet | 56 |
| Trombone | 57 |
| Tuba | 58 |
| Horn | 60 |
| Harp | 46 |
| Timpani | 47 |
| untunedpercussion | 0 |
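For convenience, the correspondence listed above can be written as a lookup table. The sketch below copies the numbers from this description; the names `INSTRUMENT_NAMES` and `synthsod_names` are ours and are not part of the dataset. The numbers appear to be zero-based General MIDI program numbers (for example, Violin is 40 here and 41 in the one-based General MIDI listing), but treat that reading as an assumption.

```python
from typing import Dict, Tuple

# MIDI instrument number -> SynthSOD instrument name(s), as listed above.
INSTRUMENT_NAMES: Dict[int, Tuple[str, ...]] = {
    40: ("Violin", "Violin_1", "Violin_2"),
    41: ("Viola",),
    42: ("Cello",),
    43: ("Bass",),
    68: ("Oboe",),
    69: ("coranglais",),
    70: ("Bassoon",),
    71: ("Clarinet",),
    72: ("Piccolo",),
    73: ("Flute",),
    56: ("Trumpet",),
    57: ("Trombone",),
    58: ("Tuba",),
    60: ("Horn",),
    46: ("Harp",),
    47: ("Timpani",),
    0: ("untunedpercussion",),
}


def synthsod_names(midi_instrument: int) -> Tuple[str, ...]:
    """Return the SynthSOD name(s) for a MIDI instrument number.

    Returns an empty tuple for numbers that are not in the table.
    """
    return INSTRUMENT_NAMES.get(midi_instrument, ())
```

Note that one MIDI instrument number can map to several SynthSOD names: 40 covers Violin, Violin_1, and Violin_2, so the score alone does not distinguish between the violin sections.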
| Date made available | 2 Jun 2025 |
|---|---|
| Publisher | Zenodo |
Funding
| Funders | Funder number |
|---|---|
| European Commission | 101095065 |
Field of science, Statistics Finland
- 113 Computer and information sciences