TY - GEN
T1 - MaD TwinNet: Masker-Denoiser Architecture with Twin Networks for Monaural Sound Source Separation
AU - Drossos, Konstantinos
AU - Mimilakis, Stylianos Ioannis
AU - Serdyuk, Dmitriy
AU - Schuller, Gerald
AU - Virtanen, Tuomas
AU - Bengio, Yoshua
PY - 2018/7/10
Y1 - 2018/7/10
N2 - The monaural singing voice separation task focuses on predicting the singing voice from a single-channel music mixture signal. Current state-of-the-art (SOTA) results in monaural singing voice separation are obtained with deep-learning-based methods. In this work we present a novel recurrent neural approach that learns long-term temporal patterns and structures of a musical piece. We build upon the recently proposed Masker-Denoiser (MaD) architecture and enhance it with Twin Networks, a technique that regularizes a recurrent generative network using a backward-running copy of the network. We evaluate our method on the Demixing Secrets Dataset and obtain an increase in signal-to-distortion ratio (SDR) of 0.37 dB and in signal-to-interference ratio (SIR) of 0.23 dB compared to previous SOTA results.
UR - https://github.com/dr-costas/mad-twinnet
UR - http://arg.cs.tut.fi/demo/mad-twinnet/
UR - https://zenodo.org/record/1164592
UR - https://zenodo.org/record/1164585
U2 - 10.1109/IJCNN.2018.8489565
DO - 10.1109/IJCNN.2018.8489565
M3 - Conference contribution
SN - 978-1-5090-6015-3
BT - 2018 International Joint Conference on Neural Networks (IJCNN)
PB - IEEE
T2 - International Joint Conference on Neural Networks
ER -