Abstract
Monaural source separation is important for many real-world applications, and it is challenging because only single-channel information is available. In this paper, we explore deep recurrent neural networks for separating the singing voice from monaural recordings in a supervised setting, comparing architectures with different temporal connections. We propose jointly optimizing the networks for multiple source signals by including the separation step as a nonlinear operation in the last layer. We further explore different discriminative training objectives to improve the source-to-interference ratio. On the MIR-1K dataset, our proposed system achieves state-of-the-art performance, with a 2.30 to 2.48 dB GNSDR gain and a 4.32 to 5.42 dB GSIR gain over previous models.
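The abstract does not spell out the form of the separation step or of the discriminative objectives. As a rough illustration only, the sketch below assumes the separation step is a soft time-frequency mask applied to the mixture magnitudes, and that a discriminative objective rewards closeness to the correct source while penalizing closeness to the other source. The function names, the `gamma` weight, and the toy numbers are all hypothetical and are not taken from this record.

```python
def soft_mask_separate(pred_a, pred_b, mixture, eps=1e-12):
    """Split each mixture magnitude between two sources in proportion
    to the raw (pre-mask) predictions for that time-frequency bin.
    Assumed form of the separation step, not confirmed by the abstract."""
    est_a, est_b = [], []
    for a, b, z in zip(pred_a, pred_b, mixture):
        total = abs(a) + abs(b) + eps  # eps avoids division by zero
        est_a.append(abs(a) / total * z)
        est_b.append(abs(b) / total * z)
    return est_a, est_b


def discriminative_loss(est_a, est_b, tgt_a, tgt_b, gamma=0.05):
    """Squared error to the correct target, minus gamma times the
    squared error to the other source's target (hypothetical objective)."""
    def sq_err(x, y):
        return sum((xi - yi) ** 2 for xi, yi in zip(x, y))

    return (sq_err(est_a, tgt_a) - gamma * sq_err(est_a, tgt_b)
            + sq_err(est_b, tgt_b) - gamma * sq_err(est_b, tgt_a))


# Toy magnitudes for three time-frequency bins (made-up numbers).
mixture = [1.0, 2.0, 4.0]
voice_pred = [3.0, 1.0, 0.0]
music_pred = [1.0, 1.0, 2.0]
voice_est, music_est = soft_mask_separate(voice_pred, music_pred, mixture)
```

Because the two masks sum to one in every bin, the two estimates add back up to the mixture, which is what lets such a step sit inside the network as a differentiable final layer rather than as separate post-processing.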
Original language | English (US) |
---|---|
Pages | 477-482 |
Number of pages | 6 |
State | Published - 2014 |
Event | 15th International Society for Music Information Retrieval Conference, ISMIR 2014 - Taipei, Taiwan, Province of China. Duration: Oct 27 2014 → Oct 31 2014 |
Conference
Conference | 15th International Society for Music Information Retrieval Conference, ISMIR 2014 |
---|---|
Country/Territory | Taiwan, Province of China |
City | Taipei |
Period | 10/27/14 → 10/31/14 |
ASJC Scopus subject areas
- Music
- Information Systems