Neural network alternatives to convolutive audio models for source separation

Shrikant Venkataramani, Cem Subakan, Paris Smaragdis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The Convolutive Non-Negative Matrix Factorization (NMF) model factorizes a given audio spectrogram using frequency templates with a temporal dimension. In this paper, we present a convolutional auto-encoder model that acts as a neural network alternative to convolutive NMF. Using the modeling flexibility granted by neural networks, we also explore the idea of using a Recurrent Neural Network in the encoder. Experimental results on speech mixtures from the TIMIT dataset indicate that the convolutive architecture provides a significant improvement in separation performance in terms of BSS eval metrics.
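
To make the phrase "frequency templates with a temporal dimension" concrete, the following is a minimal NumPy sketch of the usual shift-and-sum reconstruction used in convolutive NMF. It is an illustration under stated assumptions, not the authors' implementation: the function names, array layout, and sizes are hypothetical, and the convolutional auto-encoder proposed in the paper is not reproduced here.

```python
import numpy as np

def shift_right(H, t):
    """Shift the columns of H right by t frames, zero-filling on the left."""
    if t == 0:
        return H.copy()
    out = np.zeros_like(H)
    out[:, t:] = H[:, :-t]
    return out

def conv_nmf_reconstruct(W, H):
    """Approximate a spectrogram as V_hat = sum_t W[t] @ shift_right(H, t).

    W: (T, F, K) non-negative array, i.e. K templates spanning F frequency
       bins and T time frames (assumed layout, chosen for this sketch).
    H: (K, N) non-negative activations over N frames.
    """
    T = W.shape[0]
    return sum(W[t] @ shift_right(H, t) for t in range(T))

# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
F, K, T, N = 6, 2, 3, 10
W = rng.random((T, F, K))
H = rng.random((K, N))
V_hat = conv_nmf_reconstruct(W, H)
print(V_hat.shape)
```

With T = 1 this reduces to ordinary NMF (a single matrix product), which is why the temporal extent of the templates is the distinguishing feature of the convolutive model.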

Original language: English (US)
Title of host publication: 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017 - Proceedings
Editors: Naonori Ueda, Jen-Tzung Chien, Tomoko Matsui, Jan Larsen, Shinji Watanabe
Publisher: IEEE Computer Society
Pages: 1-6
Number of pages: 6
ISBN (Electronic): 9781509063413
State: Published - Dec 5 2017
Event: 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017 - Tokyo, Japan
Duration: Sep 25 2017 to Sep 28 2017

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2017-September
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Other

Other: 2017 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2017
Country/Territory: Japan
City: Tokyo
Period: 9/25/17 to 9/28/17

Keywords

  • Auto-encoders
  • Convolutive models
  • Deep learning
  • Source separation

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
