Sudo RM -RF: Efficient networks for universal audio source separation

Efthymios Tzinis, Zhepei Wang, Paris Smaragdis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present an efficient neural network for end-to-end general purpose audio source separation. Specifically, the backbone structure of this convolutional network is the SUccessive DOwnsampling and Resampling of Multi-Resolution Features (SuDoRM-RF), together with their aggregation, which is performed through simple one-dimensional convolutions. In this way, we are able to obtain high quality audio source separation with limited floating point operations, memory requirements, parameter count, and latency. Our experiments on both speech and environmental sound separation datasets show that SuDoRM-RF performs comparably to, and even surpasses, various state-of-the-art approaches that have significantly higher computational resource requirements.
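The idea of successive downsampling, resampling, and aggregation of multi-resolution features can be illustrated with a toy, dependency-free sketch. This is not the authors' implementation: it assumes average pooling in place of learned one-dimensional convolutions, nearest-neighbor resampling, and aggregation by summation, and the function names (`downsample`, `upsample`, `multi_resolution_block`) are hypothetical.

```python
def downsample(x, stride=2):
    """Reduce resolution by averaging non-overlapping windows of `stride` samples.
    (Assumption: stands in for a learned strided 1-D convolution.)"""
    return [sum(x[i:i + stride]) / stride
            for i in range(0, len(x) - stride + 1, stride)]


def upsample(x, factor=2):
    """Nearest-neighbor resampling: repeat each sample `factor` times."""
    return [v for v in x for _ in range(factor)]


def multi_resolution_block(x, depth=3, stride=2):
    """Toy multi-resolution block: downsample `depth` times, then resample
    from the coarsest level upward, adding each finer-resolution feature."""
    if len(x) % (stride ** depth) != 0:
        raise ValueError("input length must be divisible by stride ** depth")

    # Successive downsampling: features[k] has length len(x) / stride**k.
    features = [list(x)]
    for _ in range(depth):
        features.append(downsample(features[-1], stride))

    # Successive resampling and aggregation, coarsest to finest.
    out = features[-1]
    for finer in reversed(features[:-1]):
        out = [a + b for a, b in zip(upsample(out, stride), finer)]
    return out


if __name__ == "__main__":
    signal = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
    print(multi_resolution_block(signal))
```

The output has the same length as the input, and each sample combines information from every resolution, which is the property the abstract attributes to the backbone. A real network would replace the fixed averaging with trainable convolutions.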

Original language: English (US)
Title of host publication: Proceedings of the 2020 IEEE 30th International Workshop on Machine Learning for Signal Processing, MLSP 2020
Publisher: IEEE Computer Society
ISBN (Electronic): 9781728166629
State: Published - Sep 2020
Externally published: Yes
Event: 30th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2020 - Virtual, Espoo, Finland
Duration: Sep 21 2020 to Sep 24 2020

Publication series

Name: IEEE International Workshop on Machine Learning for Signal Processing, MLSP
Volume: 2020-September
ISSN (Print): 2161-0363
ISSN (Electronic): 2161-0371

Conference

Conference: 30th IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2020
Country/Territory: Finland
City: Virtual, Espoo
Period: 9/21/20 to 9/24/20

Keywords

  • Audio source separation
  • Deep learning
  • Low-cost neural networks

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
