Adaptive denoising autoencoders: A fine-tuning scheme to learn from test mixtures

Minje Kim, Paris Smaragdis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This work proposes a test-time fine-tuning scheme to further improve the performance of an already-trained Denoising AutoEncoder (DAE) in the context of semi-supervised audio source separation. Although state-of-the-art deep learning-based DAEs show reasonable denoising performance when the nature of the artifacts is known in advance, the generalization of an already-trained network to an unseen signal with unknown deformation characteristics is not well studied. To handle this problem, we propose an adaptive fine-tuning scheme in which we define test-time target variables so that a DAE can learn from the newly available sources and mixing environments in the test mixtures. In the proposed network topology, we stack an AutoEncoder (AE) trained on clean source spectra of interest on top of a DAE trained on a variety of available mixture spectra. The bottom DAE's outputs are fed as input to the top AE, whose role is to check the purity of the once-denoised DAE output. The top AE's error is then used to fine-tune the bottom DAE during the test phase. Experimental results on audio source separation tasks demonstrate that the proposed fine-tuning technique can further improve the sound quality of a DAE at test time.
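The adaptation loop the abstract describes can be sketched numerically. The NumPy toy below is a hypothetical illustration, not the paper's implementation: layer sizes are arbitrary, the randomly initialized weights stand in for a pre-trained DAE and AE, and plain sigmoid layers with squared error replace whatever architecture and loss the authors actually used. It freezes the top AE (the "purity check" trained on clean spectra) and takes gradient steps on the bottom DAE so that the DAE's output for a test mixture reconstructs well through the AE.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

F, H = 20, 16  # toy spectrum size and hidden width (not from the paper)

# Bottom DAE: would be pre-trained on (mixture -> clean) pairs; random stand-in here.
W1, b1 = 0.1 * rng.standard_normal((H, F)), np.zeros(H)
W2, b2 = 0.1 * rng.standard_normal((F, H)), np.zeros(F)

# Top AE: would be pre-trained on clean source spectra; frozen at test time.
V1, c1 = 0.1 * rng.standard_normal((H, F)), np.zeros(H)
V2, c2 = 0.1 * rng.standard_normal((F, H)), np.zeros(F)

def dae(x):
    h = sigmoid(W1 @ x + b1)
    return h, sigmoid(W2 @ h + b2)

def ae(y):
    g = sigmoid(V1 @ y + c1)
    return g, sigmoid(V2 @ g + c2)

def ae_error(y):
    _, z = ae(y)
    return 0.5 * np.sum((z - y) ** 2)  # how "clean-like" the denoised spectrum is

def finetune_step(x, lr=0.005):
    """One gradient step on the DAE minimizing the frozen AE's
    reconstruction error of the DAE output."""
    global W1, b1, W2, b2
    h, y = dae(x)
    g, z = ae(y)
    # dL/dy: path back through the frozen AE, plus the direct -y term of ||z - y||^2.
    dz = (z - y) * z * (1 - z)
    dg = (V2.T @ dz) * g * (1 - g)
    dL_dy = V1.T @ dg + (y - z)
    # Backpropagate into the DAE and update only its parameters.
    dy = dL_dy * y * (1 - y)
    dh = (W2.T @ dy) * h * (1 - h)
    W2 -= lr * np.outer(dy, h); b2 -= lr * dy
    W1 -= lr * np.outer(dh, x); b1 -= lr * dh

x = rng.random(F)              # stand-in for one test mixture spectrum frame
before = ae_error(dae(x)[1])
for _ in range(100):
    finetune_step(x)
after = ae_error(dae(x)[1])
print(before, after)           # the AE error shrinks as the DAE adapts to x
```

Because the AE only reconstructs clean-source spectra well, driving its reconstruction error down pushes the DAE output toward the clean-source manifold for this particular test mixture, which is the fine-tuning signal the paper exploits.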

Original language: English (US)
Title of host publication: Latent Variable Analysis and Signal Separation - 12th International Conference, LVA/ICA 2015, Proceedings
Editors: Zbynĕk Koldovský, Emmanuel Vincent, Arie Yeredor, Petr Tichavský
Publisher: Springer-Verlag
Pages: 100-107
Number of pages: 8
ISBN (Print): 9783319224817
DOI: 10.1007/978-3-319-22482-4_12
State: Published - Jan 1 2015
Event: 12th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2015 - Liberec, Czech Republic
Duration: Aug 25 2015 - Aug 28 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9237
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2015
Country: Czech Republic
City: Liberec
Period: 8/25/15 - 8/28/15

Keywords

  • Autoencoders
  • Deep learning
  • Deep neural networks
  • Semi-supervised separation
  • Speech enhancement

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Cite this

Kim, M., & Smaragdis, P. (2015). Adaptive denoising autoencoders: A fine-tuning scheme to learn from test mixtures. In Z. Koldovský, E. Vincent, A. Yeredor, & P. Tichavský (Eds.), Latent Variable Analysis and Signal Separation - 12th International Conference, LVA/ICA 2015, Proceedings (pp. 100-107). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9237). Springer-Verlag. https://doi.org/10.1007/978-3-319-22482-4_12

@inproceedings{94865b197ac143fcb9d07ddc167a6644,
title = "Adaptive denoising autoencoders: A fine-tuning scheme to learn from test mixtures",
keywords = "Autoencoders, Deep learning, Deep neural networks, Semi-supervised separation, Speech enhancement",
author = "Minje Kim and Paris Smaragdis",
year = "2015",
month = "1",
day = "1",
doi = "10.1007/978-3-319-22482-4_12",
language = "English (US)",
isbn = "9783319224817",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag",
pages = "100--107",
editor = "Zbynĕk Koldovsk{\'y} and Emmanuel Vincent and Arie Yeredor and Petr Tichavsk{\'y}",
booktitle = "Latent Variable Analysis and Signal Separation - 12th International Conference, LVA/ICA 2015, Proceedings",

}
