Dynamic non-negative models for audio source separation

Paris Smaragdis, Gautham Mysore, Nasser Mohammadiha

Research output: Chapter in Book/Report/Conference proceeding > Chapter

Abstract

As seen so far, non-negative models can be quite powerful when it comes to resolving mixtures of sounds. However, in such models we often ignore temporal information, instead focusing on resolving each incoming spectrum independently. In this chapter we will present some methods that learn to incorporate the temporal aspects of sounds and use that information to perform improved separation. We will show three such models: a convolutive model that learns fixed temporal features, a hidden Markov model that learns state transitions and can incorporate language information, and finally a continuous dynamical model that learns how sounds evolve over time and is able to resolve cases where static information is not enough.

Original language: English (US)
Title of host publication: Signals and Communication Technology
Publisher: Springer
Pages: 49-71
Number of pages: 23
DOI: https://doi.org/10.1007/978-3-319-73031-8_3
State: Published - Jan 1 2018

Publication series

Name: Signals and Communication Technology
ISSN (Print): 1860-4862
ISSN (Electronic): 1860-4870

Fingerprint

Source separation
Acoustic waves
Information use
Hidden Markov models

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Control and Systems Engineering
  • Computer Networks and Communications

Cite this

Smaragdis, P., Mysore, G., & Mohammadiha, N. (2018). Dynamic non-negative models for audio source separation. In Signals and Communication Technology (pp. 49-71). (Signals and Communication Technology). Springer. https://doi.org/10.1007/978-3-319-73031-8_3

@inbook{b2470cef081e4b09a4db583b658cb800,
title = "Dynamic non-negative models for audio source separation",
abstract = "As seen so far, non-negative models can be quite powerful when it comes to resolving mixtures of sounds. However, in such models we often ignore temporal information, instead focusing on resolving each incoming spectrum independently. In this chapter we will present some methods that learn to incorporate the temporal aspects of sounds and use that information to perform improved separation. We will show three such models: a convolutive model that learns fixed temporal features, a hidden Markov model that learns state transitions and can incorporate language information, and finally a continuous dynamical model that learns how sounds evolve over time and is able to resolve cases where static information is not enough.",
author = "Paris Smaragdis and Gautham Mysore and Nasser Mohammadiha",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-319-73031-8_3",
language = "English (US)",
series = "Signals and Communication Technology",
publisher = "Springer",
pages = "49--71",
booktitle = "Signals and Communication Technology",

}

TY - CHAP

T1 - Dynamic non-negative models for audio source separation

AU - Smaragdis, Paris

AU - Mysore, Gautham

AU - Mohammadiha, Nasser

PY - 2018/1/1

Y1 - 2018/1/1

AB - As seen so far, non-negative models can be quite powerful when it comes to resolving mixtures of sounds. However, in such models we often ignore temporal information, instead focusing on resolving each incoming spectrum independently. In this chapter we will present some methods that learn to incorporate the temporal aspects of sounds and use that information to perform improved separation. We will show three such models: a convolutive model that learns fixed temporal features, a hidden Markov model that learns state transitions and can incorporate language information, and finally a continuous dynamical model that learns how sounds evolve over time and is able to resolve cases where static information is not enough.

UR - http://www.scopus.com/inward/record.url?scp=85063224425&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85063224425&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-73031-8_3

DO - 10.1007/978-3-319-73031-8_3

M3 - Chapter

AN - SCOPUS:85063224425

T3 - Signals and Communication Technology

SP - 49

EP - 71

BT - Signals and Communication Technology

PB - Springer

ER -