Non-negative hidden Markov modeling of audio with application to source separation

Gautham J. Mysore, Paris Smaragdis, Bhiksha Raj

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, there has been a great deal of work in modeling audio using non-negative matrix factorization and its probabilistic counterparts, as they yield rich models that are very useful for source separation and automatic music transcription. Given a sound source, these algorithms learn a dictionary of spectral vectors to best explain it. This dictionary is, however, learned in a manner that disregards a very important aspect of sound: its temporal structure. We propose a novel algorithm, the non-negative hidden Markov model (N-HMM), that extends the aforementioned models by jointly learning several small spectral dictionaries as well as a Markov chain that describes the structure of changes between these dictionaries. We also extend this algorithm to the non-negative factorial hidden Markov model (N-FHMM) to model sound mixtures, and demonstrate that it yields superior performance in single-channel source separation tasks.
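
As an illustration of the baseline the abstract refers to, the following is a minimal sketch of standard non-negative matrix factorization with KL-divergence multiplicative updates, which learns a single dictionary of spectral vectors from a magnitude spectrogram. It is not the N-HMM or N-FHMM proposed in the paper. The function names `nmf_kl` and `kl_divergence`, the parameter defaults, and the synthetic input are assumptions made for this example.

```python
import numpy as np

def nmf_kl(V, n_components, n_iter=200, seed=0, eps=1e-12):
    """Factor a non-negative matrix V (freq x frames) as W @ H.

    W holds the spectral dictionary (one spectral vector per column) and
    H holds the per-frame activation weights. Hypothetical helper: standard
    KL-divergence multiplicative updates, not the paper's algorithm.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Update activations with the dictionary held fixed.
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.sum(axis=0)[:, None] + eps)
        # Update the dictionary with the activations held fixed.
        R = V / (W @ H + eps)
        W *= (R @ H.T) / (H.sum(axis=1)[None, :] + eps)
    return W, H

def kl_divergence(V, V_hat, eps=1e-12):
    """Generalized KL divergence between V and its approximation V_hat."""
    return float(np.sum(V * np.log((V + eps) / (V_hat + eps)) - V + V_hat))

if __name__ == "__main__":
    # Synthetic stand-in for a magnitude spectrogram (assumed data).
    rng = np.random.default_rng(1)
    V = rng.random((6, 3)) @ rng.random((3, 20))
    W, H = nmf_kl(V, n_components=3)
    print("KL divergence after fitting:", kl_divergence(V, W @ H))
```

The objective in this sketch is a sum over frames, so it does not depend on the order of the columns of `V`. That is the limitation the abstract points to: nothing in the factorization captures how the spectrum evolves over time.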

Original language: English (US)
Title of host publication: Latent Variable Analysis and Signal Separation - 9th International Conference, LVA/ICA 2010, Proceedings
Pages: 140-148
Number of pages: 9
State: Published - 2010
Externally published: Yes
Event: 9th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2010 - St. Malo, France
Duration: Sep 27 2010 to Sep 30 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 6365 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 9th International Conference on Latent Variable Analysis and Signal Separation, LVA/ICA 2010
Country/Territory: France
City: St. Malo
Period: 9/27/10 to 9/30/10

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
