Robust source localization and enhancement with a probabilistic steered response power model

Johannes Traa, David Wingate, Noah D. Stein, Paris Smaragdis

Research output: Contribution to journal › Article › peer-review

Abstract

Source localization and enhancement are often treated separately in the array processing literature. One can apply steered response power (SRP) localization to determine the sources' directions of arrival (DOAs), followed by beamforming and Wiener post-filtering to isolate the sources from each other and from ambient interference. We show that when there is significant overlap between directional sources of interest in the time-frequency (TF) plane, traditional SRP localization breaks down. This may occur, for example, when the array is located near a reflector, significant early reflections are present, or the sources are harmonized. We propose a joint solution to the localization and enhancement problems via a probabilistic interpretation of the SRP function. We formulate optimization procedures for (1) a mixture of single-source SRP distributions (MoSRP) and (2) a multi-source SRP distribution (MultSRP). Unlike traditional localization, the latter approach explicitly models source overlap in the TF plane. Results show that the MultSRP model is capable of localizing sources with significant overlap in the TF domain and that either of the proposed methods outperforms standard SRP localization for multiple speakers.
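
For orientation, the conventional SRP step that the abstract takes as its baseline can be sketched roughly as follows. This is a minimal narrowband, far-field, delay-and-sum illustration, not the paper's probabilistic MoSRP or MultSRP model. The function names, the uniform linear array geometry, the 343 m/s sound speed, and the synthetic single-source data are all assumptions made for the example.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def steering_vector(mic_positions, angle_rad, freq_hz):
    """Far-field steering vector for microphones placed along the x-axis."""
    return [
        cmath.exp(-2j * math.pi * freq_hz * x * math.cos(angle_rad) / SPEED_OF_SOUND)
        for x in mic_positions
    ]


def srp(snapshots, mic_positions, freq_hz, candidate_angles):
    """Steered response power: mean of |a(theta)^H x|^2 over all snapshots."""
    powers = []
    for angle in candidate_angles:
        a = steering_vector(mic_positions, angle, freq_hz)
        total = 0.0
        for x in snapshots:
            # Delay-and-sum beamformer output for this steering direction.
            y = sum(ai.conjugate() * xi for ai, xi in zip(a, x))
            total += abs(y) ** 2
        powers.append(total / len(snapshots))
    return powers


def demo():
    """Simulate one source at 60 degrees and return the SRP estimate in degrees."""
    mics = [0.0, 0.04, 0.08, 0.12]  # hypothetical 4-mic array, 4 cm spacing
    freq_hz = 2000.0                # spacing is below half a wavelength here
    true_angle = math.radians(60)
    a0 = steering_vector(mics, true_angle, freq_hz)
    amplitudes = [1.0, 0.5 + 0.5j, -0.8j]  # made-up source coefficients
    snapshots = [[s * ai for ai in a0] for s in amplitudes]

    candidates = [math.radians(d) for d in range(181)]  # 1-degree search grid
    powers = srp(snapshots, mics, freq_hz, candidates)
    best = max(range(len(powers)), key=powers.__getitem__)
    return round(math.degrees(candidates[best]))


if __name__ == "__main__":
    print(demo())
```

With a single noiseless source, the peak of this power map falls at the simulated direction. The sketch scores each direction as if one source dominated every snapshot, which is the single-source assumption that the abstract says fails when sources overlap in the TF plane.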

Original language: English (US)
Pages (from-to): 493-503
Number of pages: 11
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 24
Issue number: 3
State: Published - Mar 2016

Keywords

  • Beamforming
  • Blind source separation
  • Source localization
  • Steered response power

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
