High/Low Model for Scalable Multimicrophone Enhancement of Speech Mixtures

Ryan M. Corey, Andrew C. Singer

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Many speech separation and enhancement methods take advantage of time-frequency sparsity by assuming that only one speech source in a mixture has nonzero power at each time and frequency. This “on/off” model is valuable for systems with more sources than microphones, but many methods that use it do not benefit from the spatial diversity of systems with large numbers of microphones. This work considers the high/low model, in which one source is strongest at each time-frequency index but all sources have nonzero power. A time-varying enhancement method using the high/low model combines the benefits of sparsity and spatial diversity and scales automatically with the number of microphones, resembling a time-frequency mask for underdetermined systems and a linear filter for overdetermined systems. The model is demonstrated using real-room data with up to 10 speech signals and between 1 and 160 microphones.
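To make the contrast concrete, here is a minimal NumPy sketch of the idea behind a high/low-style multichannel Wiener filter at a single time-frequency bin. This is an illustration under assumed known steering vectors and source powers, not the authors' exact estimation method; all variable names (`a`, `p`, `W`, `mask`) are hypothetical. With many microphones the filter is a spatial linear filter; in the single-microphone limit the same formula collapses to a soft time-frequency mask, which is the scaling behavior described in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: N sources, M microphones, one STFT bin.
# Under the high/low model, one source is "high" (dominant) but
# every source keeps nonzero power.
N, M = 3, 4
a = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))  # steering vectors (assumed known)
p = np.array([2.0, 0.3, 0.1])  # per-source powers: one high, the rest low but nonzero

# Mixture spatial covariance: sum_n p_n a_n a_n^H (small diagonal regularization).
R = sum(p[n] * np.outer(a[n], a[n].conj()) for n in range(N)) + 1e-6 * np.eye(M)

# Multichannel Wiener filter targeting source 0: W = R^{-1} R_0,
# applied to the mixture as W^H x at this bin.
R0 = p[0] * np.outer(a[0], a[0].conj())
W = np.linalg.solve(R, R0)

# Single-microphone limit: the same expression reduces to the scalar gain
# p_0 |a_0|^2 / sum_n p_n |a_n|^2, i.e. a soft time-frequency mask.
p_mic = p * np.abs(a[:, 0]) ** 2
mask = p_mic[0] / p_mic.sum()
```

Because all sources have nonzero power, the mask is strictly between 0 and 1 rather than binary, and the multichannel filter uses spatial diversity instead of simply keeping or discarding each bin.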

Original language: English (US)
Title of host publication: 29th European Signal Processing Conference, EUSIPCO 2021 - Proceedings
Publisher: European Signal Processing Conference, EUSIPCO
Number of pages: 5
ISBN (Electronic): 9789082797060
State: Published - 2021
Event: 29th European Signal Processing Conference, EUSIPCO 2021 - Dublin, Ireland
Duration: Aug 23 2021 – Aug 27 2021

Publication series

Name: European Signal Processing Conference
ISSN (Print): 2219-5491


Conference: 29th European Signal Processing Conference, EUSIPCO 2021


Keywords

  • Microphone arrays
  • Source separation
  • Speech enhancement

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
