Recognizing speech from simultaneous speakers

Bhiksha Raj, Rita Singh, Paris Smaragdis

Research output: Contribution to conference › Paper

Abstract

In this paper we present and evaluate factored methods for recognizing simultaneous speech from multiple speakers in single-channel recordings. Factored methods decompose the problem of jointly recognizing the speech of all speakers into separate recognition problems, one per speaker. To achieve this, the signal components of the target speaker must be enhanced in each case. We do this in two ways: with an NMF-based speaker separation algorithm that generates separated spectra for each speaker, and with a mask estimation method that generates spectral masks for each speaker. The masks must be used in conjunction with a missing-feature method that can recognize speech from partial spectral data. Experiments on synthetic mixtures of signals from the Wall Street Journal corpus show that both approaches can greatly improve the recognition of the individual signals in the mixture.
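
As a rough illustration of the two enhancement strategies named in the abstract, the Python sketch below separates a mixture magnitude spectrogram using non-negative matrix factorization with fixed per-speaker bases, and also derives a binary spectral mask for one speaker. It is not the authors' implementation. The KL-divergence multiplicative update, the pre-trained basis matrices `W_a` and `W_b`, the soft reconstruction, and the "larger estimate wins" mask rule are all assumptions chosen for brevity, and every function name is hypothetical.

```python
import numpy as np


def nmf_activations(V, W, n_iter=200, eps=1e-12):
    """Estimate non-negative activations H such that V is approximately W @ H.

    The bases W stay fixed; only H is updated, using the standard
    multiplicative rule for the KL divergence (an assumed choice).
    """
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    col_sums = W.sum(axis=0)[:, None] + eps
    for _ in range(n_iter):
        H *= (W.T @ (V / (W @ H + eps))) / col_sums
    return H


def separate(V, W_a, W_b, eps=1e-12):
    """Split mixture spectrogram V (freq x frames) into two speaker estimates.

    W_a and W_b are assumed to be bases learned beforehand from clean
    speech of speaker A and speaker B respectively.
    """
    W = np.hstack([W_a, W_b])
    H = nmf_activations(V, W)
    k = W_a.shape[1]
    S_a = W_a @ H[:k]          # part of the model explained by speaker A
    S_b = W_b @ H[k:]          # part of the model explained by speaker B
    total = S_a + S_b + eps
    # Soft reconstruction: share the observed mixture energy in proportion
    # to each speaker's modelled energy, so the estimates sum to V.
    est_a = V * S_a / total
    est_b = V * S_b / total
    # Binary mask: True where speaker A is estimated to dominate the bin.
    # A missing-feature recognizer would treat the False bins as unreliable.
    mask_a = S_a > S_b
    return est_a, est_b, mask_a


if __name__ == "__main__":
    # Synthetic stand-in data; real use would start from STFT magnitudes.
    rng = np.random.default_rng(1)
    W_a, W_b = rng.random((16, 3)), rng.random((16, 3))
    V = W_a @ rng.random((3, 10)) + W_b @ rng.random((3, 10))
    est_a, est_b, mask_a = separate(V, W_a, W_b)
    print(est_a.shape, est_b.shape, mask_a.mean())
```

The separated spectra (`est_a`, `est_b`) correspond to the first strategy, where a conventional recognizer is run on each estimate. The mask (`mask_a`) corresponds to the second, where the recognizer itself must cope with partial spectral data.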

Original language: English (US)
Pages: 3317-3320
Number of pages: 4
State: Published - Dec 1 2005
Externally published: Yes
Event: 9th European Conference on Speech Communication and Technology - Lisbon, Portugal
Duration: Sep 4 2005 to Sep 8 2005

Other

Other: 9th European Conference on Speech Communication and Technology
Country: Portugal
City: Lisbon
Period: 9/4/05 to 9/8/05

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Raj, B., Singh, R., & Smaragdis, P. (2005). Recognizing speech from simultaneous speakers. 3317-3320. Paper presented at 9th European Conference on Speech Communication and Technology, Lisbon, Portugal.