Abstract

The purpose of this paper is to provide insight into how the auditory system processes speech, by quantifying the nature of confusions among nonsense speech sounds. (1) The Miller and Nicely [J. Acoust. Soc. Am. 27(2), 338-352 (1955)] confusion matrix (CM) data are analyzed by plotting the CM elements $S_{i,j}(\mathrm{SNR})$ as a function of the signal-to-noise ratio (SNR). This allows robust clustering of perceptual feature (event) groups, which are not robustly defined by a single CM table, where the clusters depend on the sound order. (2) The SNR is then re-expressed as an articulation index (AI), which is used as the independent variable. The normalized log scores $\log(1 - S_{i,i}(\mathrm{AI}))$ and $\log(S_{i,j}(\mathrm{AI}))$, $j \neq i$, then become linear functions of AI on log-error versus AI plots. This linear dependence may be interpreted as an extension of the band-independence model of Fletcher. (3) The model formula for the average score in the finite-alphabet case, $P_c(\mathrm{AI}, H) = \frac{1}{N} \sum_{i=1}^{N} S_{i,i}$, is then modified to include the effect of the entropy $H$. Due to the grouping of sounds with increased SNR (and AI), the sound-group entropy $H_g$ plays a key role in this performance measure. (4) A parametric model for the confusions $S_{i,j}(\mathrm{AI}, H_g)$ is then described, which characterizes the confusions between competing sounds within a group.
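The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's analysis code. The count matrix is made up for illustration, the row convention (rows are spoken sounds, columns are reported sounds) is an assumption, and `fletcher_error` assumes the simple band-independence form e(AI) = e_min^AI with a hypothetical value of `e_min`.

```python
import math

def row_normalize(counts):
    """Turn response counts into scores S[i][j] whose rows each sum to 1.

    Assumed convention: row i is the spoken sound, column j the reported sound.
    """
    return [[c / sum(row) for c in row] for row in counts]

def average_score(S):
    """Average score Pc = (1/N) * sum_i S[i][i] for an N-sound alphabet."""
    n = len(S)
    return sum(S[i][i] for i in range(n)) / n

def log_errors(S):
    """Per-sound log error log10(1 - S[i][i]), the ordinate of a log-error plot."""
    return [math.log10(1.0 - S[i][i]) for i in range(len(S))]

def fletcher_error(ai, e_min):
    """Assumed band-independence form: error = e_min ** AI.

    Under this assumption log10(error) = AI * log10(e_min), a straight line in AI.
    """
    return e_min ** ai

# Hypothetical counts for a three-sound alphabet (not data from the paper).
counts = [
    [80, 15, 5],
    [10, 85, 5],
    [5, 5, 90],
]

S = row_normalize(counts)
pc = average_score(S)
errs = log_errors(S)

print("average score Pc:", pc)
print("log10 errors per sound:", errs)
print("assumed model error at AI = 0.5:", fletcher_error(0.5, 0.015))
```

Plotting `log_errors` for matrices measured at several AI values would show whether the points fall on a line, which is the behavior the abstract reports for the Miller and Nicely data.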

Original language: English (US)
Pages (from-to): 2212-2223
Number of pages: 12
Journal: Journal of the Acoustical Society of America
Volume: 117
Issue number: 4 I
State: Published - Apr 2005

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics

Title: Consonant recognition and the articulation index
