SALSA-LITE: A FAST AND EFFECTIVE FEATURE FOR POLYPHONIC SOUND EVENT LOCALIZATION AND DETECTION WITH MICROPHONE ARRAYS

Thi Ngoc Tho Nguyen, Douglas L. Jones, Karn N. Watcharasupat, Huy Phan, Woon Seng Gan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Polyphonic sound event localization and detection (SELD) has many practical applications in acoustic sensing and monitoring. However, the development of real-time SELD has been limited by the demanding computational requirement of most recent SELD systems. In this work, we introduce SALSA-Lite, a fast and effective feature for polyphonic SELD using microphone array inputs. SALSA-Lite is a lightweight variation of a previously proposed SALSA feature for polyphonic SELD. SALSA, which stands for Spatial Cue-Augmented Log-Spectrogram, consists of multichannel log-spectrograms stacked channelwise with the normalized principal eigenvectors of the spectrotemporally corresponding spatial covariance matrices. In contrast to SALSA, which uses eigenvector-based spatial features, SALSA-Lite uses normalized inter-channel phase differences as spatial features, allowing a 30-fold speedup compared to the original SALSA feature. Experimental results on the TAU-NIGENS Spatial Sound Events 2021 dataset showed that the SALSA-Lite feature achieved competitive performance compared to the full SALSA feature, and significantly outperformed the traditional feature set of multichannel log-mel spectrograms with generalized cross-correlation spectra. Specifically, using SALSA-Lite features increased localization-dependent F1 score and class-dependent localization recall by 15 % and 5 %, respectively, compared to using multichannel log-mel spectrograms with generalized cross-correlation spectra.
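The feature layout described in the abstract (per-channel log-spectrograms stacked channelwise with normalized inter-channel phase differences) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the sampling rate, the STFT settings, the choice of the first microphone as the reference channel, and the normalization factor -c / (2πf) are all assumptions made for illustration; the abstract states only that the phase differences are normalized.

```python
import numpy as np


def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT of a (channels, samples) array.

    Returns a complex array of shape (channels, frames, n_fft // 2 + 1).
    """
    window = np.hanning(n_fft)
    n_frames = 1 + (x.shape[1] - n_fft) // hop
    frames = np.stack(
        [x[:, i * hop:i * hop + n_fft] for i in range(n_frames)], axis=1
    )
    return np.fft.rfft(frames * window, axis=-1)


def salsa_lite_like(x, fs=24000, n_fft=512, hop=256, c=343.0, eps=1e-8):
    """Stack log-power spectrograms with normalized phase differences.

    x is a (M, samples) multichannel signal. The output has shape
    (2 * M - 1, frames, bins): M log-spectrogram channels followed by
    M - 1 phase-difference channels (hypothetical layout).
    """
    spec = stft(x, n_fft, hop)                       # (M, T, F), complex
    log_spec = np.log(np.abs(spec) ** 2 + eps)       # (M, T, F)

    # Phase difference of every other channel relative to channel 0.
    phase_diff = np.angle(np.conj(spec[:1]) * spec[1:])  # (M - 1, T, F)

    # Assumed normalization: divide by frequency so the value is roughly
    # frequency-independent for a given source direction. The DC bin is
    # left at zero to avoid dividing by zero.
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)       # (F,)
    scale = np.zeros_like(freqs)
    scale[1:] = -c / (2.0 * np.pi * freqs[1:])
    nipd = phase_diff * scale                        # (M - 1, T, F)

    return np.concatenate([log_spec, nipd], axis=0)


# Usage: one second of 4-channel noise at the assumed 24 kHz rate.
rng = np.random.default_rng(0)
features = salsa_lite_like(rng.standard_normal((4, 24000)))
print(features.shape)
```

Because the spatial channels need only an element-wise product and a phase angle per time-frequency bin, a sketch like this avoids the per-bin eigendecomposition that an eigenvector-based spatial feature requires, which is consistent with the speedup the abstract reports.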

Original language: English (US)
Title of host publication: 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2022 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 716-720
Number of pages: 5
ISBN (Electronic): 9781665405409
State: Published - 2022
Event: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022 - Hybrid, Singapore
Duration: May 22 2022 to May 27 2022

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2022-May
ISSN (Print): 1520-6149

Conference

Conference: 2022 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022
Country/Territory: Singapore
City: Hybrid
Period: 5/22/22 to 5/27/22

Keywords

  • Feature extraction
  • detection
  • microphone array
  • sound event localization

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
