Boosted Locality Sensitive Hashing: Discriminative, Efficient, and Scalable Binary Codes for Source Separation

Sunwoo Kim, Minje Kim

Research output: Contribution to journalArticlepeer-review

Abstract

We propose a novel adaptive boosting approach to learn discriminative binary hash codes, boosted locality sensitive hashing (BLSH), that can represent audio spectra efficiently. We aim to use the learned hash codes in the single-channel speech denoising task by designing a nearest neighborhood search method that operates in the hashed feature space. To achieve the optimal denoising results given the highly compact binary feature representation, our proposed BLSH algorithm learns simple logistic regressors as the weak learners in an incremental way (i.e., one by one) so that each weak learner is trained to complement the mistake its predecessors have made. Upon testing, their binary classification results transform each spectrum of noisy speech into a bit string, where the bits are ordered based on their significance, adding scalability to the denoising system. Simple bitwise operations calculate Hamming distance to find the <italic>K</italic>-nearest matching hashed frames in the dictionary of training noisy speech spectra, whose associated ideal binary masks are averaged to estimate the denoising mask for that test mixture. In contrast to the locality sensitive hashing method&#x0027;s random projections, our proposed supervised learning algorithm trains the projections such that the distance between the self-similarity matrix of the hash codes and that of the original spectra is minimized. Likewise, the process conceptually aligns to the Adaboost algorithm, although ours is specialized in learning binary features for source separation rather than classification. Experimental results on speech denoising suggest that the BLSH algorithm learns more discriminative representations than Fourier or mel spectra and the nonlinear kernels derived from them. Our compact binary representation is expected to facilitate model deployment onto resource-constrained environments, where comprehensive models (e.g., deep neural networks) are unaffordable.

Original languageEnglish (US)
Pages (from-to)1-14
Number of pages14
JournalIEEE/ACM Transactions on Audio Speech and Language Processing
Volume30
DOIs
StatePublished - 2022
Externally publishedYes

Keywords

  • AdaBoost
  • Codes
  • Dictionaries
  • Kernel Methods
  • Locality Sensitive Hashing
  • Noise reduction
  • Source separation
  • Speech coding
  • Speech Enhancement
  • Speech processing
  • Training

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Boosted Locality Sensitive Hashing: Discriminative, Efficient, and Scalable Binary Codes for Source Separation'. Together they form a unique fingerprint.

Cite this