Detecting audio attacks on ASR systems with dropout uncertainty

Tejas Jayashankar, Jonathan Le Roux, Pierre Moulin

Research output: Contribution to journal › Conference article › peer-review

Abstract

Various adversarial audio attacks have recently been developed to fool automatic speech recognition (ASR) systems. Here, we propose a defense against such attacks based on the uncertainty introduced by dropout in neural networks. We show that our defense can detect attacks created through optimized perturbations and frequency masking on a state-of-the-art end-to-end ASR system. Furthermore, the defense can be made robust against attacks that are immune to noise reduction. We test our defense on Mozilla's CommonVoice dataset, the UrbanSound dataset, and an excerpt of the LibriSpeech dataset, showing that it achieves high detection accuracy in a wide range of scenarios.
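The defense rests on Monte-Carlo dropout: the ASR model is run several times with dropout kept active at inference, and the spread of the resulting output distributions serves as an uncertainty signal, with adversarial inputs expected to yield an atypical uncertainty profile. The sketch below illustrates this general idea in PyTorch; the model interface, the number of stochastic passes, the variance-based score, and the thresholding step are illustrative assumptions, not the paper's exact uncertainty statistic.

    import torch

    def mc_dropout_uncertainty(model, features, num_passes=30):
        """Estimate predictive uncertainty by keeping dropout active at
        inference and measuring how much the per-frame output distributions
        vary across stochastic forward passes (Monte-Carlo dropout)."""
        model.train()  # keep dropout layers stochastic during inference
        with torch.no_grad():
            # Each pass: a (time, vocab) posterior over output tokens.
            probs = torch.stack([
                torch.softmax(model(features), dim=-1)
                for _ in range(num_passes)
            ])                                  # (num_passes, time, vocab)
        mean_probs = probs.mean(dim=0)          # average posterior across passes
        # Per-frame variance of the posteriors, summed over the vocabulary,
        # then averaged over time: a simple scalar uncertainty score.
        score = probs.var(dim=0).sum(dim=-1).mean()
        return score.item(), mean_probs

    def is_adversarial(score, threshold):
        """Flag an utterance whose uncertainty score exceeds a threshold
        calibrated on clean, benign speech."""
        return score > threshold

In such a scheme the detection threshold would be calibrated on uncertainty scores collected from clean speech, so that benign utterances fall below it while perturbed ones are flagged.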

Original language: English (US)
Pages (from-to): 4671-4675
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2020-October
DOIs
State: Published - 2020
Event: 21st Annual Conference of the International Speech Communication Association, INTERSPEECH 2020 - Shanghai, China
Duration: Oct 25 2020 - Oct 29 2020

Keywords

  • Adversarial machine learning
  • Audio attack
  • Automatic speech recognition
  • Dropout
  • Noise reduction
  • Uncertainty distribution

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation

