Dysarthric Speech Conformer: Adaptation for Sequence-to-Sequence Dysarthric Speech Recognition

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Automatic Speech Recognition (ASR) holds immense potential to provide an effective interface for assistive technologies, but its performance remains unsatisfactory for people with speech impairments such as dysarthria. Existing ASR systems struggle to accurately recognize dysarthric speech due to the significant speaker variability in dysarthric speech and the scarcity of dysarthric datasets. In this study, we propose a two-phase adaptation pipeline based on the Conformer architecture that leverages typical speech to transfer to individualized ASR models for dysarthric speakers. ASR performance is evaluated for isolated words and continuous sentences, yielding an average Word Error Rate of 21.5% on the UASpeech dataset and 12.7% on the TORGO dataset. Selectively freezing decoder layers was more often successful than selectively freezing encoder layers, suggesting that optimal performance is achieved by focusing the adaptation on the acoustic information contained in the encoder.
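The abstract reports results as Word Error Rate (WER). This record does not describe the scoring tool or any text normalization the authors used, so the following is only a minimal sketch of the standard definition: word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / reference length.

    Illustrative helper (hypothetical name); tokenization is plain
    whitespace splitting, which is an assumption, not the paper's setup.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

For an isolated-word utterance the reference has a single word, so each utterance scores either 0.0 (correct) or at least 1.0 (incorrect). For continuous sentences the score depends on how many reference words must be edited. WER can exceed 1.0 when the hypothesis contains many inserted words.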

Original language: English (US)
Title of host publication: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Proceedings
Editors: Bhaskar D Rao, Isabel Trancoso, Gaurav Sharma, Neelesh B. Mehta
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9798350368741
State: Published - 2025
Event: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025 - Hyderabad, India
Duration: Apr 6 2025 → Apr 11 2025

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025
Country/Territory: India
City: Hyderabad
Period: 4/6/25 → 4/11/25

Keywords

  • automatic speech recognition
  • dysarthria
  • speaker adaptation
  • transfer learning

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
