Cross-lingual articulatory feature information transfer for speech recognition using recurrent progressive neural networks

Mahir Morshed, Mark Hasegawa-Johnson

Research output: Contribution to journal › Conference article › peer-review

Abstract

This paper proposes a system for the lateral transfer of information from end-to-end neural networks that recognize articulatory feature classes to similarly structured networks that recognize phone tokens. Inspired primarily by the progressive neural network scheme, the system connects recurrent layers of feature detectors pre-trained on a base language to recurrent layers of a phone recognizer for a different target language. Initial experiments used detectors trained on Bengali speech for four articulatory feature classes (consonant place, consonant manner, vowel height, and vowel backness), attached to phone recognizers for four other Asian languages (Javanese, Nepali, Sinhalese, and Sundanese). These experiments do not currently suggest consistent performance improvements across different low-resource settings for the target languages, irrespective of their genealogic or phonological relatedness to Bengali. They do, however, suggest the need for further trials with different language sets, altered data sources and data configurations, and slightly altered network setups.
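The lateral connection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the plain tanh recurrent layer, the layer sizes, the random (untrained) weights, and the names `RecurrentLayer`, `phone_l1`, and `phone_l2` are all assumptions made for illustration. Following the general progressive neural network scheme, the second layer of a target-language phone recognizer receives, at every frame, the hidden states of four frozen base-language detectors in addition to its own lower layer.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (stands in for trained weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class RecurrentLayer:
    """Plain tanh recurrent layer; an assumed stand-in for the paper's cells."""

    def __init__(self, n_in, n_hidden, rng, lateral_sizes=()):
        self.n_hidden = n_hidden
        self.w_in = make_matrix(n_hidden, n_in, rng)
        self.w_rec = make_matrix(n_hidden, n_hidden, rng)
        # One adapter matrix per source column feeding this layer laterally.
        self.laterals = [make_matrix(n_hidden, s, rng) for s in lateral_sizes]

    def run(self, inputs, lateral_inputs=()):
        """Return the hidden state at every frame of the input sequence."""
        h = [0.0] * self.n_hidden
        outputs = []
        for t, x in enumerate(inputs):
            pre = [a + b for a, b in
                   zip(matvec(self.w_in, x), matvec(self.w_rec, h))]
            # Add the contribution of each frozen detector at frame t.
            for adapter, seq in zip(self.laterals, lateral_inputs):
                pre = [p + q for p, q in zip(pre, matvec(adapter, seq[t]))]
            h = [math.tanh(p) for p in pre]
            outputs.append(h)
        return outputs


rng = random.Random(0)
N_ACOUSTIC, HIDDEN, FRAMES = 4, 3, 5  # hypothetical sizes
CLASSES = ["consonant place", "consonant manner",
           "vowel height", "vowel backness"]

# Base-language detectors: one column per articulatory feature class.
# In the described system these would be pre-trained; here they are random.
detectors = {name: RecurrentLayer(N_ACOUSTIC, HIDDEN, rng) for name in CLASSES}

# Target-language phone recognizer: layer 2 has one lateral input per detector.
phone_l1 = RecurrentLayer(N_ACOUSTIC, HIDDEN, rng)
phone_l2 = RecurrentLayer(HIDDEN, HIDDEN, rng,
                          lateral_sizes=[HIDDEN] * len(CLASSES))

frames = [[rng.uniform(-1, 1) for _ in range(N_ACOUSTIC)] for _ in range(FRAMES)]
detector_states = [detectors[name].run(frames) for name in CLASSES]

h1 = phone_l1.run(frames)
h2 = phone_l2.run(h1, lateral_inputs=detector_states)
h2_no_lateral = phone_l2.run(h1)  # same layer with the laterals switched off
```

In a trained system only the phone recognizer and the adapter matrices would be updated, with the detector weights held fixed; comparing `h2` with `h2_no_lateral` shows how much the transferred states change the recognizer's hidden representation.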

Original language: English (US)
Pages (from-to): 2298-2302
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
Volume: 2022-September
State: Published - 2022
Event: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022 - Incheon, Korea, Republic of
Duration: Sep 18 2022 → Sep 22 2022

Keywords

  • articulatory feature detection
  • progressive neural networks
  • transfer learning

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
