Keyphrases
Automatic Speech Recognition
51%
Speech Recognition
38%
Audio
32%
Phoneme
27%
Prosody
22%
Hidden Markov Model
20%
Error Rate
20%
Target Language
20%
Speech Recognition System
19%
Gaussian Mixture Model
17%
Acoustics
16%
Mismatched Crowdsourcing
16%
Acoustic Modeling
16%
Cross-lingual
15%
Vowels
14%
Under-resourced Languages
14%
Acoustic Features
14%
Speech Corpus
14%
Consonants
13%
Word Error Rate
13%
Dysarthria
11%
Deep Neural Network
11%
Adaptation
11%
Acoustic Cues
11%
Recognition Accuracy
11%
Utterance
11%
Language Model
10%
Probabilistic Transcription
10%
Phonetics
10%
Acoustic Event Detection
10%
Support Vector Machine
9%
Transcriber
9%
Dysarthric Speech
9%
Mismatched Transcription
9%
American English
9%
Speech Technology
9%
Speech Signal
9%
Token
8%
Recognizer
8%
Hindi
8%
Training Data
8%
Unlabeled Data
8%
Distinctive Features
8%
Vector Representation
8%
Low-resource
8%
TIMIT
8%
Multi-task Learning
8%
Labeled Data
7%
Landmark-based
7%
Spectrogram
7%
Computer Science
Speech Recognition
100%
Speech Recognition System
31%
Gaussian Mixture Model
21%
Target Language
20%
Support Vector Machine
17%
Text To Speech
16%
Word Error Rate
16%
Language Modeling
16%
Annotation
13%
Deep Neural Network
13%
Acoustic Feature
12%
Speech Recognizer
12%
Speech Corpus
12%
Recognition Accuracy
12%
Convolutional Neural Network
11%
Distinctive Feature
11%
Speech Synthesis
11%
Mutual Information
10%
Recognizer
10%
Neural Network
10%
Training Data
10%
Experimental Result
10%
Spoken Language
10%
Phonology
9%
Unlabeled Data
9%
Spontaneous Speech
8%
Event Detection
8%
Information Retrieval
8%
Maximum-likelihood
8%
Source Language
8%
Dysarthric Speech Recognition
8%
Multitask Learning
7%
Feature Extraction
7%
Human Speech Perception
7%
Large Language Model
6%
Language Understanding
6%
Art Performance
6%
Self-Supervised Learning
6%
Speech Understanding
6%
Transfer Learning
6%
Speech Processing
6%
Low-resource Languages
5%
Linear Discriminant Analysis
5%
Information Criterion
5%
Dynamic Bayesian Network
5%
Data Augmentation
5%
Representation Learning
5%
Visual Feature
5%
Machine Translation
5%
Native Language
5%