Performance Improvement of Probabilistic Transcriptions with Language-specific Constraints

Xiang Kong, Preethi Jyothi, Mark Allan Hasegawa-Johnson

Research output: Contribution to journal › Conference article

Abstract

This article describes a method for reducing the error rate of probabilistic phone-based transcriptions resulting from mismatched crowdsourcing by using language-specific constraints to post-process the phone sequence. In the scenario under consideration, no native-language transcriptions or pronunciation dictionary are available in the test language; instead, the available resources include non-native transcriptions, a rudimentary rule-based G2P, and a list of orthographic word forms mined from the internet. The proposed solution post-processes non-native transcriptions by converting them to test-language orthography, composing with test-language word forms, then converting back to a phone string. Experiments demonstrate that this method reduces the phone error rate of the transcription by 22% on an independent evaluation-test dataset.
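
The three post-processing steps named in the abstract (phones to orthography, constraint by known word forms, orthography back to phones) can be illustrated with a toy sketch. This is not the authors' implementation: the letter-to-phone table, the word list, and the function names are hypothetical, and nearest-match lookup with Python's difflib stands in for the composition step, whose details the abstract does not give.

```python
from difflib import get_close_matches

# Hypothetical one-letter-per-phone rules standing in for a rudimentary rule-based G2P.
G2P_RULES = {"a": "AA", "i": "IY", "k": "K", "m": "M", "s": "S", "t": "T"}
# Inverse table used to spell a phone string in the test-language orthography.
P2G_RULES = {phone: letter for letter, phone in G2P_RULES.items()}


def phones_to_orthography(phones):
    """Spell a phone sequence with the inverse of the toy G2P rules."""
    return "".join(P2G_RULES[p] for p in phones)


def orthography_to_phones(word):
    """Apply the toy G2P rules letter by letter."""
    return [G2P_RULES[ch] for ch in word]


def constrain(phones, word_list):
    """Snap a noisy phone string to the closest known word form.

    Assumption: nearest-string matching replaces the composition with
    word forms described in the abstract.
    """
    spelled = phones_to_orthography(phones)
    matches = get_close_matches(spelled, word_list, n=1, cutoff=0.0)
    best = matches[0] if matches else spelled
    return orthography_to_phones(best)


# Hypothetical word forms "mined from the internet".
WORD_FORMS = ["kita", "sama", "mati"]
corrected = constrain(["K", "IY", "T", "IY"], WORD_FORMS)
```

Here the noisy phone string spells a non-word, so the sketch replaces it with the phones of the most similar listed word form. A probabilistic transcription would carry weights over many candidate phone strings rather than the single string used here.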

Original language: English (US)
Pages (from-to): 30-36
Number of pages: 7
Journal: Procedia Computer Science
Volume: 81
DOI: 10.1016/j.procs.2016.04.026
State: Published - Jan 1 2016
Event: 5th Workshop on Spoken Language Technologies for Under-resourced languages, SLTU 2016 - Yogyakarta, Indonesia
Duration: May 9 2016 to May 12 2016

Fingerprint

Transcription
Glossaries
Internet
Experiments

Keywords

  • G2P
  • automatic speech recognition resources
  • mismatched crowdsourcing
  • probabilistic transcription

ASJC Scopus subject areas

  • Computer Science (all)

Cite this

Performance Improvement of Probabilistic Transcriptions with Language-specific Constraints. / Kong, Xiang; Jyothi, Preethi; Hasegawa-Johnson, Mark Allan.

In: Procedia Computer Science, Vol. 81, 01.01.2016, p. 30-36.

@article{29129a1f7b9d48bca3dbb522cd0cdd22,
title = "Performance Improvement of Probabilistic Transcriptions with Language-specific Constraints",
abstract = "This article describes a method for reducing the error rate of probabilistic phone-based transcriptions resulting from mismatched crowdsourcing by using language-specific constraints to post-process the phone sequence. In the scenario under consideration, there are no native-language transcriptions or pronunciation dictionary available in the test language; instead, available resources include non-native transcriptions, a rudimentary rule-based G2P, and a list of orthographic word forms mined from the internet. The proposed solution post-processes non-native transcriptions by converting them to test-language orthography, composing with test-language word forms, then converting back to a phone string. Experiments demonstrate that the phone error rate of the transcription is reduced, using this method, by 22{\%} on an independent evaluation-test dataset.",
keywords = "G2P, automatic speech recognition resources, mismatched crowdsourcing, probabilistic transcription",
author = "Kong, Xiang and Jyothi, Preethi and Hasegawa-Johnson, {Mark Allan}",
year = "2016",
month = "1",
day = "1",
doi = "10.1016/j.procs.2016.04.026",
language = "English (US)",
volume = "81",
pages = "30--36",
journal = "Procedia Computer Science",
issn = "1877-0509",
publisher = "Elsevier BV",
}

TY - JOUR

T1 - Performance Improvement of Probabilistic Transcriptions with Language-specific Constraints

AU - Kong, Xiang

AU - Jyothi, Preethi

AU - Hasegawa-Johnson, Mark Allan

PY - 2016/1/1

Y1 - 2016/1/1

N2 - This article describes a method for reducing the error rate of probabilistic phone-based transcriptions resulting from mismatched crowdsourcing by using language-specific constraints to post-process the phone sequence. In the scenario under consideration, there are no native-language transcriptions or pronunciation dictionary available in the test language; instead, available resources include non-native transcriptions, a rudimentary rule-based G2P, and a list of orthographic word forms mined from the internet. The proposed solution post-processes non-native transcriptions by converting them to test-language orthography, composing with test-language word forms, then converting back to a phone string. Experiments demonstrate that the phone error rate of the transcription is reduced, using this method, by 22% on an independent evaluation-test dataset.

AB - This article describes a method for reducing the error rate of probabilistic phone-based transcriptions resulting from mismatched crowdsourcing by using language-specific constraints to post-process the phone sequence. In the scenario under consideration, there are no native-language transcriptions or pronunciation dictionary available in the test language; instead, available resources include non-native transcriptions, a rudimentary rule-based G2P, and a list of orthographic word forms mined from the internet. The proposed solution post-processes non-native transcriptions by converting them to test-language orthography, composing with test-language word forms, then converting back to a phone string. Experiments demonstrate that the phone error rate of the transcription is reduced, using this method, by 22% on an independent evaluation-test dataset.

KW - G2P

KW - automatic speech recognition resources

KW - mismatched crowdsourcing

KW - probabilistic transcription

UR - http://www.scopus.com/inward/record.url?scp=84976407396&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84976407396&partnerID=8YFLogxK

U2 - 10.1016/j.procs.2016.04.026

DO - 10.1016/j.procs.2016.04.026

M3 - Conference article

AN - SCOPUS:84976407396

VL - 81

SP - 30

EP - 36

JO - Procedia Computer Science

JF - Procedia Computer Science

SN - 1877-0509

ER -