Detecting Failures of Neural Machine Translation in the Absence of Reference Translations

Wenyu Wang, Wujie Zheng, Dian Liu, Changrong Zhang, Qinsong Zeng, Yuetang Deng, Wei Yang, Pinjia He, Tao Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Despite being widely adopted recently, Neural Machine Translation (NMT) systems are often found to produce translation failures in their outputs. Developers have been relying on in-house system testing for quality assurance of NMT. This testing methodology requires human-constructed reference translations as the ground truth (test oracle) for sample natural language inputs. The methodology has proven useful for quickly improving an NMT system in early development stages. However, in industrial settings, it is desirable to detect translation failures without relying on reference translations, enabling further improvements in translation quality in both industrial development and production environments. Aiming for a practical and scalable solution to this demand in industrial settings, in this paper, we propose a new approach for automatically identifying translation failures without requiring reference translations for a translation task. Our approach focuses on a property of natural language translation that can be checked systematically by using information from both the test inputs (i.e., the texts to be translated) and the test outputs (i.e., the translations under inspection) of the NMT system. Our evaluation on real-world datasets shows that our approach can effectively detect property violations as translation failures. By deploying our approach in the translation service of WeChat (a messenger app with more than one billion monthly active users), we show that our approach is both practical and scalable in industrial settings.

Original language: English (US)
Title of host publication: Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks - DSN 2019 Industry Track
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1-4
Number of pages: 4
ISBN (Electronic): 9781728130323
State: Published - Jun 2019
Event: 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Industry Track, DSN-Industry Track 2019 - Portland, United States
Duration: Jun 24 2019 to Jun 27 2019

Publication series

Name: Proceedings - 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks - DSN 2019 Industry Track

Conference

Conference: 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks - Industry Track, DSN-Industry Track 2019
Country: United States
City: Portland
Period: 6/24/19 to 6/27/19

Keywords

  • ML quality assurance
  • failure detection
  • neural machine translation

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems and Management
  • Safety, Risk, Reliability and Quality
  • Industrial and Manufacturing Engineering
