Abstract
Multi-parallel corpora provide a potentially rich resource for machine translation. This paper surveys existing methods for utilizing such resources, including hypothesis ranking and system combination techniques. We find that despite significant research into system combination, relatively little is known about how best to translate when multiple parallel source languages are available. We provide results to show that the MAX multilingual multi-source hypothesis ranking method presented by Och and Ney (2001) does not reliably improve translation quality when a broad range of language pairs is considered. We also show that the PROD multilingual multi-source hypothesis ranking method of Och and Ney (2001) cannot be used with standard phrase-based translation engines, due to a high number of unreachable hypotheses. Finally, we present an oracle experiment which shows that current hypothesis ranking methods fall far short of the best results reachable via sentence-level ranking.
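To make the three ranking ideas named in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual reading of Och and Ney (2001), in which MAX selects the hypothesis with the highest score from any single source language and PROD selects the hypothesis with the highest combined score across all source languages. The data layout (`lm`, `tm`), the function names, and the `metric` callback are hypothetical and chosen only for illustration.

```python
import math
from typing import Callable, Dict, List, Optional

# Assumed layout (hypothetical):
#   lm[hyp]      = log p(e)        language-model score of target hypothesis e
#   tm[src][hyp] = log p(f_src|e)  translation-model score for source language src
LM = Dict[str, float]
TM = Dict[str, Dict[str, float]]


def rank_max(lm: LM, tm: TM) -> Optional[str]:
    """MAX: best hypothesis according to any single source language."""
    best_hyp, best_score = None, -math.inf
    for per_source in tm.values():
        for hyp, tm_logp in per_source.items():
            score = lm[hyp] + tm_logp
            if score > best_score:
                best_hyp, best_score = hyp, score
    return best_hyp


def rank_prod(lm: LM, tm: TM) -> Optional[str]:
    """PROD: best hypothesis under the product of all source models.

    Only hypotheses scored by every source system can be ranked. If the
    systems share no hypothesis, nothing is reachable and None is returned.
    """
    per_source = list(tm.values())
    if not per_source:
        return None
    common = set(per_source[0]).intersection(*per_source[1:])
    if not common:
        return None
    # Product of probabilities becomes a sum in log space.
    return max(common, key=lambda h: lm[h] + sum(s[h] for s in per_source))


def oracle_pick(candidates: List[str], reference: str,
                metric: Callable[[str, str], float]) -> str:
    """Oracle: pick the candidate that scores best against the reference."""
    return max(candidates, key=lambda c: metric(c, reference))
```

The `None` branch in `rank_prod` illustrates the reachability problem the abstract describes: when independent systems produce disjoint candidate sets, no hypothesis has a score under every source model. The oracle function uses the reference translation, so it gives an upper bound on sentence-level ranking rather than a usable method.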
Original language | English (US) |
---|---|
State | Published - 2008 |
Externally published | Yes |
Event | 8th Biennial Conference of the Association for Machine Translation in the Americas, AMTA 2008 - Waikiki, HI, United States; Duration: Oct 21 2008 → Oct 25 2008 |
Other
Other | 8th Biennial Conference of the Association for Machine Translation in the Americas, AMTA 2008 |
---|---|
Country/Territory | United States |
City | Waikiki, HI |
Period | 10/21/08 → 10/25/08 |
ASJC Scopus subject areas
- Language and Linguistics
- Human-Computer Interaction
- Software