Abstract
Phrase-based Statistical Machine Translation systems have been extended with a second, dynamic phrase table for multiple purposes.
Promising results have been reported for hybrid or multi-engine machine translation, i.e.\ building a phrase table from the knowledge of external MT systems, and for online learning.
We argue that prior research scores dynamic phrase tables suboptimally: such tables may be small, which makes Maximum Likelihood Estimation of translation probabilities unreliable.
We propose basing the scores on frequencies from both the dynamic corpus and the primary corpus instead, and show that this modification significantly increases performance.
We also explore the combination of multi-engine MT and online learning.