Machine Translation Limits in Localization Strategy
The limits of machine translation (MT) are easy to overlook when you’re anxious to enter a market. MT is a useful tool, but it’s not a total solution. As long as you treat it as a localization aid, you can enjoy its benefits while keeping standards high.
MT and human translation aren’t mutually exclusive. MT is a productivity tool: it can help translators work faster and produce higher-quality results. In some cases, an expertly built system can even automate significant portions of the process. However, it’s rarely a standalone solution to localization, which is why it’s wise to understand its limitations.
Enjoying Automation Within the Limits of Machine Translation
The limits of machine translation often come hand-in-hand with limited data. An MT engine is trained on existing translations so that it can suggest output for translators to review. To create an effective machine translation program, you’re going to need at least one million words of high-quality, human-translated content.
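As a rough illustration of that threshold, here is a minimal Python sketch that counts the source-side words in a set of translated segment pairs. The function names, the `MIN_WORDS` constant, and the whitespace-based word count are assumptions made for this example, not part of any particular MT toolkit, and whitespace counting does not work for languages that are not space-delimited.

```python
from typing import Iterable, Tuple

# Threshold taken from the text above: roughly one million words of
# high-quality, human-translated content. The name is hypothetical.
MIN_WORDS = 1_000_000


def count_source_words(pairs: Iterable[Tuple[str, str]]) -> int:
    """Count whitespace-separated source words across (source, target) pairs.

    Pairs with an empty source or target are skipped, because a segment
    without a translation cannot serve as a training example.
    """
    total = 0
    for source, target in pairs:
        if source.strip() and target.strip():
            total += len(source.split())
    return total


def has_enough_training_data(
    pairs: Iterable[Tuple[str, str]], min_words: int = MIN_WORDS
) -> bool:
    """Return True when the usable corpus reaches the word threshold."""
    return count_source_words(pairs) >= min_words


# Tiny made-up sample; the second pair has no translation and is ignored.
sample = [
    ("Save your changes", "Guarde sus cambios"),
    ("Cancel", ""),
    ("Open the file", "Abra el archivo"),
]
print(count_source_words(sample), has_enough_training_data(sample))
```

A count like this only measures volume. It says nothing about whether the translations are high quality, which the threshold also requires.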
That’s the biggest problem most companies run into: they don’t have access to this training data, otherwise known as a text corpus (plural: corpora).
Even if they do have data to train these engines, it’s often extremely subject-matter specific, as with technical documentation. Because that content is concise, factual, and highly patterned, an engine trained on it can automate the translation of many similar pieces of content.
That’s why MT is typically best suited to low-value material: content that isn’t viewed often and isn’t designed to have a lot of “personality.”
On top of that, most people who look to MT to automate translation do so because they want to save money. However, to train an engine, they’d have to invest heavily in content creation and data mining to build the system effectively, which would kill any ROI from the project.
The budget is where many MT projects fall apart, as companies learn that building the system would cost more than having human translators do the work.
Finally, one of the most significant limits of machine translation comes from the human translators themselves. There was a lot of resistance to MT when it first came out because of the concept of what’s “good enough.” Translators were told that they only had to “tweak” MT content to make it better but often were not given a benchmark for what qualified as good. On top of that, these tools reduced their pay rates and increased their productivity targets. As a result, any time you use MT and supplement it with post-editing, you’ll want to set clear criteria and standards for the finished content.
Grading Machine Translation for Quality
Another limit of machine translation is the lack of a standard way to determine quality. The bilingual evaluation understudy, or BLEU, is the most commonly accepted benchmark, but it suffers from some serious flaws. Because it scores MT output by how closely it overlaps with fixed reference translations, it measures adherence to those references more than it measures real improvement in quality. For that reason, the BLEU score should never be the sole criterion for determining translation success.
Translation standards for success should come from objective, human-centric criteria like:
- Did revenue go down or up following the use of the MT content?
- Were there fewer complaints from users?
- Did users navigate the software better and move further along?
- Did the edit distance (how much human editing it takes to turn the machine translation into successful content) decrease?
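The last criterion can be tracked with a simple calculation. The sketch below is one possible way to do it, not a standard tool: it computes a word-level Levenshtein distance between raw MT output and the post-edited version, then normalizes by the length of the post-edited text. The function names and the choice of normalization are assumptions made for illustration.

```python
def word_edit_distance(mt_output: str, post_edited: str) -> int:
    """Word-level Levenshtein distance: the minimum number of word
    insertions, deletions, and substitutions needed to turn the MT
    output into the post-edited text."""
    a, b = mt_output.split(), post_edited.split()
    # prev[j] holds the distance between the words of `a` processed so far
    # and the first j words of `b`.
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(
                prev[j] + 1,         # delete word_a
                curr[j - 1] + 1,     # insert word_b
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


def post_edit_rate(mt_output: str, post_edited: str) -> float:
    """Edits per post-edited word (hypothetical metric name).

    Lower values mean the translator changed less of the MT output.
    An empty post-edited text is treated as length 1 to avoid dividing
    by zero.
    """
    edited_len = max(len(post_edited.split()), 1)
    return word_edit_distance(mt_output, post_edited) / edited_len


# Made-up example: one word had to be corrected out of five.
raw = "click the button to saving"
edited = "click the button to save"
print(word_edit_distance(raw, edited), post_edit_rate(raw, edited))
```

Averaging this rate over a batch of segments and comparing batches over time gives a rough signal of whether post-editing effort is falling.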
The best metrics for understanding the limits of your machine translation are user-driven. When complaints go down, revenue goes up, and people navigate further into the software, it’s easy to see they’re understanding the content and engaging with it. Also, as more material is translated, you gain more data, which will improve training for the engine. As long as this system is a tool, rather than an end-all solution, most MT efforts will be worth the investment despite their limitations.
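To make the BLEU caveat from this section concrete, the sketch below computes clipped unigram precision, which is only one ingredient of BLEU. It deliberately leaves out the longer n-grams and the brevity penalty, so it is not a BLEU implementation. The function name and the example sentences are made up for illustration. It shows how a reasonable paraphrase is penalized simply for using different words than a single fixed reference.

```python
from collections import Counter


def clipped_unigram_precision(candidate: str, reference: str) -> float:
    """Share of candidate words that also appear in the reference.

    Each word is credited at most as many times as it occurs in the
    reference (the "clipping" step).
    """
    cand_words = candidate.lower().split()
    if not cand_words:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(cand_words)
    matched = sum(
        min(count, ref_counts[word]) for word, count in cand_counts.items()
    )
    return matched / len(cand_words)


reference = "press the button to save your work"
literal = "press the button to save your work"
paraphrase = "click the button to store your changes"

# The literal match scores perfectly; the paraphrase loses credit for
# every word that differs from the reference, even if users would find
# it just as clear.
print(clipped_unigram_precision(literal, reference))
print(clipped_unigram_precision(paraphrase, reference))
```

A score like this rewards agreement with the reference wording, which is why it works better alongside the user-driven measures above than as a replacement for them.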