Machine translation (MT) doesn’t have the best reputation. Case in point: Google Translate is one of MT’s biggest success stories, but also probably the best-known perpetrator of mistranslations. These days, our first impulse is usually to blame Google Translate for any garbled translation we come across.
Lost in Translation?
Has everything been lost in translation? Or is progress being made in the machine translation industry?
Google Translate was launched in 2006 as a competitor to free translation services such as BabelFish. Today it has over 500 million users, works across 103 languages, and translates more than 100 billion words every day. Despite its flaws, it provides an invaluable linguistic bridge. For example, during the global refugee crisis, the platform saw a fivefold increase in translations between Arabic and German.
In November 2016, Google made headlines by changing the technology behind its language platform from statistical machine translation (SMT) to neural machine translation (NMT). SMT works by detecting statistical patterns in hundreds of millions of documents that have previously been translated by humans, whereas NMT uses “deep learning” to improve accuracy. Deep learning is a branch of machine learning that uses layered neural networks to model high-level abstractions in data. After the change, there was an immediate improvement for some language pairs, such as English to Japanese. As a result, Google Translate became a trending topic on Japanese Twitter due to its fairly accurate translations.
Cat and Mouse
Since 2011, Google has had a research arm called Google Brain, which focuses on artificial intelligence and in particular on neural networks: computing systems loosely modeled on the brain that learn from experience rather than from explicit instructions.
Google Brain first achieved fame in 2012 with the publication of its “cat paper.” The paper described how a neural network with roughly a billion connections (many times smaller than a human brain, but larger than that of a mouse) learned to recognize cats, not through programming but by being shown images of cats until it could pick one out on its own. This same deep learning, cat-recognizing technology now powers Google Translate.
Statistical machine translation is often flummoxed by contextual language. Neural machine translation is better at translating full sentences and learning from context and experience.
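The difference can be sketched with a toy example (an illustrative simplification, not how either system is actually built). A translator that looks each word up in isolation, which is phrase-based SMT at its crudest, has no way to resolve an ambiguous word like the Spanish banco, which can mean either “bank” or “bench”:

```python
# Toy illustration: a naive word-for-word lookup translates each word
# in isolation, so it cannot use context to pick the right sense of
# an ambiguous word. (The lexicon below is invented for this sketch.)
lexicon = {
    "me": "I", "senté": "sat", "en": "on", "el": "the",
    "banco": "bank",  # also means "bench"; the lookup can't tell which
}

def word_for_word(sentence):
    """Translate each word independently, ignoring the rest of the sentence."""
    return " ".join(lexicon.get(word, word) for word in sentence.lower().split())

print(word_for_word("Me senté en el banco"))
# Prints "I sat on the bank", though "I sat on the bench" is the
# intended meaning. A neural model scores whole sentences instead,
# so the surrounding words can steer it toward the right sense.
```

The point is not the lookup itself but what it lacks: without a view of the whole sentence, no amount of word-level statistics can recover the intended meaning.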
Translating the Future
The progress machine translation has made in the past decade is inspiring. But you don’t have to plug too many sentences into the new Google Translate before you realize there’s still some deep learning left to do.
Language is at the heart of being human, and as we know, machines are heartless. In 1950, the British computer scientist Alan Turing proposed that a benchmark of successful artificial intelligence might be whether, over the course of a five-minute conversation, a computer could convince a human that it too was human. This concept is now called the “Turing test.”
To pass this test, a computer would need not only to understand language but also to be sensitive to its nuances and subtleties, skills that are essential in translation. Even though a computer program called Eugene Goostman was reported to have passed the Turing test in 2014, human translators will continue to be essential to quality translation for a very long time to come.
For the time being, the companies benefiting the most from machine translation are those which are developing the technology behind it. However, localization companies like CSOFT can leverage these technologies to improve our translators’ efficiency and workflow. For now, machine translation alone is not the future of translation – it is, rather, a part of the toolkit we use to provide our customers with industry-leading translation and localization services.
Written by Joseph O’Neill – Technical Writer at CSOFT International