DeepL, a language translation service from the founders of German-owned Linguee, is claimed to outperform Google Translate. The DeepL system runs on one of the world’s largest supercomputers, located in Iceland, and can translate a million words in under a second. The DeepL team claims that human translators preferred DeepL in a blind test by a factor of 3:1 when compared with similar, competing systems from Google, Microsoft, and Facebook.
I decided to run my own experiment and test the capabilities of the new tool. For the purpose of the experiment, I picked a non-fiction text from the European Personnel Selection Office (EPSO) sample tests for verbal reasoning.
Let the battle begin
What follows are the original French source text and the two machine-translated target texts:
French source text
Des scientifiques canadiens ont recréé une variante du virus de la grippe qui a décimé jusqu’à 50 millions de personnes lors de l’épidémie de «grippe espagnole» de 1918. Malgré le nombre impressionnant de victimes, les médecins de l’époque n’avaient aucun moyen de conserver des échantillons de tissus de patients infectés et la nature létale du virus n’avait donc jamais pu être totalement comprise. Les scientifiques ont cependant pu extraire du matériel génétique sur le corps congelé d’une victime de la grippe ensevelie dans le sol constamment gelé de l’Alaska. Ils ont utilisé ce matériel pour reconstituer un virus parfaitement opérationnel. Le virus s’est révélé létal lorsqu’il a été testé sur des singes, la plupart des animaux infectés ayant succombé après quelques jours.
Google target text
Canadian scientists have recreated a variant of the influenza virus that decimated up to 50 million people during the “Spanish flu” epidemic of 1918. Despite the impressive number of casualties, doctors of the time did not had no way of keeping tissue samples from infected patients and therefore the lethal nature of the virus could never be fully understood. Scientists, however, were able to extract genetic material from the frozen body of a flu victim buried in the permanently frozen soil of Alaska. They used this material to replenish a perfectly functioning virus. The virus was lethal when tested in monkeys, with most infected animals succumbing after a few days.
DeepL target text
Canadian scientists have recreated a variant of the flu virus that killed up to 50 million people in the 1918 “Spanish flu” epidemic. Despite the overwhelming number of casualties, physicians at the time had no way to store tissue samples of infected patients, so the lethal nature of the virus could never be fully understood. However, scientists were able to extract genetic material from the frozen body of an influenza victim buried in the constantly frozen soil of Alaska. They used this equipment to reconstruct a perfectly functioning virus. The virus was lethal when tested on monkeys, with most infected animals dying within a few days.
DeepL translation seems more nuanced and accurate
The results show that Google Translate went for the more literal translation, while DeepL chose synonyms so as not to lose certain nuances, and this difference made for a more natural translation. In addition, Google Translate slipped in a grammatical error, “did not had no way”, while DeepL did not make the same mistake.
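For readers who want to locate such differences mechanically rather than by eye, here is a minimal sketch in Python using the standard library's difflib module. It is not part of the original experiment. The helper name word_diff is my own, and the two strings are the fourth sentence of each target text above. The sketch only lists the word choices on which the two systems disagree; judging which choice is better still means going back to the French source (here, “matériel” and “reconstituer”).

```python
import difflib

# Fourth sentence of each target text, copied from the outputs above.
google = "They used this material to replenish a perfectly functioning virus."
deepl = "They used this equipment to reconstruct a perfectly functioning virus."


def word_diff(a, b):
    """Return (words_in_a, words_in_b) pairs for every span where the texts differ.

    Hypothetical helper for illustration: it compares whitespace-separated
    words, so punctuation stays attached to the word it follows.
    """
    a_words, b_words = a.split(), b.split()
    matcher = difflib.SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    return [
        (" ".join(a_words[i1:i2]), " ".join(b_words[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


differences = word_diff(google, deepl)
for g, d in differences:
    print(f"Google: {g!r:15} DeepL: {d!r}")
```

Running the same helper over each sentence pair gives a quick list of the places worth a human reviewer's attention, which is all this kind of diff can offer: it shows where the outputs diverge, not which one is right.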
This is by no means a comprehensive experiment, and no definite claims can be made as to the relative proficiency of the two systems; however, within the scope of this experiment, DeepL outperformed Google Translate. The newcomer therefore clearly has an important role to play in the machine translation landscape.
Have a look at other TCLoc articles on machine translation: Pixel Buds: The End of Interpreters? and Natural Language Processing: A Difficult Task for Machine Learning.
Marten Hofstede | Feb 19, 2018 8:00
Another interesting test would be to have both translation services translate their own and each other’s translation back into the original language.
Colin Mclarty | Feb 19, 2018 8:00
Very nice. Thanks for posting this. Just a thought as a translator myself. I see one advantage to Google just because it is a bit more literal: For example, I do not consider Google’s “decimated up to 50 million people” perfectly good English even though it matches the French word for word. So I would focus on that point. DeepL’s “killed up to 50 million people” might not draw my attention, but it is substantially less vivid than the French. If I was using machine translation to rush out a large job I think I’d prefer the more literal Google output.