
When you enter a piece of text in Google Translate using the ‘Detect language’ feature, an algorithm that recognizes which of the world’s languages specific words and patterns belong to selects one of the many linguistic frameworks at its disposal, then begins the more linear natural language processing (NLP) task of translating your content into the target language. As magical as that is, it is easy enough to recognize that Google Translate is a multilingual whiz only for this first step of cuing text for translation, after which it is essentially bilingual, working language pair by language pair to go back and forth between two given languages. What if AI could be truly multilingual, though, with the ability to make assorted connections spanning a broad knowledge of the world sourced from hundreds of languages? What if, like multilingual people, AI could hear you say something in English and think of something it knows by way of Spanish, then tell you about it?

Now, it appears that may not be particularly farfetched, following reports that Google researchers have applied what is known as a language-agnostic knowledge base to the NLP process called entity linking, in which entities like words are linked to attributable facts or information about them. The fact that this knowledge base is language-agnostic means that when it receives an input entity in English, the algorithm at work will look not only for references to it in English, but also for meaningful references to its semantic equivalents in other languages. The key entity in question seems not to be a token of language, but rather the essential information that we try to carry between languages when translating text and speech.
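To make the idea concrete, here is a minimal sketch in Python of what linking against a language-agnostic knowledge base could look like. It is a toy illustration only, not the method the Google researchers used: the `Entity` and `KnowledgeBase` classes, the ID `E1`, and the sample aliases and facts are all hypothetical, and real systems rank candidates with learned models rather than exact string lookup. The point it shows is that the key is a language-independent ID, so a mention in one language can surface what is recorded in another.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One language-independent entry; the ID, not any word, is the key."""
    entity_id: str
    aliases: dict = field(default_factory=dict)  # language code -> set of surface forms
    facts: dict = field(default_factory=dict)    # language code -> list of statements


class KnowledgeBase:
    """Hypothetical toy knowledge base indexed by surface form across languages."""

    def __init__(self, entities):
        self.entities = {e.entity_id: e for e in entities}
        # Map every surface form, in every language, to the same entity ID.
        self.index = {}
        for e in entities:
            for forms in e.aliases.values():
                for form in forms:
                    self.index.setdefault(form.casefold(), set()).add(e.entity_id)

    def link(self, mention):
        """Return candidate entity IDs for a mention written in any language."""
        return sorted(self.index.get(mention.casefold(), set()))

    def facts_for(self, mention):
        """Collect facts about the linked entities from all source languages."""
        results = []
        for entity_id in self.link(mention):
            facts = self.entities[entity_id].facts
            for lang, statements in sorted(facts.items()):
                results.extend((entity_id, lang, s) for s in statements)
        return results


# Made-up sample data: one entity known under an English and a Spanish name.
kb = KnowledgeBase([
    Entity(
        entity_id="E1",
        aliases={"en": {"Seville"}, "es": {"Sevilla"}},
        facts={"en": ["City in southern Spain"], "es": ["Capital de Andalucía"]},
    )
])

# An English mention also retrieves the fact recorded in Spanish.
for entity_id, lang, statement in kb.facts_for("Seville"):
    print(entity_id, lang, statement)
```

Because both "Seville" and "Sevilla" resolve to the same ID, the lookup is indifferent to the language of the input, which is the property the article describes.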

So far, it is difficult to say how this research could transition into real-world practice, but at least one service that it likely benefits is search, where it could broaden the range of information that engines learn to draw on in generating better results and user experiences. Most generally, advances like these seem to bode well for the globalization of information in an age that often finds us at a knowledge deficit when it comes to tackling challenges of business or even global events like the 2020 pandemic. As we have learned, it is not always enough for information simply to exist on record, and with AI as one of many tools that people have sought to utilize in seeking patterns and insights in global data, it is compelling to imagine one that can understand relationships in information spanning multiple languages, which in real terms means diverse global sources.

Looking for analogies within language services, the concept of a knowledge base sounds remarkably similar to existing concepts from machine translation and terminology management, despite hailing from a fairly separate realm of the R&D world. Without reaching too far from the current limits of machine translation, advancing AI may have implications for improving processes for translation memory (TM) and glossary management, for instance by proposing possible word matches when entering a new language. When multiple choices are possible, questions of style and culture can often prove decisive in selecting standard terms and definitions in a TM Bank, and software that can comprehensively reference meaning across languages is more likely to assist in these processes than software that simply sees words.

As fascinating new possibilities emerge in NLP and other AI fields, CSOFT’s technology-driven translation processes continue to incorporate industry-leading tools and processes to ensure the quality and efficiency of translation projects. Learn more about our translation technologies and global network of linguists, subject matter experts, and engineers at csoftintl.com!  

 

 
