2022 saw the arrival of the world’s first multilingual large language model, but what does 2023 hold in store for language AI?
If you paused over the above statement, you may be among the many who, though closely watching language AI news, missed last year’s light coverage of the answer to when algorithms like GPT-3 would become fluent – and literate – in multiple languages. It was not Meta or Google, impressive as their own developments were, but startup Hugging Face that launched BLOOM, a 176-billion-parameter, publicly available multilingual text generation algorithm, last July. While perhaps overshadowed by reports of those two larger tech giants pursuing instant universal speech translation for their platforms, the release marked an unprecedented leap forward in linguistic AI’s availability across languages, while also reemphasizing the limitations of these approaches to development. Far from a ready replacement for human translation and interpretation, the multilingual model proved prone to biases and errors, whether answering users’ questions from its knowledge of Spanish, Arabic, or numerous other languages. In other words, any technology intended to help with real problems in multiple languages will continue to need professional localization support – not an algorithm to communicate for it.
With the same issues of quality persisting for monolingual models, 2023 opens to renewed controversy around the perennial leader in shock value, now released as ChatGPT. This time, though, it is a story of more than just resurfaced prejudice from a bot trained on toxic internet chat data. Rather, the trouble seems to be that it is in the hands of students using it to write their papers for them, and that it has improved enough at producing agreeable prose that teachers are unlikely to be the wiser. Innovation in language AI appears to be accelerating, if only when seen through the lens of the ensuing response. Almost as fast as Elon Musk was able to offer insights on the coming end of homework, reports arrived of a counter-algorithm from Princeton student Edward Tian that can tell human-written content apart from AI-generated outputs.
Meanwhile, debate continues as to whether, or to what extent, a chatbot can replace human writers in digital channels. Where ChatGPT leads the language AI field in 2023 remains to be seen, but as in past years, automation is far more likely to rearrange the technological landscape in dynamic ways than it is to replace the crucial role of human linguists in communications. Marketing communications, for example, may be primed for significant automation for well-defined brands, but there are no such assurances for companies looking to expand in multiple languages at once – a classic case for transcreation, or creative translation driven by human linguists and digital marketing specialists. At the same time, BLOOM is at least as far from providing sound answers to questions as monolingual models like GPT-3, meaning that the localization needs and challenges of bringing powerful new data tools to users in new languages are only going to become more elaborate.
For companies growing their innovations across languages, CSOFT offers a full range of technology-driven translation solutions in over 250 languages. To learn more about how we help companies localize for world markets and improve their multilingual communications, visit us at csoftintl.com.