If you’ve been playing around with language AI models like ChatGPT, you may have noticed that despite everything they can do, there are things you really wish they could do differently. Whether it’s the volume of content you want help composing, difficulty with source citation, or even the fact that you can’t simply show the model something without first describing it in words, there are plenty of limitations beyond factual accuracy that would be nice to bypass. And if you’re accustomed to the usual pace of development, it would be easy to join those claiming that the buzz around GPT will soon fade while we await future updates. Yet as reports accumulate this week, it appears many of the hard corners of working with language AI models are about to get much rounder, boding a second wave of the sweeping changes this new class of technology launched just weeks ago.
Over the past few days, announcements of GPT-4, the successor to the model behind the original ChatGPT, have heralded capabilities including the ability to process 25,000 words of text in a single prompt, backed by a larger context window (i.e., more tokens of context to reason with). Meanwhile, for those preferring the proverbial 1000 words’ worth of a picture, the levelled-up GPT can accept images as inputs, as well as text. One reviewer notes the power this lends to extract information from charts without transcribing their contents, so that PDFs and other non-editable formats are no barrier to its powers of analysis at the word and figure level. The same capabilities also furnish a simple way to request image captions or copywriting for product images, and even impressions of what a person should make of something unfamiliar in an image. In content services like localization, this all presents fascinating new opportunities to scale production and work effortlessly across a variety of formats and file types, without compromising the quality of linguistic outputs. In all, it shows promise for making language service providers readier than ever to absorb the difficulties of consolidating businesses’ vast content footprints, which span disparate source documents, in order to cohesively localize products and services into a seamless multilingual profile.
Beyond OpenAI, news comes this week of developments from rival LLM (large language model) projects like Microsoft’s proposed Visual ChatGPT, which could potentially output visual content as well as process it. Perhaps more astonishing, a highly complex language AI has now demonstrated the uncanny ability to make concrete inferences about a string of just four emojis assumed to mean something. You can read here how it required just one shot to correctly answer “Finding Nemo” when asked which movie’s plot the four emojis described. When LLMs can translate meaning from one of the more ambiguous and emotive facets of people’s chat tendencies, it is not hard to imagine the vast potential they have to translate between languages and even between levels of understanding in different languages.
With such creative possibilities gaining traction as real communication solutions for sectors of all stripes, it is more important than ever for businesses to work with innovative language service providers leveraging the most recent advances in language AI to deliver more affordable and dynamic solutions. With a global network of 10,000+ linguists and content production specialists serving 250+ languages, CSOFT can help accelerate growth in new markets by delivering fast, consistent, high-quality localization services for a full range of content areas and communication needs. To learn more about our technology-driven translation solutions, visit us at csoftintl.com.