Meta unveils AI model enabling translation across nearly 100 languages

The world has more than 7,000 languages, and many people speak at least two of them: typically a mother tongue and a second language acquired through formal study. Languages both connect and divide, enabling or obstructing understanding between cultures, communities, and individuals. Many people aspire to become polyglots, but no one can master all 7,000 languages, which is why attention has turned to technological solutions.

Meta's Breakthrough: SeamlessM4T Multilingual Model

To address this global language divide, Meta has introduced SeamlessM4T, a multilingual model for translating and transcribing both text and speech. A single model performs five distinct tasks: speech-to-text translation, speech-to-speech translation, text-to-speech translation, text-to-text translation, and speech recognition. That makes SeamlessM4T a significant stride toward connecting diverse linguistic communities.

The model handles speech recognition and translation for nearly 100 input languages and 35 output languages for speech. That is still only part of the world's languages, so it is best seen as an incremental advance toward bridging gaps among linguistic groups. For instance, entering "Good morning" in English with French selected as the target language produces "Bonjour", a small illustration of how this technology can ease cross-cultural communication.
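The five tasks named above differ only in what goes in (speech or text), what comes out, and whether the language changes. The short Python sketch below makes that explicit. It is purely illustrative: `Modality` and `seamless_task` are hypothetical names invented for this example, and nothing here calls the actual SeamlessM4T model or reflects Meta's API.

```python
from enum import Enum


class Modality(Enum):
    SPEECH = "speech"
    TEXT = "text"


def seamless_task(source: Modality, target: Modality, same_language: bool) -> str:
    """Name the task for an input/output pairing (hypothetical helper)."""
    if same_language:
        # The only same-language task in the article's list is speech recognition.
        if source is Modality.SPEECH and target is Modality.TEXT:
            return "speech recognition"
        raise ValueError("not one of the five tasks listed in the article")
    # Every cross-language pairing of speech and text is a translation task.
    return f"{source.value}-to-{target.value} translation"


if __name__ == "__main__":
    # English text in, French text out, as in the "Good morning" example.
    print(seamless_task(Modality.TEXT, Modality.TEXT, same_language=False))
```

Under this framing, the "Good morning" to "Bonjour" example is a text-to-text translation, while transcribing spoken French into written French would count as speech recognition.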

Meta's Perspective

Meta notes that the world is more interconnected than ever, giving people access to an abundance of multilingual content. That connectivity makes it increasingly important to understand information across languages, a need SeamlessM4T is meant to address.

SeamlessM4T holds particular promise for language learners and for travelers in countries where language barriers would otherwise hinder communication.

Aligned with its open-source ethos, Meta has made the models underpinning SeamlessM4T available on Hugging Face, a platform for sharing machine learning models and knowledge. Two checkpoints of different sizes are offered, SeamlessM4T-Medium and SeamlessM4T-Large, so that developers and researchers can build on this foundational work.

Meta has also released metadata for SeamlessAlign, the dataset used to train SeamlessM4T. This large multimodal translation dataset comprises 270,000 hours of mined speech and text alignments.

SeamlessM4T builds on Meta's previous models, including No Language Left Behind (NLLB), which supports text-to-text translation across 200 languages, and the Universal Speech Translator, the first direct speech-to-speech translation system for Hokkien, a primarily oral language widely spoken within the Chinese diaspora. It also draws on Meta's Massively Multilingual Speech model, which identifies over 4,000 spoken languages and offers speech recognition, language identification, and speech synthesis across 1,100 languages.

Towards a Universal Language Translator

Google has also been at the forefront of the quest for a universal language translator, offering tools for translating articles and converting speech. It is continuing that work with the Universal Speech Model (USM), which aims to support languages spoken by smaller populations. Part of an effort targeting 1,000 languages, the model has 2 billion parameters and was trained on 12 million hours of speech and 28 billion sentences of text. The initiative improves YouTube's automatic speech recognition for subtitles and exemplifies the industry's pursuit of multilingual inclusivity.

While SeamlessM4T covers only a fraction of the world's languages, it is a stepping stone toward the ultimate goal of a universal language translator. Other advances in this area include OpenAI's ChatGPT, described as proficient in 95 languages, and Google's Bard, which can converse in 40 languages. Given the rapid pace of development in artificial intelligence and generative AI, effortless and comprehensive translation across all languages remains a formidable yet promising goal.
