
Meta's 'Massively Multilingual' AI Model Translates Up To 100 Languages, Speech or Text


An anonymous reader quotes a report from Ars Technica: On Tuesday, Meta announced SeamlessM4T, a multimodal AI model for speech and text translations. As a neural network that can process both text and audio, it can perform text-to-speech, speech-to-text, speech-to-speech, and text-to-text translations for "up to 100 languages," according to Meta. Its goal is to help people who speak different languages communicate with each other more effectively. Continuing Meta's relatively open approach to AI, Meta is releasing SeamlessM4T under a research license (CC BY-NC 4.0) that allows developers to build on the work. They're also releasing SeamlessAlign, which Meta calls "the biggest open multimodal translation dataset to date, totaling 270,000 hours of mined speech and text alignments." That will likely kick-start the training of future translation AI models from other researchers.

Among the features of SeamlessM4T touted on Meta's promotional blog, the company says that the model can perform speech recognition (you give it audio of speech, and it converts it to text), speech-to-text translation (it translates spoken audio to a different language in text), speech-to-speech translation (you feed it speech audio, and it outputs translated speech audio), text-to-text translation (similar to how Google Translate functions), and text-to-speech translation (feed it text and it will translate and speak it out in another language). Each of the text translation functions supports nearly 100 languages, and the speech output functions support about 36 output languages.
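The five tasks in the list above differ only in input modality, output modality, and whether the language changes. A minimal Python sketch of that breakdown follows. It is only an illustration of the task list, not SeamlessM4T's actual interface: the short codes (`ASR`, `S2TT`, and so on), the `Task` record, and the `tasks_for` helper are hypothetical names chosen here, and nothing in it calls Meta's code.

```python
from dataclasses import dataclass
from enum import Enum


class Modality(Enum):
    SPEECH = "speech"
    TEXT = "text"


@dataclass(frozen=True)
class Task:
    description: str
    source: Modality   # what you feed the model
    target: Modality   # what the model gives back
    translates: bool   # False: output stays in the input language


# Hypothetical short codes for the five tasks named in the article.
TASKS = {
    "ASR": Task("speech recognition", Modality.SPEECH, Modality.TEXT, False),
    "S2TT": Task("speech-to-text translation", Modality.SPEECH, Modality.TEXT, True),
    "S2ST": Task("speech-to-speech translation", Modality.SPEECH, Modality.SPEECH, True),
    "T2TT": Task("text-to-text translation", Modality.TEXT, Modality.TEXT, True),
    "T2ST": Task("text-to-speech translation", Modality.TEXT, Modality.SPEECH, True),
}


def tasks_for(source: Modality, target: Modality) -> list[str]:
    """Return the codes of all tasks with the given input and output modality."""
    return sorted(
        code
        for code, task in TASKS.items()
        if task.source is source and task.target is target
    )


if __name__ == "__main__":
    # Speech in, text out covers both recognition and translation.
    print(tasks_for(Modality.SPEECH, Modality.TEXT))
    # Tasks that produce speech, i.e. those limited to about 36 output languages.
    print([code for code, t in TASKS.items() if t.target is Modality.SPEECH])
```

Laid out this way, the language-count split in the article is easy to see: the tasks whose target is text are the ones described as supporting nearly 100 languages, while the two whose target is speech are the ones limited to about 36 output languages.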
