Meta's No Language Left Behind project

Meta (the parent company of Facebook, WhatsApp, and Instagram) just announced its No Language Left Behind project, in which it developed a machine translation model that supports 200 languages (NLLB-200) and released it as open source.

The Wikimedia Foundation’s Language team has collaborated with the research team at Meta to support 20+ underserved languages in Content Translation, which helps editors translate Wikipedia articles. This was done through an external API (Flores) that the Meta research team provided for a smaller list of languages, listed here. For editors on 13 wikis, such as Luganda, this is the first time machine translation has been available.

An immediate example of NLLB’s impact is an Indonesian language translator developed by Raphaël Merx that can translate from one Indonesian language to another (it currently supports seven local Indonesian languages that have their own edition of Wikipedia, in addition to the Indonesian language itself). Indonesian Wikimedians have been informed of this tool and are quite excited to try it out in their work on the Wikimedia projects.

What do you think of this project? How do you see it helping your work in the Wikimedia projects? Do you have any thoughts, questions, or concerns?

In what seems to be a separate but related project, Meta also announced Sphere, an “AI built around the concept of tapping the vast repository of information on the open web to provide a knowledge base for AI and other systems to work”, which will be used in Wikipedia in a “production phase (not live entries) to automatically scan entries and identify when citations in its entries are strongly or weakly supported.”