The Aya Project

A C4AI Community Project

Cohere For AI

Aya: An Open Science Initiative to Accelerate Multilingual AI Progress

Our goal is to accelerate NLP breakthroughs for the rest of the world’s languages through open science collaboration.

How It Works

Rate Model Performance

You will be asked to rate and edit model-generated data to improve its quality.

Contribute Your Language

You can share your own examples of data that you think are great.

Review User Feedback

Audit the work of other contributors to ensure quality and consistency.

What is Aya?

Recent breakthroughs in NLP technology have focused on English, leaving other languages behind. One of the biggest hurdles to improving multilingual model performance is access to high-quality examples of multilingual text. In January 2023, the Cohere For AI community set out on an ambitious open science research project.

With members from over 100 countries around the world, we sought to leverage the great strength of our diversity to make meaningful contributions to fundamental machine-learning questions. Our ultimate goal is to release a high-quality multilingual dataset. In sharing this artifact broadly, we will support future projects that aim to build technology for everyone, including those who use any of the 7,000+ languages spoken around the world. As technological progress races forward, join us to ensure no language is left behind.

What does the word Aya mean? The word Aya has its origins in the Twi language and translates to “fern”. Aya is a symbol of endurance, resourcefulness, and defiance. As our initiative's name suggests, we believe that ensuring no language is left behind is a long-term effort of endurance.

Made by the C4AI Open Science Community.
