Core42 sets new benchmark for Arabic large language models with release of Jais 30B

Core42, a G42 company and the UAE-based national-scale enabler for cloud and generative AI, announced the launch of Jais 30B, the newest and most proficient version of its open-source Arabic Large Language Model (LLM).

Featuring 30 billion parameters, this new iteration of Jais follows the August 2023 release of the 13-billion-parameter model, underscoring Core42’s commitment to providing a rich, linguistically and culturally focused generative AI experience for the more than 400 million Arabic speakers worldwide.

Jais, born from a collaboration among Inception (now merged into Core42), Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), and Cerebras Systems, immediately set a benchmark in the Arabic LLM landscape.

The model was trained on Condor Galaxy 1 (CG-1), one of the world’s fastest AI supercomputers, built by G42 in partnership with Cerebras Systems, with four exaFLOPS of training compute, 54 million cores, and 64 nodes. Jais 13B went from concept to a fine-tuned, leading open-source model in less than four months. Notably, the production training run for Jais 13B was completed in 21 days on CG-1.

The new Jais 30B model was trained on a substantially larger dataset than its predecessor, comprising 126 billion Arabic tokens, 251 billion English tokens, and 50 billion code tokens, and it shows improved performance across all key indicators. It produces answers that are 160% longer and more detailed in Arabic, with a 233% increase in English, reflecting significant improvements in language generation.
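For a sense of the training mix, the three token counts quoted for Jais 30B sum to 427 billion. The short Python sketch below works through that arithmetic; the variable names are illustrative, and the total and percentage shares are derived here from the stated figures rather than reported by Core42.

```python
# Token counts in billions, as stated in the announcement for Jais 30B.
tokens_billion = {"Arabic": 126, "English": 251, "code": 50}

# Derived values (not stated in the announcement): total and per-category share.
total = sum(tokens_billion.values())
shares = {name: count / total for name, count in tokens_billion.items()}

print(f"Total: {total} billion tokens")
for name, share in shares.items():
    print(f"{name}: {share:.1%} of the training mix")
```

On these figures, English makes up a little under three-fifths of the mix and Arabic a little under one-third, with code accounting for the remainder.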

The model also shows improved performance in summarisation (up 53% in Arabic and 85% in English) and formatting (up 130% in Arabic and 134% in English). Jais 30B’s performance is now on par with monolingual English models, outperforming most open-source models in foundation model evaluations.

Jais 30B’s enhancements have been tested and validated using heuristic, cross-model comparison, and human evaluations, showing that the responses of the model’s fine-tuned iterations outperform those of Jais 13B 96% of the time in Arabic and 97% of the time in English.

Reaffirming its dedication to responsible and safe AI practices, the development team has also further strengthened its processes and policies to guard against biases and the production of hateful or harmful content by the model, a process made easier by its open-source release.

Jais’s versatility and unique capabilities in the Arabic language domain have already shown promise in applications across various sectors including telecommunications, energy, education, healthcare and innovative solutions for the marketing communications industry.

Dr. Andrew Jackson, EVP, Chief AI Officer of Core42, said, “The launch of Jais 30B marks another significant milestone for Core42 and represents a giant leap forward for the Arabic-speaking world in harnessing the potential of generative AI. This release underscores the powerful synergy between Core42’s technological leadership, our extensive partner ecosystem, and our shared dedication to pushing the boundaries of what’s possible in AI. I eagerly anticipate close collaboration with our customers and partners to explore new applications and continually enhance the model’s capabilities as we intensify our efforts to create top-quality LLMs for various other languages.”

Andrew Feldman, CEO and co-founder of Cerebras Systems, said, “Less than eight weeks after we introduced Jais 13B to the global Arabic-speaking community, the Core42 and Cerebras teams have delivered a new state-of-the-art LLM that is more than double in size. Jais 30B leverages the incredible, massive compute of Condor Galaxy 1 to set another record in bilingual performance and impressively fast training time.”