
Inside the Creation of the World’s Most Powerful Open Source AI Model


Last Monday, about a dozen engineers and executives at the data science and AI company Databricks gathered in conference rooms connected via Zoom to learn whether they had succeeded in building a top artificial intelligence language model. The team had spent months and about $10 million training DBRX, a large language model similar in design to the one behind OpenAI’s ChatGPT. But they wouldn’t know how powerful their creation was until results came back from the final tests of its abilities.

“We’ve surpassed everything,” Jonathan Frankle, chief neural network architect at Databricks and leader of the team that built DBRX, eventually told the team, which responded with whoops, cheers, and applause emojis. Frankle usually avoids caffeine but was sipping iced lattes after pulling an all-nighter to write up the results.

Databricks will release DBRX under an open source license, allowing others to build on top of its work. Frankle shared data showing that across a dozen benchmarks measuring the AI model’s ability to answer general knowledge questions, perform reading comprehension, solve tricky logic puzzles, and generate high-quality code, DBRX was better than every other open source model available.

AI decision makers: Jonathan Frankle, Naveen Rao, Ali Ghodsi, and Hanlin Tang. Photo: Gabriela Hasbun

It surpassed Meta’s Llama 2 and Mistral’s Mixtral, two of the most popular open source AI models available today. “Yes!” shouted Ali Ghodsi, CEO of Databricks, as the scores appeared. “Wait, did we beat Elon’s thing?” Frankle replied that they had indeed surpassed the Grok AI model recently open sourced by Elon Musk’s xAI, adding, “I’ll consider it a success if we get a mean tweet from him.”

To the team’s surprise, on several measures DBRX also came strikingly close to GPT-4, OpenAI’s closed model that powers ChatGPT and is widely considered the pinnacle of machine intelligence. “We’ve set a new state of the art for open source LLMs,” Frankle said with a huge grin.

Building blocks

By open-sourcing DBRX, Databricks adds further momentum to a movement challenging the secretive approach of the most prominent companies in the current generative AI boom. OpenAI and Google keep the code for their GPT-4 and Gemini large language models closely guarded, but some rivals, most notably Meta, have released their models for others to use, arguing that this will spur innovation by putting the technology in the hands of more researchers, entrepreneurs, startups, and established companies.

Databricks says it also wants to be open about the work involved in creating its open source model, something Meta has not done for some key details about the creation of its Llama 2 model. The company will publish a blog post detailing the work required to create the model, and it also invited WIRED to spend time with Databricks engineers as they made key decisions during the final stages of the multimillion-dollar training process for DBRX. That offered a glimpse of how complex and challenging it is to build a leading AI model, but also of how recent innovations in the field promise to drive down costs. That, combined with the availability of open source models like DBRX, suggests that AI development isn’t slowing down anytime soon.

Ali Farhadi, CEO of the Allen Institute for AI, says greater transparency around how AI models are built and trained is desperately needed. The field has become increasingly secretive in recent years as companies have sought an edge over the competition. Transparency is especially important given concerns about the risks advanced AI models may pose, he says. “I am very happy that there is some attempt at openness,” says Farhadi. “I believe that a significant part of the market will move towards open models. We need more of this.”
