When OpenAI announced GPT-4, its latest large language model, last March, it sent shockwaves through the tech world. It was clearly more capable than anything seen before at chatting, coding, and solving all kinds of thorny problems, including schoolwork.
Anthropic, a rival to OpenAI, announced today that it has made its own AI advance that will upgrade chatbots and other use cases. But while the new model is the best in the world by some measures, it’s more of a step forward than a giant leap.
Anthropic’s new model, called Claude 3.5 Sonnet, is an update to its current Claude 3 family of AI models. It is more adept at solving math, coding, and logic problems as measured by commonly used benchmarks. Anthropic says it is also much faster, understands the nuances of language better, and even has a better sense of humor.
This is certainly useful for people trying to build apps and services on top of Anthropic’s AI models. But the company’s news is also a reminder that the world is still waiting for another leap forward in AI similar to the one provided by GPT-4.
Anticipation has been building for more than a year for OpenAI to release a successor called GPT-5, and the company’s CEO, Sam Altman, has encouraged speculation that it will mean another revolution in AI capabilities. GPT-4 cost more than $100 million to train, and GPT-5 is expected to be much larger and more expensive.
Although OpenAI, Google, and other AI developers have released new models that surpass GPT-4, the world is still waiting for the next big leap. Lately, progress in AI has become more incremental and more dependent on innovations in model design and training than on brute-force scaling of model size and computation, as with GPT-4.
Michael Gerstenhaber, Anthropic’s chief product officer, says the company’s new Claude 3.5 Sonnet model is larger than its predecessor, but it gains much of its new power from training innovations. For example, the model received feedback designed to improve its logical reasoning skills.
Anthropic says Claude 3.5 Sonnet outperforms top models from OpenAI, Google, and Facebook on popular AI benchmarks including GPQA, a graduate-level test of expertise in biology, physics, and chemistry; MMLU, a test that covers computer science, history, and other topics; and HumanEval, a measure of coding proficiency. The improvements, however, amount to only a few percentage points.
This latest advance in AI may not be revolutionary, but it’s fast: Anthropic announced its previous generation of models just three months ago. “If you look at the pace of change in intelligence, you’ll appreciate how fast we’re moving,” says Gerstenhaber.
More than a year after GPT-4 sparked a frenzy of new investment in AI, it may be getting harder to produce breakthroughs in artificial intelligence. With GPT-4 and similar models trained on large amounts of text, images, and online videos, it is becoming increasingly difficult to find new data sources to feed machine learning algorithms. Making models substantially larger, so they have more learning capacity, is expected to cost billions of dollars. When OpenAI announced its most recent update last month, a model with visual and voice capabilities called GPT-4o, the focus was on a more natural, human interface rather than substantially smarter problem-solving capabilities.