Several years before ChatGPT started chattering, Google developed a very different artificial intelligence program called AlphaGo, which learned to play the board game Go with superhuman skill through tireless practice.
Researchers at the company have now published a paper that combines the capabilities of a large language model (the AI behind today’s chatbots) with those of AlphaZero, a successor to AlphaGo also capable of playing chess, to solve very complicated mathematical puzzles.
The new Frankensteinian creation, dubbed AlphaProof, proved its prowess by tackling several problems from the 2024 International Mathematical Olympiad (IMO), a prestigious competition for high school students.
AlphaProof uses Google's Gemini large language model to convert naturally phrased mathematical questions into a programming language called Lean. This provides the training material for a second algorithm to learn, through trial and error, how to find proofs that can be confirmed as correct.
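To give a sense of what that translation target looks like, here is a toy statement written in Lean. This is purely illustrative of the Lean language; it is not AlphaProof's actual output, which Google DeepMind has not released.

```lean
-- A simple formal statement: addition of natural numbers is commutative.
-- Lean's checker verifies the proof mechanically, which is what makes
-- trial-and-error search for proofs feasible: a candidate either
-- checks out or it doesn't.
theorem add_comm_example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```

Because every proof either passes or fails Lean's verifier, the system gets an unambiguous reward signal, much like a win or loss in Go.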
Earlier this year, Google DeepMind revealed another mathematical algorithm, AlphaGeometry, which likewise pairs a language model with a different artificial intelligence approach. AlphaGeometry uses Gemini to convert geometry problems into a form that can be manipulated and tested by a program that handles geometric elements. Today, Google also announced a new and improved version of AlphaGeometry.
The researchers found that their two math programs could prove IMO puzzles as well as a silver medalist would. The programs correctly answered two algebra problems and one number theory problem out of six total, solving one within minutes but taking several days on others. Google DeepMind has not disclosed how much computing power it used to solve the problems.
Google DeepMind calls the approach used for both AlphaProof and AlphaGeometry "neurosymbolic," because it combines pure machine learning from an artificial neural network, the technology underpinning most recent progress in AI, with a conventional programming language.
“What we’ve seen here is that you can combine the approach that was so successful in things like AlphaGo with large language models and produce something that is extremely capable,” says David Silver, the Google DeepMind researcher who led the work on AlphaZero. Silver says the techniques demonstrated with AlphaProof should, in theory, extend to other areas of mathematics.
In fact, the research raises the possibility of addressing the worst tendencies of large language models by applying logic and reasoning in a more grounded way. Marvelous as large language models may be, they often struggle to understand even basic mathematics or to reason through problems logically.
In the future, the neurosymbolic method could provide a means for AI systems to convert questions or tasks into a form that can be reasoned about in a way that produces reliable results. OpenAI is also rumored to be working on such a system, codenamed “Strawberry.”
The systems revealed today have a key limitation, though, as Silver acknowledges. Mathematical solutions are either right or wrong, allowing AlphaProof and AlphaGeometry to work their way toward the correct answer. Many real-world problems — for example, coming up with the ideal itinerary for a trip — have many possible solutions, and it may not be clear which one is ideal. Silver says the solution for more ambiguous questions may be for a language model to try to determine what constitutes a “correct” answer during training. “There’s a spectrum of different things you can try,” he says.
Silver is also careful to point out that Google DeepMind won’t put human mathematicians out of work. “Our goal is to provide a system that can prove anything, but that’s not the only job of mathematicians,” he says. “A big part of mathematics is posing problems and finding the interesting questions to ask. You could think of this as another tool similar to a slide rule, or a calculator, or computational tools.”