July 27, 2022

Latent embeddings from the framework, colored by physical state variables. Credit: Boyuan Chen/Columbia Engineering



Energy, mass, speed. These three variables make up Einstein’s iconic equation E = mc². But how did Einstein know about these concepts in the first place? A precursor to understanding physics is identifying the relevant variables. Without the concepts of energy, mass and speed, not even Einstein could have discovered the theory of relativity. But can such variables be discovered automatically? Doing so could greatly accelerate scientific discovery.

That is the question that researchers at Columbia Engineering posed to a new AI program. The program is designed to observe physical phenomena through a video camera and then search for the minimal set of fundamental variables that fully describe the observed dynamics. The study was published on July 25 in Nature Computational Science.

The researchers began by feeding the system raw video footage of phenomena for which they already knew the answer. For example, they fed it a video of a swinging double pendulum, which is known to have exactly four “state variables”: the angle and angular velocity of each of its two arms. After a few hours of analysis, the AI came up with the answer: 4.7.
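A fractional answer such as 4.7 is typical of statistical estimates of how many independent variables underlie a data set. The sketch below is only an illustration of that idea and is not the method used in the study: it applies a generic estimator (the Levina-Bickel maximum-likelihood estimate of intrinsic dimension) to synthetic data. The function names, the synthetic "observations" and all parameter values are assumptions made for this example.

```python
import numpy as np

def mle_intrinsic_dimension(points, k=10):
    """Levina-Bickel maximum-likelihood estimate of intrinsic dimension."""
    # Squared pairwise distances via the Gram-matrix identity.
    sq_norms = (points ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * points @ points.T
    sq_dists = np.clip(sq_dists, 0.0, None)   # guard against tiny negatives
    np.fill_diagonal(sq_dists, 0.0)           # distance of a point to itself
    dists = np.sqrt(sq_dists)
    # After sorting each row, column 0 is the point itself; keep the next k.
    knn = np.sort(dists, axis=1)[:, 1:k + 1]
    # Log-ratio of the k-th neighbour distance to each closer neighbour.
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])
    per_point = (k - 1) / log_ratios.sum(axis=1)
    return float(per_point.mean())

def estimate_for_latent_dim(latent_dim, n_samples=1000, ambient_dim=20, seed=0):
    """Hide `latent_dim` variables in a higher-dimensional signal, then estimate."""
    rng = np.random.default_rng(seed)
    # Hypothetical hidden state variables, unknown to the estimator.
    latent = rng.uniform(0.0, 1.0, size=(n_samples, latent_dim))
    # Smooth nonlinear mixing into many observed channels (stand-in for pixels).
    mixing = 0.5 * rng.normal(size=(latent_dim, ambient_dim))
    observed = np.tanh(latent @ mixing)
    return mle_intrinsic_dimension(observed)

if __name__ == "__main__":
    for true_dim in (2, 4):
        estimate = estimate_for_latent_dim(true_dim)
        print(f"hidden variables: {true_dim}, estimated: {estimate:.2f}")
```

The estimator sees only the 20 observed channels, yet its output tracks the number of hidden variables. Because it is an average of local statistical estimates, the result is generally not a whole number.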











The image shows a chaotic swing-stick dynamical system in motion. The work aims to identify and extract, directly from high-dimensional video footage, the minimum number of state variables needed to describe such a system. Credit: Yinuo Qin/Columbia Engineering

“We thought this answer was close enough,” said Hod Lipson, director of the Creative Machines Lab in the Department of Mechanical Engineering, where the work was primarily done. “Especially since all the AI had access to was raw video footage, without any knowledge of physics or geometry. But we wanted to know what the variables actually were, not just their number.”

The researchers then went on to visualize the actual variables the program identified. It was not easy to extract the variables themselves, because the program cannot describe them in an intuitive way that people understand. After some detective work, it turned out that two of the variables chosen by the program loosely matched the angles of the arms, but the other two remain a mystery.

“We tried to correlate the other variables with everything we could think of: angular and linear velocities, kinetic and potential energy, and various combinations of known quantities,” explained Boyuan Chen, Ph.D., now an assistant professor at Duke University, who led the work. “But nothing seemed to match perfectly.” The team was confident that the AI had found a valid set of four variables because it was making good predictions, “but we don’t yet understand the mathematical language it is speaking,” he said.

After validating the approach on a number of other physical systems with known solutions, the researchers fed in videos of systems for which they did not know the explicit answer. The first videos featured an “air dancer” undulating in front of a local used car lot. After a few hours of analysis, the program returned eight variables. A video of a lava lamp also yielded eight variables. They then fed in a video clip of flames from a holiday fireplace loop, and the program returned 24 variables.

A particularly interesting question was whether the set of variables was unique to each system, or whether a different set was produced each time the program was restarted.

“I’ve always wondered, if we ever met an intelligent alien race, would they have discovered the same laws of physics as we have, or might they describe the universe in a different way?” said Lipson. “Maybe some phenomena seem puzzlingly complex because we’re trying to understand them using the wrong set of variables. In the experiments, the number of variables was the same every time the AI restarted, but the specific variables were different each time. So yes, there are alternative ways of describing the universe, and it is quite possible that our choices are not perfect.”
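A toy case, not taken from the study, shows how the count can stay fixed while the variables themselves change. In the sketch below, a small-angle pendulum is described once by the familiar pair (angle, angular velocity) and once by a hypothetical alternative pair formed by an invertible mix of the two. The matrix `M`, the time step and the helper `simulate` are assumptions chosen for illustration.

```python
import numpy as np

def simulate(step_matrix, initial_state, n_steps):
    """Roll a linear system forward: state[t+1] = step_matrix @ state[t]."""
    states = [np.asarray(initial_state, dtype=float)]
    for _ in range(n_steps):
        states.append(step_matrix @ states[-1])
    return np.array(states)

dt = 0.01
# Small-angle pendulum in the familiar variables (angle, angular velocity).
A = np.array([[0.0, 1.0],
              [-9.81, 0.0]])
step_familiar = np.eye(2) + dt * A           # one explicit Euler step

# Hypothetical alternative variables: an invertible mix of the familiar two.
M = np.array([[2.0, 1.0],
              [1.0, -3.0]])
step_alternative = M @ step_familiar @ np.linalg.inv(M)

s0 = np.array([0.1, 0.0])
familiar = simulate(step_familiar, s0, 500)
alternative = simulate(step_alternative, M @ s0, 500)

# Both descriptions predict the same motion once translated into each other...
same_motion = np.allclose(alternative, familiar @ M.T)
# ...and both need the same number of variables.
n_vars_familiar = int(np.linalg.matrix_rank(familiar))
n_vars_alternative = int(np.linalg.matrix_rank(alternative))

if __name__ == "__main__":
    print("same motion:", same_motion)
    print("variables needed:", n_vars_familiar, n_vars_alternative)
```

Neither pair is more correct than the other: each predicts the motion equally well, and only the familiar pair happens to match quantities that people have names for.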

The researchers believe this kind of AI could help scientists make sense of complex phenomena for which theoretical understanding is not keeping pace with the deluge of data, in areas ranging from biology to cosmology. “While we used video data in this work, any kind of array data source could be used, for example radar arrays or DNA arrays,” explained Kuang Huang, Ph.D., who co-authored the paper.

The work is part of the decades-long interest of Lipson and Qiang Du, Fu Foundation Professor of Mathematics, in creating algorithms that can distill data into scientific laws. Earlier software systems, such as Lipson and Michael Schmidt’s Eureqa software, were able to distill free-form physical laws from experimental data, but only if the variables were identified in advance. But what if the variables are not yet known?

Lipson, who is also the James and Sally Scapa Professor of Innovation, argues that scientists may be misinterpreting or failing to understand many phenomena simply because they lack a good set of variables to describe them.

“For thousands of years, people knew about fast- or slow-moving objects, but it wasn’t until the notions of speed and acceleration were formally quantified that Newton was able to discover his famous law of motion F = ma,” Lipson noted. Variables describing temperature and pressure had to be identified before the laws of thermodynamics could be formalized, and so on for every corner of the scientific world. The variables are a precursor to any theory.

“What other laws are we missing simply because we don’t have the variables?” asked Du, who co-led the work.

The paper was also co-authored by Sunand Raghupathi and Ishaan Chandratreya, who helped collect the data for the experiments.

Boyuan Chen et al, Automated Discovery of Fundamental Variables Hidden in Experimental Data, Nature Computational Science (2022). DOI: 10.1038/s43588-022-00281-6

Provided by Columbia University School of Engineering and Applied Science





