

AI-Descartes, an AI scientist developed by researchers at IBM Research, Samsung AI, and the University of Maryland, Baltimore County, has reproduced a key part of Langmuir’s Nobel Prize-winning work on gas behavior and rediscovered Kepler’s third law of planetary motion. Supported by the Defense Advanced Research Projects Agency (DARPA), the AI system uses symbolic regression to find equations that fit data, and its most distinctive feature is its logical reasoning ability. This allows AI-Descartes to determine which equations fit best with background scientific theory. The system is particularly effective with noisy, real-world data and small data sets. The team is working on creating new datasets and on training computers to read scientific papers and construct background theories, in order to refine and expand the system’s capabilities.
The system demonstrated its chops on Kepler’s third law of planetary motion, Einstein’s relativistic time-dilation law, and Langmuir’s equation of gas adsorption.
AI-Descartes, a new AI scientist, has successfully reproduced Nobel Prize-winning work, using logical reasoning and symbolic regression to find accurate equations. The system is effective with real-world data and small datasets, and future goals include automating the construction of background theories.
In 1918, the American chemist Irving Langmuir published a paper examining the behavior of gas molecules sticking to a solid surface. Guided by the results of careful experiments, as well as his theory that solids offer discrete sites for the gas molecules to fill, he worked out a series of equations that describe how much gas will stick, given the pressure.
Now, about a hundred years later, an “AI scientist” developed by researchers at IBM Research, Samsung AI, and the University of Maryland, Baltimore County (UMBC) has reproduced a key part of Langmuir’s Nobel Prize-winning work. The system, artificial intelligence (AI) functioning as a scientist, also rediscovered Kepler’s third law of planetary motion, which can calculate the time it takes one space object to orbit another given the distance separating them, and produced a good approximation of Einstein’s relativistic time-dilation law, which shows that time slows down for fast-moving objects.
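For reference, the three laws named above can be written in their standard textbook forms. The symbols are conventional choices assumed here, not notation taken from this article or from the paper: p is the orbital period, d the distance between two bodies of masses m_1 and m_2, G the gravitational constant, v the speed of a moving clock, c the speed of light, θ the fraction of surface sites occupied, P the gas pressure, and K an adsorption constant.

```latex
% Kepler's third law (Newtonian form for two bodies)
p^{2} = \frac{4\pi^{2}}{G\,(m_{1} + m_{2})}\, d^{3}

% Relativistic time dilation: interval \Delta t on the moving clock,
% interval \Delta t' seen by a stationary observer
\Delta t' = \frac{\Delta t}{\sqrt{1 - v^{2}/c^{2}}}

% Langmuir adsorption isotherm
\theta = \frac{K P}{1 + K P}
```

The exact variants the researchers tested may differ from these forms.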
The research was supported by the Defense Advanced Research Projects Agency (DARPA) and published in the journal Nature Communications.
A machine-learning tool that reasons
The new AI scientist—dubbed “AI-Descartes” by the researchers—joins the likes of AI Feynman and other recently developed computing tools that aim to speed up scientific discovery. At the core of these systems is a concept called symbolic regression, which finds equations to fit data. Given basic operators, such as addition, multiplication, and division, the systems can generate hundreds to millions of candidate equations, searching for the ones that most accurately describe the relationships in the data.
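The search described above can be illustrated with a deliberately tiny sketch. This is not the AI-Descartes implementation. It is a brute-force toy that combines a few building blocks with addition, multiplication, and division, fits one scale coefficient per candidate, and ranks the candidates by squared error. The planetary values are approximate, and every name in the code (`ATOMS`, `OPERATORS`, `candidates`, `fit_scale`, `search`) is hypothetical.

```python
import itertools
import math

# Approximate semi-major axis (AU) and orbital period (years) for six planets.
DATA = [
    (0.387, 0.241),   # Mercury
    (0.723, 0.615),   # Venus
    (1.000, 1.000),   # Earth
    (1.524, 1.881),   # Mars
    (5.203, 11.862),  # Jupiter
    (9.537, 29.457),  # Saturn
]

# Building blocks: a printable name and a function of the distance d.
ATOMS = {
    "1": lambda d: 1.0,
    "d": lambda d: d,
    "sqrt(d)": lambda d: math.sqrt(d),
    "d^2": lambda d: d * d,
}

# Basic operators used to combine two building blocks.
OPERATORS = {
    "+": lambda a, b: a + b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
}


def candidates():
    """Yield (text, function) for every 'atom operator atom' expression."""
    for (na, fa), (nb, fb) in itertools.product(ATOMS.items(), repeat=2):
        for sym, op in OPERATORS.items():
            # Default arguments freeze the current loop values in each lambda.
            yield (f"({na} {sym} {nb})",
                   lambda d, fa=fa, fb=fb, op=op: op(fa(d), fb(d)))


def fit_scale(f, data):
    """Least-squares coefficient c for p = c * f(d), with its squared error."""
    num = sum(p * f(d) for d, p in data)
    den = sum(f(d) ** 2 for d, _ in data)
    c = num / den
    err = sum((p - c * f(d)) ** 2 for d, p in data)
    return c, err


def search(data):
    """Return (error, expression, coefficient) tuples, best fit first."""
    results = []
    for text, f in candidates():
        c, err = fit_scale(f, data)
        results.append((err, text, c))
    results.sort(key=lambda r: r[0])
    return results


if __name__ == "__main__":
    for err, text, c in search(DATA)[:3]:
        print(f"p = {c:.3f} * {text}   squared error = {err:.5f}")
```

On this data the best-ranked expressions are all equivalent to d raised to the power 1.5 with a coefficient close to 1, which is Kepler's third law in units of years and astronomical units. Real symbolic regression systems search far larger expression spaces and use much smarter search strategies than exhaustive enumeration.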
AI-Descartes offers a few advantages over other systems, but its most distinctive feature is its ability to logically reason, says Cristina Cornelio, a research scientist at Samsung AI in Cambridge, England, who is first author on the paper. If there are multiple candidate equations that fit the data well, the system identifies which equations fit best with background scientific theory. The ability to reason also distinguishes the system from “generative AI” programs such as ChatGPT, whose large language model has limited logical skills and sometimes messes up basic math.
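The idea of choosing among well-fitting candidates by consulting background theory can be sketched in miniature. The example below is only a stand-in: it replaces logical reasoning with simple numerical spot checks, which can reject a candidate but cannot prove that one follows from a theory. The three candidate equations, their coefficients, and the three "background" conditions (no gas adsorbed at zero pressure, loading never decreases with pressure, loading levels off because the number of sites is finite) are assumptions made for illustration, loosely inspired by Langmuir's picture of discrete sites.

```python
# Hypothetical candidate equations for gas loading q as a function of
# pressure p, standing in for the output of a regression step.
# Forms and coefficients are invented for illustration.
CANDIDATES = {
    "q = 2p / (1 + 0.5p)": lambda p: 2 * p / (1 + 0.5 * p),
    "q = 1.1 * sqrt(p)": lambda p: 1.1 * p ** 0.5,
    "q = 0.3 + 0.9p - 0.05p^2": lambda p: 0.3 + 0.9 * p - 0.05 * p * p,
}

# Pressures from 0.0 to 50.0 used for the spot checks.
PRESSURE_GRID = [i * 0.5 for i in range(101)]


def zero_at_zero_pressure(f):
    """No gas should be adsorbed when there is no gas pressure."""
    return abs(f(0.0)) < 1e-9


def never_decreasing(f):
    """Raising the pressure should never lower the amount adsorbed."""
    values = [f(p) for p in PRESSURE_GRID]
    return all(b >= a for a, b in zip(values, values[1:]))


def saturates(f, big=1e6, tol=1e-3):
    """A finite number of sites implies the loading levels off at high pressure."""
    a, b = f(big), f(10 * big)
    return abs(b - a) <= tol * max(abs(a), 1.0)


CHECKS = [zero_at_zero_pressure, never_decreasing, saturates]


def consistent_candidates(candidates=CANDIDATES):
    """Names of the candidates that pass every background-theory spot check."""
    return [name for name, f in candidates.items()
            if all(check(f) for check in CHECKS)]


if __name__ == "__main__":
    for name in consistent_candidates():
        print("consistent with background checks:", name)
```

Only the first candidate, which has the Langmuir-like form, passes all three checks. The square-root model never levels off, and the quadratic model predicts adsorption at zero pressure and eventually decreases.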
“In our work, we are merging a first-principles approach, which has been used by scientists for centuries to derive new formulas from existing background theories, with a data-driven approach that is more common in the machine learning era,” Cornelio says. “This combination allows us to take advantage of both approaches and create more accurate and meaningful models for a wide range of applications.”
The name AI-Descartes is a nod to 17th-century mathematician and philosopher René Descartes, who argued that the natural world could be described by a few fundamental physical laws and that logical deduction played a key role in scientific discovery.
Suited for real-world data
The system works particularly well on noisy, real-world data, which can trip up traditional symbolic regression programs that might overlook the real signal in an effort to find formulas that capture every errant zig and zag of the data. It also handles small data sets well, even finding reliable equations when fed as few as ten data points.
One factor that might slow down the adoption of a tool like AI-Descartes for frontier science is the need to identify and code associated background theory for open scientific questions. The team is working to create new datasets that contain both real measurement data and an associated background theory to refine their system and test it on new terrain.
They would also like to eventually train computers to read scientific papers and construct the background theory themselves.
“In this work, we needed human experts to write down, in formal, computer-readable terms, what the axioms of the background theory are, and if the human missed any or got any of those wrong, the system won’t work,” says co-author Tyler Josephson, assistant professor of Chemical, Biochemical and Environmental Engineering at UMBC. “In the future,” he says, “we’d like to automate this part of the work as well, so we can explore many more areas of science and engineering.”
This goal motivates Josephson’s research on AI tools to advance chemical engineering.
Ultimately, the team hopes their AI-Descartes, like the real person, may inspire a productive new approach to science. “One of the most exciting aspects of our work is the potential to make significant advances in scientific research,” Cornelio says.
Reference: “Combining Data and Theory for Derivable Scientific Discovery with AI-Descartes” 12 April 2023, Nature Communications.
DOI: 10.1038/s41467-023-37236-y
Funding: Defense Advanced Research Projects Agency