News Release

New AI model explores massive chemical space with minimal data

Starting with just 58 data points, a new artificial intelligence model identified four battery electrolytes that rival the state of the art

Peer-Reviewed Publication

University of Chicago

Image: A new paper from the lab of Asst. Prof. Chibueze Amanchukwu of the University of Chicago Pritzker School of Molecular Engineering describes an active learning model that explored a virtual search space of one million potential battery electrolytes starting from just 58 data points.

Credit: UChicago Pritzker School of Molecular Engineering / Stephen L. Garrett

In an ideal world, an AI model looking for new materials to build better batteries would be trained on millions or even hundreds of millions of data points. 

But for emerging next-generation battery chemistries that don’t have decades of research behind them, waiting for new studies takes time the world doesn't have.

“Each experiment takes up to weeks, months to get data points,” said University of Chicago Pritzker School of Molecular Engineering (UChicago PME) Schmidt AI in Science Postdoctoral Fellow Ritesh Kumar. “It's just infeasible to wait until we have millions of data to train these models.”

Kumar is co-first author of a recent paper in Nature Communications describing an active learning model that explored a virtual search space of one million potential battery electrolytes starting from just 58 data points. From this minimal data, the team from the lab of UChicago PME Asst. Prof. Chibueze Amanchukwu identified four distinct new electrolyte solvents that rival state-of-the-art electrolytes in performance.

To refine a model built on so small a data set, the team used experiments as outputs: they tested the battery components the AI suggested, then fed those results back into the AI for further refinement.

“Often in the literature, we see computational proxies as an output, but there is still a difference between a computational proxy and a real-world experiment. So here we bit the bullet and went all the way to experiments as a final output,” he said. “If the model suggested, ‘Okay, go get an electrolyte in this chemical space,’ then we actually built a battery with that electrolyte, and we cycled the battery to get the data. The ultimate experiment we care about is: Does this battery have long cycle life?”

Trust but verify

Having an AI extrapolate across a million potential molecules from just 58 data points can be fraught. The more a machine has to extrapolate, the greater the potential for spurious results, the chemical equivalent of a DALL-E portrait with six fingers or ChatGPT spewing gibberish.

“The model will be not very accurate initially, so it will have some prediction, and it will also have uncertainty associated with the prediction,” Kumar said. 

Predictions from an AI trained on millions of data points would theoretically be more trustworthy than those from one trained on 58, so the team verified along the way, testing and retesting to find electrolytes with the best discharge capacity.

In total, the team ran seven active learning campaigns with about 10 electrolytes tested in each before they zeroed in on four new electrolytes with top-tier performance.
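The loop described here (predict with uncertainty, test the suggestions in real cells, feed the results back) can be sketched in a few lines of Python. This is a minimal illustration, not the team's model: the one-dimensional descriptor, the `run_experiment` stand-in for building and cycling a battery, the distance-weighted surrogate and the `kappa` exploration weight are all assumptions made for the example, and the pool is shrunk from one million candidates to 2,000 to keep it fast.

```python
import math
import random


def run_experiment(x, rng):
    """Hypothetical stand-in for building and cycling a cell (toy response)."""
    return math.exp(-((x - 0.7) ** 2) / 0.02) + rng.gauss(0, 0.02)


def predict(x, labeled, length=0.05):
    """Return (prediction, uncertainty) for candidate x.

    Assumed surrogate: a distance-weighted mean of known results, with an
    uncertainty that grows as x moves away from every labeled point.
    """
    weights = [math.exp(-((x - xi) ** 2) / (2 * length ** 2)) for xi, _ in labeled]
    total = sum(weights)
    prior = sum(y for _, y in labeled) / len(labeled)
    mean = (sum(w * y for w, (_, y) in zip(weights, labeled)) + prior) / (total + 1.0)
    return mean, 1.0 / (1.0 + total)


def active_learning(pool, seed, rounds=7, batch=10, kappa=1.0, rng=None):
    """Run `rounds` campaigns, testing the `batch` top-scoring candidates in each."""
    rng = rng or random.Random(0)
    labeled = list(seed)
    untested = list(pool)
    for _ in range(rounds):
        def score(x):
            mean, uncertainty = predict(x, labeled)
            return mean + kappa * uncertainty  # favor promising or unexplored regions

        untested.sort(key=score, reverse=True)
        picks, untested = untested[:batch], untested[batch:]
        # "Experiments as outputs": measured results go back into the training set.
        labeled.extend((x, run_experiment(x, rng)) for x in picks)
    return labeled


rng = random.Random(0)
pool = [i / 1999 for i in range(2000)]               # toy candidate space
seed = [(x, run_experiment(x, rng))                  # 58 starting data points
        for x in (rng.uniform(0.0, 0.5) for _ in range(58))]
results = active_learning(pool, seed, rng=rng)
best_x, best_y = max(results, key=lambda pair: pair[1])
print(len(results), "labeled points; best candidate at x =", round(best_x, 3))
```

With seven rounds of 10, the sketch runs 70 simulated experiments on top of the 58 seed points. Picking the top-scoring candidates greedily tends to cluster each batch in one region; a real campaign would likely spread its picks more deliberately.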

“There's no way we can remove the inefficiency of machine learning and AI models completely, but we should take advantage of what it's good at, like we did in this case,” Kumar said. “The other alternative was that we do experiments on all one million electrolytes, which was not possible.”

Predictive to generative

One possible area of future study is tossing even the 58 data points and having an AI create new molecules from scratch, said co-first author Peiyuan Ma, PhD’24.

Currently, the lab’s AI model extrapolates molecules from existing molecules other researchers have described and compiled in databases. Turning a truly generative AI loose on the massive chemical space – potentially as many as 10 to the 60th power molecules, or a one with 60 zeroes after it – could result in novel configurations no scientist has ever dreamed of.

“That would mean we're no longer limited by the existing literature,” Ma said. “The model, in principle, can suggest some molecules that do not exist in any database.”

Future AI models also need to evaluate potential electrolytes on multiple criteria. Current AI models evaluate battery components based on one factor, usually related to cycle life, Ma said. Cycle life is a battery’s most important performance aspect, but it is far from the only feature needed to make a battery that would be useful and impactful in the real world.

“For an electrolyte to be successfully commercialized, it needs to meet multiple criteria, like base capacity, safety, even cost,” Ma said. “We need future AI models to further filter the work, to pull the best electrolytes out from the best-performing electrolytes.”
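One simple way to picture this kind of multi-criteria filtering is a Pareto filter, which keeps only the candidates that no other candidate beats on every criterion at once. The sketch below is an illustration of that idea, not a method from the paper: the `Candidate` fields, the scoring directions and every number are placeholder values invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    cycle_life: float  # higher is better (placeholder units)
    safety: float      # higher is better (placeholder score)
    cost: float        # lower is better (placeholder units)


def dominates(a, b):
    """True if a is at least as good as b on every criterion and better on one."""
    no_worse = (a.cycle_life >= b.cycle_life
                and a.safety >= b.safety
                and a.cost <= b.cost)
    better = (a.cycle_life > b.cycle_life
              or a.safety > b.safety
              or a.cost < b.cost)
    return no_worse and better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Invented values for illustration only; not measurements from the study.
candidates = [
    Candidate("A", cycle_life=300, safety=0.9, cost=40),
    Candidate("B", cycle_life=320, safety=0.6, cost=35),
    Candidate("C", cycle_life=280, safety=0.8, cost=60),
    Candidate("D", cycle_life=320, safety=0.6, cost=50),
]
front = pareto_front(candidates)
print([c.name for c in front])  # prints ['A', 'B']
```

Here C drops out because A matches or beats it everywhere, and D drops out because B offers the same performance at lower cost. A and B both survive because each wins on a different criterion.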

Turning to AI and machine learning to find new molecules can help remove the blinders from science, Kumar said. There’s a natural human inclination to home in on chemical spaces that have already shown promising results rather than study new areas that could either change the world or waste time and resources.

“We are always biased toward what’s already available to us, but AI can provide us a way to come out of our bias,” Kumar said.

Citation: “Active learning accelerates electrolyte solvent screening for anode-free lithium metal batteries,” Ma et al., Nature Communications, September 25, 2025. DOI: 10.1038/s41467-025-63303-7

