image: A new connection between human and machine learning has been discovered: While conceptual regions in human cognition have long been modelled as convex regions, Tetkova et al. present new evidence that convexity plays a similar role in AI. So-called pretraining by self-supervision leads to convexity of conceptual regions, and the more convex the regions are, the better the model will learn a given specialist task in supervised fine-tuning.
Credit: DTU
New research reveals a surprising geometric link between human and machine learning. A mathematical property called convexity may help explain how brains and algorithms form concepts and make sense of the world.
In recent years, with the public availability of AI tools, more people have become aware of how closely the inner workings of artificial intelligence can resemble those of a human brain.
There are several similarities in how machines and human brains work, for example, in how they represent the world in abstract form, generalise from limited data, and process data in layers. A new paper in Nature Communications by DTU researchers is adding another feature to the list: Convexity.
"We found that convexity is surprisingly common in deep networks and might be a fundamental property that emerges naturally as machines learn," says Lars Kai Hansen, a DTU Compute professor who led the study.
Convexity may bridge human and machine intelligence
To briefly explain the concept, when we humans learn about a "cat," we don't just store a single image but build a flexible understanding that allows us to recognise all sorts of cats—big or small, fluffy or sleek, black or white, and so on.
The term convexity comes from mathematics, where it describes, for example, geometric shapes. It was applied to cognitive science by Peter Gärdenfors, who proposed that our brains form conceptual spaces where related ideas cluster. And here's the crucial part: natural concepts, like "cat" or "wheel," tend to form convex regions in these mental spaces. In short, one could imagine a rubber band stretching around a group of similar ideas—that's a convex region.
Think of it like this: Inside the perimeter of the rubber band, if you have two points representing two different cats, any point on the shortest path between them also falls within the mental "cat" region. Such convexity is powerful as it helps us generalise from a few examples, learn new things quickly, and even helps us communicate and agree on what things mean. It's a fundamental property that makes human learning robust, flexible and social.
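For readers who want the formal version, the standard textbook definition (ordinary mathematical notation, not taken from the paper itself) is short:

    C \text{ is convex} \iff \forall\, x, y \in C,\ \forall\, t \in [0, 1]:\; t\,x + (1 - t)\,y \in C

In words: take any two points in the region and any blend of the two, and the blended point is still inside the region.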
Deep learning models - the engines behind everything from image generation to chatbots - learn by transforming raw data like pixels or words into complex internal representations, often called "latent spaces." These spaces can be viewed as internal maps where the AI organises its understanding of the world.
Measuring AI's internal structure
To make AI more reliable, trustworthy and aligned with human values, there is a need to develop better ways to describe how it represents knowledge. Therefore, it is critical to determine whether machine-learned spaces are organised in a way that resembles human conceptual spaces and whether they also form convex regions for concepts.
First author of the paper, Lenka Tetkova, who is a postdoc at DTU Compute, dove into this very question, looking at two main types of convexity:
First is Euclidean convexity, which is straightforward: if you take two points within a concept in a model's latent space, and the straight line between them stays entirely within that concept, then the region is Euclidean convex. This is like generalising by blending known examples.
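As a rough illustration of the idea (a minimal Python sketch, not the authors' actual tooling; the array of same-concept latent vectors, the decision function assign_concept and all parameter names are assumptions made for the example), one could sample points along the straight segment between pairs of latent vectors from one concept and count how often those points are still assigned to that concept:

    # Minimal sketch of a Euclidean convexity check (illustrative assumptions only):
    # `latents` is an array of latent vectors that all belong to one concept, and
    # `assign_concept` is some decision rule (e.g. a classifier head) mapping a
    # latent vector to a concept label. Neither comes from the paper.
    import numpy as np

    def euclidean_convexity_score(latents, concept_label, assign_concept,
                                  n_steps=10, n_pairs=200, seed=0):
        """Fraction of points on straight segments between same-concept pairs
        that are still assigned to the concept."""
        rng = np.random.default_rng(seed)
        inside, total = 0, 0
        for _ in range(n_pairs):
            # Pick two examples of the concept and walk the straight line between them.
            i, j = rng.choice(len(latents), size=2, replace=False)
            for t in np.linspace(0.0, 1.0, n_steps):
                point = (1.0 - t) * latents[i] + t * latents[j]
                inside += int(assign_concept(point) == concept_label)
                total += 1
        return inside / total

A score close to 1 would mean that the concept region behaves convexly under straight-line interpolation.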
The other is graph convexity, which is more flexible and especially important for the curved geometries often found in AI's internal representations. Imagine a network of similar data points—if the shortest path between two points within a concept stays entirely inside that concept, then it's graph convex. This reflects how models might generalise by following the natural structure of the data.
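The graph version can be sketched in the same spirit, again only as an illustration under assumptions (a k-nearest-neighbour graph over the latent points, hypothetical function and parameter names, and scikit-learn plus NetworkX as stand-in libraries): build a neighbourhood graph over all points, trace shortest paths between pairs from the same concept, and count how many nodes along those paths carry that concept's label:

    # Minimal sketch of a graph convexity check (illustrative assumptions only):
    # `latents` holds latent vectors for the whole dataset and `labels` their
    # concept labels; the helper name and parameters are invented for the example.
    import numpy as np
    import networkx as nx
    from sklearn.neighbors import kneighbors_graph

    def graph_convexity_score(latents, labels, concept_label,
                              n_neighbors=10, n_pairs=200, seed=0):
        """Fraction of nodes on shortest paths between same-concept pairs
        that share the concept label."""
        # k-nearest-neighbour graph over all latent points, weighted by Euclidean distance.
        adjacency = kneighbors_graph(latents, n_neighbors=n_neighbors, mode="distance")
        graph = nx.from_scipy_sparse_array(adjacency)  # requires NetworkX >= 2.7

        concept_nodes = np.flatnonzero(np.asarray(labels) == concept_label)
        rng = np.random.default_rng(seed)
        inside, total = 0, 0
        for _ in range(n_pairs):
            i, j = rng.choice(concept_nodes, size=2, replace=False)
            try:
                path = nx.shortest_path(graph, source=int(i), target=int(j), weight="weight")
            except nx.NetworkXNoPath:
                continue  # skip pairs that end up in disconnected components
            for node in path:
                inside += int(labels[node] == concept_label)
                total += 1
        return inside / total if total else float("nan")

Here the "shortest path" follows edges of the neighbourhood graph rather than a straight line, which is what makes this notion suitable for curved latent geometries.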
"We've developed new tools to measure convexity within the complex latent spaces of deep neural networks. We tested these measures across various AI models and data types: images, text, audio, human activity, and even medical data. And we found that the same geometric principle that helps humans form and share concepts—convexity—also shapes how machines learn, generalise, and align with us," says Lenka Tetkova.
AI's hidden order
The researchers also discovered that these commonalities are found both in pretrained models, which learn general patterns from massive datasets, and in fine-tuned models, which are taught specific tasks like identifying animals. This further substantiates the claim that convexity might be a fundamental property that emerges naturally as machines learn.
When models are fine-tuned for a specific task, the convexity of their decision regions increases. As AI improves at classification, its internal concept regions become more clearly convex, refining its understanding and sharpening its boundaries.
In addition, the researchers discovered that the level of convexity in a pretrained model's concepts can predict how well that model will perform after fine-tuning.
"Imagine that a concept, say, a cat, forms a nice, well-defined convex region in the machine before it's even taught to identify cats specifically. Then it's more likely to learn to identify cats accurately later on. We believe this is a powerful insight, because it suggests that convexity might be a useful indicator of a model's potential for specific learning tasks," says Lars Kai Hansen.
A route to better AI
According to the researchers, these new results may have several important implications. By identifying convexity as a pervasive property, they have gained a better understanding of how deep neural networks learn and organise information. It provides a concrete mechanism for how AI generalises, one that may resemble how humans learn.
If convexity does prove to be a reliable predictor of performance, it may be possible to design AI models that explicitly encourage the formation of convex concept regions during training. This could lead to more efficient and effective learning, especially in scenarios where only a few examples are available. The findings may therefore provide a crucial new bridge between human cognition and machine intelligence.
"By showing that AI models exhibit properties (like convexity) that are fundamental to human conceptual understanding, we move closer to creating machines that 'think' in ways that are more comprehensible and aligned with our own. This is vital for building trust and collaboration between humans and machines in critical applications like healthcare, education, and public service," says Lenka Tetkova.
"While there's still much to explore, the results suggest that the seemingly abstract idea of convexity may hold the key to unlocking new secrets on AI's internal workings and bringing us closer to intelligent and human-aligned machines."
The project
The research was carried out within the research project “Cognitive Spaces – Next generation explainable AI”, funded by the Novo Nordisk Foundation. The project’s aim is to open the machine-learning black box and build tools to explain the inner workings of AI systems with concepts that can be understood by specific user groups.
Journal
Nature Communications
Article Title
On convex decision regions in deep network representations
Article Publication Date
2-Jul-2025
COI Statement
The authors declare no competing interests.