image: Demonstration results of multi-modal instruction. The first row shows the visual stimuli, and the second row depicts the intermediate reconstructions. The manipulation results produced by the instruction “In the [V] style” are shown within red boxes in the last row. The images are sourced from Microsoft COCO.
Credit: Visual Intelligence, Tsinghua University Press
Decoding human thoughts from brain signals has long been a scientific aspiration, and recent progress in machine learning has enabled partial reconstruction of images from functional magnetic resonance imaging (fMRI) recordings. However, existing methods mainly attempt to recover what a person is seeing or imagining, without offering ways to interactively modify these mental images. Bridging the gap between brain activity and language remains a fundamental challenge, as brain signals are abstract and often ambiguous. Moreover, successful manipulation requires selectively altering only the relevant features while preserving the rest. Motivated by these challenges, the researchers sought to develop a system that could not only decode brain signals but also reshape the decoded images through guided instructions.
Researchers from the Tokyo Institute of Technology, Shanghai Jiao Tong University, KAUST, Nankai University, and MBZUAI have unveiled DreamConnect, a brain-to-image system that integrates fMRI with advanced diffusion models. Published (DOI: 10.1007/s44267-025-00081-2) in Visual Intelligence on July 2, 2025, the study introduces a dual-stream framework capable of interpreting brain activity and editing the decoded imagery with natural language prompts. The work demonstrates how artificial intelligence can progressively guide decoded brain signals toward user-desired outcomes, marking an important step in connecting human “dreams” with interactive visualization.
DreamConnect introduces a dual-stream diffusion framework specifically designed for fMRI manipulation. The first stream translates brain signals into rough visual content, while the second refines these images according to natural language instructions. To keep the two streams synchronized, the team developed an asynchronous diffusion strategy that lets the interpretation stream establish a semantic outline before the instruction stream applies its edits. For instance, when a participant imagines a horse and asks for it to become a unicorn, the system identifies the relevant visual features and transforms them accordingly.
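The sketch below illustrates the general idea of such an asynchronous dual-stream schedule; it is not the authors' implementation, and the toy denoiser, the embeddings, and the head-start parameter are hypothetical placeholders used only to show how one stream can run several denoising steps before the other joins in.

```python
# Minimal sketch of an asynchronous dual-stream denoising loop (assumed design,
# not the released DreamConnect code). The interpretation stream runs a few
# steps ahead so a rough semantic layout exists before instruction edits begin.
import torch

TOTAL_STEPS = 50
HEAD_START = 10  # interpretation stream runs alone for this many steps

def toy_denoise(latent, cond, t):
    # Stand-in for one reverse-diffusion step conditioned on `cond`.
    return latent - 0.1 * (latent - cond) * (t / TOTAL_STEPS)

fmri_embedding = torch.randn(1, 64)         # decoded fMRI features (placeholder)
instruction_embedding = torch.randn(1, 64)  # text embedding of the edit prompt
latent = torch.randn(1, 64)                 # shared latent being denoised

for t in reversed(range(TOTAL_STEPS)):
    # Interpretation stream: always active, anchors the latent to brain content.
    latent = toy_denoise(latent, fmri_embedding, t)

    # Instruction stream: joins only after the semantic outline has formed.
    if t < TOTAL_STEPS - HEAD_START:
        latent = toy_denoise(latent, instruction_embedding, t)

print("final latent shape:", tuple(latent.shape))
```

The head start gives the fMRI-driven stream time to settle the overall scene before the language-driven stream begins steering it, which is the intuition behind the asynchronous strategy described above.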
To further enhance precision, the framework employs large language models (LLMs) to pinpoint the spatial regions relevant to each instruction. This enables region-aware editing that avoids altering unrelated content. The researchers validated DreamConnect on the Natural Scenes Dataset (NSD), one of the largest fMRI collections, and generated synthetic instruction–image triplets with AI tools to expand the training data.
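To make the region-aware idea concrete, here is a minimal sketch, under the assumption that the LLM step ultimately yields a spatial mask for the instructed region; the helper `llm_region_mask` and the bounding box are hypothetical stand-ins, not part of the published system.

```python
# Sketch of region-aware blending: instruction edits are applied only inside
# an LLM-derived mask, so content outside the referenced region is preserved.
import numpy as np

def llm_region_mask(shape, box):
    """Binary mask for the region the (hypothetical) LLM maps the prompt to."""
    mask = np.zeros(shape, dtype=np.float32)
    y0, y1, x0, x1 = box
    mask[y0:y1, x0:x1] = 1.0
    return mask

reconstruction = np.random.rand(64, 64, 3)  # intermediate fMRI reconstruction
edited = np.random.rand(64, 64, 3)          # instruction-stream proposal

# Suppose "turn the horse into a unicorn" is localized to a box around the horse.
mask = llm_region_mask((64, 64), box=(8, 40, 16, 48))[..., None]

# Edited content inside the region, original reconstruction everywhere else.
output = mask * edited + (1.0 - mask) * reconstruction
print(output.shape)
```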
Experimental results showed that DreamConnect not only reconstructs images with fidelity comparable to state-of-the-art models but also surpasses existing methods in instruction-based editing. Unlike other systems that require intermediate reconstructions, DreamConnect interacts directly with brain signals, streamlining the process of concept manipulation and visual imagination.
“DreamConnect represents a fundamental shift in how we interact with brain signals,” said Deng-Ping Fan, senior author of the study. “Rather than passively reconstructing what the brain perceives, we can now actively influence and reshape those perceptions using language. This opens up extraordinary opportunities in fields ranging from creative design to therapeutic applications. Of course, we must also remain mindful of ethical implications, ensuring privacy and responsible use as such technologies advance. But the proof-of-concept is a powerful demonstration of what is possible.”
The potential applications of DreamConnect are wide-ranging. In creative industries, designers could directly translate and refine mental concepts into visual outputs, accelerating brainstorming and prototyping. In healthcare, patients struggling with communication might use the system to express thoughts or emotions visually, offering new tools for therapy and diagnosis. Educational platforms may one day harness the technology to explore imagination-driven learning. However, the authors caution that ethical safeguards are essential, given the risks of misuse and privacy invasion. With further development, DreamConnect could become a cornerstone of future human–AI collaboration, where imagination and technology merge seamlessly.
Funding information
This study was supported by the National Natural Science Foundation of China (No. 62476143).
About the Authors
Dr. Deng-Ping Fan is a full professor and department chair of Computer Science and Technology at Nankai University. Previously, he was a Team Lead at IIAI. Dr. Fan received his Ph.D. from Nankai University in 2019 and was a postdoctoral researcher at ETH Zurich. He has published numerous high-quality papers, with more than 31,000 citations on Google Scholar and an H-index of 62. Representative awards include a World Artificial Intelligence Conference Youth Outstanding Paper Award and two CVPR best paper nominations. Dr. Fan serves as an Associate Editor for IEEE TIP and as an area chair for top international AI conferences such as CVPR, NeurIPS, and MICCAI. He has been listed among the top 2% of global scientists by Stanford University. For more information, please visit his research homepage at https://dengpingfan.github.io/.
About Visual Intelligence
Visual Intelligence is an international, peer-reviewed, open-access journal devoted to the theory and practice of visual intelligence. This journal is the official publication of the China Society of Image and Graphics (CSIG), with Article Processing Charges fully covered by the Society. It focuses on the foundations of visual computing, the methodologies employed in the field, and the applications of visual intelligence, while particularly encouraging submissions that address rapidly advancing areas of visual intelligence research.
Journal
Visual Intelligence
Article Title
Connecting dreams with visual brainstorming instruction
Article Publication Date
2-Jul-2025