Article Highlight | 12-May-2025

Few-shot object detection via dual-domain feature fusion and patch-level attention

Tsinghua University Press

Researchers at the Institute of Automation, Chinese Academy of Sciences, and Peking University have developed a new approach to few-shot object detection. The method, detailed in the study "Few-shot Object Detection via Dual-domain Feature Fusion and Patch-level Attention," published in Tsinghua Science and Technology, improves the detection of novel object classes when only limited training data are available.

“In the rapidly evolving world of computer vision, few-shot object detection aims to detect objects belonging to new classes using only a limited number of labeled samples. This task is particularly challenging because annotated data for novel classes are scarce,” Peiyu Guan, the corresponding author of the study, told EurekAlert! “Traditional methods struggle to adapt to novel-class features and to extract fine-grained features, both of which are crucial for detecting new object classes.”

To address these challenges, the researchers developed a two-stream backbone network consisting of a base domain stream and an elementary domain stream. The base domain stream uses features learned from the abundant data of base classes, while the elementary domain stream preserves more category-agnostic features by directly using a model pre-trained on a large-scale classification dataset. By running these two streams in parallel, the proposed method, DFFPA, is able to extract more discriminative features.
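
The parallel two-stream idea can be illustrated with a toy sketch. It is not the authors' implementation: the class name TwoStreamBackbone, the linear helper, and the choice to start both streams from the same pre-trained weights are assumptions made only for illustration. The sketch shows one stream being updated on base-class data while the other keeps its pre-trained weights untouched, with both producing features for the same input.

```python
def linear(weights, x):
    """Tiny stand-in for a feature extractor: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


class TwoStreamBackbone:
    """Hypothetical toy model of a base domain stream plus an elementary domain stream."""

    def __init__(self, pretrained_weights):
        # Assumption: both streams start from the same pre-trained weights.
        self.elementary = [row[:] for row in pretrained_weights]  # never updated
        self.base = [row[:] for row in pretrained_weights]        # tuned on base classes

    def train_step(self, base_grads, lr=0.1):
        # Only the base domain stream receives gradient updates.
        self.base = [
            [w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(self.base, base_grads)
        ]

    def forward(self, x):
        # Both streams see the same input and run in parallel,
        # yielding one feature pair for later fusion.
        return linear(self.base, x), linear(self.elementary, x)


backbone = TwoStreamBackbone([[1.0, 0.0], [0.0, 1.0]])
backbone.train_step([[1.0, 1.0], [1.0, 1.0]], lr=0.5)
base_feat, elem_feat = backbone.forward([2.0, 4.0])
print(base_feat, elem_feat)
```

In this sketch the elementary stream returns the same features before and after training, which is the sense in which it stays category-agnostic, while the base stream's features shift with the base-class updates.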

“The core of DFFPA lies in its dual-domain feature fusion module, which adaptively fuses feature pairs from the two streams to create robust representations,” explained Junzhi Yu, a co-author from Peking University. “In addition, a patch-level attention mechanism is introduced to refine features at the ROI pooling stage. This step mines the relations among patches, which highlights crucial object parts and thus improves the model’s adaptability to novel classes.”
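
Both ideas in the quote can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the published module: the per-channel sigmoid gate in fuse_pair and the scaled dot-product form with a residual connection in patch_attention are common choices swapped in here, and all function and parameter names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fuse_pair(base_feat, elem_feat, gate_weights):
    """Fuse one feature pair with a per-channel gate in (0, 1).

    Assumption: the gate form is illustrative; the source only states
    that the fusion is adaptive.
    """
    fused = []
    for b, e, w in zip(base_feat, elem_feat, gate_weights):
        g = sigmoid(w * (b + e))             # hypothetical gate driven by both inputs
        fused.append(g * b + (1.0 - g) * e)  # convex mix of the two streams
    return fused


def softmax(scores):
    m = max(scores)                          # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [x / total for x in exps]


def patch_attention(patches):
    """Refine each ROI patch vector using its relations to all patches."""
    d = len(patches[0])
    refined = []
    for q in patches:
        # Similarity of this patch to every patch (scaled dot product).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in patches]
        weights = softmax(scores)
        # Weighted sum of all patches, added back to the original patch.
        context = [sum(w * p[c] for w, p in zip(weights, patches)) for c in range(d)]
        refined.append([qi + ci for qi, ci in zip(q, context)])
    return refined


fused = fuse_pair([2.0, 4.0], [0.0, 0.0], [0.0, 0.0])
refined = patch_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
print(fused, refined)
```

With zero gate weights the gate sits at 0.5, so the fused feature is the plain average of the two streams; learned weights would shift that balance per channel. In the attention step, patches that resemble many other patches receive more weight in each context vector, which is one way to emphasize recurring object parts.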

“Our method leverages both transfer learning and attention to achieve better performance in few-shot object detection,” added Zhiqiang Cao, a co-author of the study. “By enriching feature diversity and focusing on fine-grained features, the detection accuracy of novel object classes is improved.”

Extensive experiments on the PASCAL VOC and MS COCO datasets demonstrate the effectiveness of DFFPA.

“The potential applications of this technology are broad,” said Guangli Ren, the first author of the study. “From autonomous driving to robotics, few-shot object detection can significantly enhance the object detection capability of these systems, enabling them to quickly adapt to new scenes.”

The researchers plan to continue refining DFFPA and exploring its applications in real-world scenarios. With the continuous advancement of artificial intelligence and computer vision, few-shot object detection is expected to play a pivotal role in such applications.

Other contributors include Jierui Liu and Mengyao Wang from the State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.

This work was supported in part by the Beijing Natural Science Foundation and in part by the National Natural Science Foundation of China.


About the Authors

Guangli Ren received the B.E. degree in intelligent science and technology from Dalian Maritime University, Dalian, China, in 2015, and the M.E. degree in technology of computer application from Capital Normal University, Beijing, China, in 2018. In 2024, he received the Ph.D. degree in control theory and control engineering from the Institute of Automation, Chinese Academy of Sciences, Beijing, China. He is currently an assistant professor at the Institute of Software, Chinese Academy of Sciences. His research interests include visual SLAM and robotic manipulation.

Peiyu Guan received the B.E. degree in electronic information science and technology from Jilin University, Changchun, China, in 2017. In 2022, she received the Ph.D. degree in control theory and control engineering from the Institute of Automation, Chinese Academy of Sciences, Beijing, China. She is currently an assistant professor at the Institute of Automation, Chinese Academy of Sciences. Her research interests include service robots and image processing.

[1] G. Ren, J. Liu, M. Wang, P. Guan, Z. Cao, and J. Yu, "Few-Shot Object Detection via Dual-Domain Feature Fusion and Patch-Level Attention," Tsinghua Science and Technology, vol. 30, no. 3, pp. 1237-1250, June 2025, doi: 10.26599/TST.2024.9010031.

About Tsinghua Science and Technology

Tsinghua Science and Technology is sponsored by Tsinghua University and published bimonthly. It has a 2023 Impact Factor of 5.2 and ranks in Q1 in the "Computer Science, Software Engineering", "Computer Science, Information Systems", and "Engineering, Electrical & Electronic" categories of SCIE, according to JCR 2023. The journal aims to present achievements in computer science, electronic engineering, and other IT fields. It is indexed by SCIE, EI, Scopus, and other databases. Contributions from all over the world are welcome.
