image: The pipeline of the proposed Motion Cue Fusion Network (MCFNet). Voxels generated from raw event streams are first processed by the Event Correction Module (ECM) to produce high-quality event frames. These frames, along with RGB images, are fed into two separate CSPDarkNets for modality-specific feature extraction. The Event Dynamic Upsampling Module (EDUM) takes the features from the stage-3 layer to align the spatial resolutions of the two modalities, and the Cross-modal Mamba Fusion Module (CMM) then performs cross-modal fusion. An FPN combined with PANet further integrates multi-scale features, and finally the decoder outputs category and bounding-box predictions for each detected object.
Credit: Communications in Transportation Research
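For readers who think in code, the following PyTorch sketch mirrors the dataflow described in the figure. Every submodule here is a deliberately trivial placeholder (the real ECM, CSPDarkNet backbones, EDUM, and CMM are far more elaborate, and the FPN/PANet neck is omitted); the channel widths, 5-bin voxel input, and class count are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class MCFNetSketch(nn.Module):
    """Placeholder skeleton of the MCFNet dataflow; not the authors' implementation."""
    def __init__(self, c=64, num_classes=80):
        super().__init__()
        self.ecm = nn.Conv2d(5, 3, 3, padding=1)                        # ECM stand-in: voxels -> event frame
        self.backbone_rgb = nn.Conv2d(3, c, 3, stride=8, padding=1)     # stands in for the RGB CSPDarkNet
        self.backbone_evt = nn.Conv2d(3, c, 3, stride=16, padding=1)    # stands in for the event CSPDarkNet
        self.edum = nn.Upsample(scale_factor=2, mode='bilinear',
                                align_corners=False)                    # EDUM stand-in: match spatial scales
        self.cmm = nn.Conv2d(2 * c, c, 1)                               # CMM stand-in: cross-modal fusion
        self.head = nn.Conv2d(c, 4 + num_classes, 1)                    # box (4) + class logits

    def forward(self, rgb, event_voxels):
        evt_frame = self.ecm(event_voxels)                 # correct raw event voxels into event frames
        f_rgb = self.backbone_rgb(rgb)
        f_evt = self.edum(self.backbone_evt(evt_frame))    # upsample event features to the RGB scale
        fused = self.cmm(torch.cat([f_rgb, f_evt], dim=1))
        return self.head(fused)                            # per-location category + box predictions

rgb = torch.randn(1, 3, 256, 256)
voxels = torch.randn(1, 5, 256, 256)   # 5 temporal bins, an assumed voxelization
print(MCFNetSketch()(rgb, voxels).shape)   # torch.Size([1, 84, 32, 32])
```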
Conventional RGB cameras suffer from an intrinsic dynamic range limitation that reduces global contrast and causes the loss of high-frequency details such as textures and edges in complex, dynamic traffic environments (e.g., nighttime driving or tunnel scenes). This deficiency hinders the extraction of discriminative features and degrades the performance of frame-based traffic object detection. To address these problems, Professors Xiangmo Zhao and Zhanwen Liu and their research team at Chang’an University paired a bio-inspired event camera with an RGB camera to supply complementary high-dynamic-range information, and proposed the Motion Cue Fusion Network (MCFNet), a fusion network that achieves spatiotemporal alignment and adaptively fuses cross-modal features to overcome performance degradation under challenging lighting conditions.
They published their study on 18 August 2025, in Communications in Transportation Research.
The heterogeneity between RGB and event cameras
This heterogeneity causes spatiotemporal inconsistencies in the multimodal data, posing challenges for existing methods in multimodal feature extraction and alignment. First, in the temporal dimension, the microsecond-level resolution of event data is far higher than the millisecond-level resolution of RGB frames, resulting in temporal misalignment and making direct multimodal fusion infeasible.
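As context for this alignment problem, the snippet below shows one common way to discretize a microsecond-resolution event stream into the voxel representation the pipeline starts from. The 5-bin grid, polarity accumulation, and DAVIS-like sensor resolution are assumptions for illustration, not the authors' exact settings.

```python
import torch

def events_to_voxel(xs, ys, ts, ps, H, W, bins=5):
    """xs, ys: pixel coords; ts: microsecond timestamps; ps: polarities in {-1, +1}."""
    voxel = torch.zeros(bins, H, W)
    t_norm = (ts - ts.min()) / max(float(ts.max() - ts.min()), 1e-9)  # map times to [0, 1]
    b = (t_norm * (bins - 1)).long()                                  # nearest temporal bin per event
    voxel.index_put_((b, ys.long(), xs.long()), ps.float(), accumulate=True)
    return voxel

# 1000 synthetic events over a 10 ms window at 240x304 (DAVIS-like) resolution
n = 1000
xs, ys = torch.randint(0, 304, (n,)), torch.randint(0, 240, (n,))
ts = torch.sort(torch.randint(0, 10_000, (n,))).values   # microseconds
ps = torch.randint(0, 2, (n,)) * 2 - 1                    # polarities {-1, +1}
print(events_to_voxel(xs, ys, ts, ps, H=240, W=304).shape)   # torch.Size([5, 240, 304])
```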
To address this issue, the researchers designed an Event Correction Module (ECM) that temporally aligns asynchronous event streams with their corresponding image frames through optical-flow-based warping. The ECM is jointly optimized with the downstream object detection network to learn task-aware event representations.
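The article does not spell out the ECM internals beyond optical-flow-based warping, but the generic backward-warping operation such a module relies on looks like the sketch below. The flow here is a dummy zero tensor; in MCFNet it would be estimated and trained jointly with the detector.

```python
import torch
import torch.nn.functional as F

def warp_by_flow(feat, flow):
    """Backward-warp `feat` (B,C,H,W) with a per-pixel displacement `flow` (B,2,H,W)."""
    B, _, H, W = feat.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing='ij')
    grid = torch.stack((xs, ys), dim=-1).float().unsqueeze(0).expand(B, -1, -1, -1)
    grid = grid + flow.permute(0, 2, 3, 1)              # shift each sampling position by the flow
    grid[..., 0] = 2.0 * grid[..., 0] / (W - 1) - 1.0   # normalize x to [-1, 1] for grid_sample
    grid[..., 1] = 2.0 * grid[..., 1] / (H - 1) - 1.0   # normalize y to [-1, 1]
    return F.grid_sample(feat, grid, align_corners=True)

evt = torch.randn(1, 5, 64, 64)    # event voxel slices
flow = torch.zeros(1, 2, 64, 64)   # zero flow -> identity warp
print(torch.allclose(warp_by_flow(evt, flow), evt, atol=1e-5))  # True
```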
Furthermore, the Event Dynamic Upsampling Module (EDUM) enhances the spatial resolution of event data, aligning its distribution with the structure of image pixels and achieving precise spatiotemporal alignment.
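Dynamic (content-aware) upsampling is typically implemented by predicting per-pixel reassembly kernels, in the spirit of CARAFE. The sketch below shows that general pattern for a 2x upsample; the kernel size, scale factor, and layout are assumptions, not the EDUM's actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicUpsample2x(nn.Module):
    """Content-aware 2x upsampling via predicted per-pixel kernels (illustrative only)."""
    def __init__(self, c, k=3):
        super().__init__()
        self.k = k
        self.kernel_pred = nn.Conv2d(c, 4 * k * k, 1)   # one kxk kernel per x2 sub-position

    def forward(self, x):
        B, C, H, W = x.shape
        w = self.kernel_pred(x).view(B, 4, self.k * self.k, H, W).softmax(dim=2)
        patches = F.unfold(x, self.k, padding=self.k // 2).view(B, C, self.k * self.k, H, W)
        out = torch.einsum('bckhw,bgkhw->bcghw', patches, w).reshape(B, C * 4, H, W)
        return F.pixel_shuffle(out, 2)                  # (B, C, 2H, 2W)

x = torch.randn(1, 64, 16, 16)
print(DynamicUpsample2x(64)(x).shape)   # torch.Size([1, 64, 32, 32])
```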
The distribution of discriminative regions is inconsistent across modalities in unevenly illuminated scenes
In the study, the researchers observed that uneven illumination causes inconsistent distributions of discriminative regions across the two modalities. It is therefore necessary to dynamically balance the contributions of each modality to achieve robust cross-modal feature fusion.
To solve this, the researchers introduced the Cross-modal Mamba Fusion Module (CMM), which performs adaptive feature fusion through a novel cross-modal interlaced scanning mechanism, effectively integrating complementary information for robust detection performance.
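To make the interlaced-scanning idea concrete, the toy module below flattens both feature maps into raster-scan token sequences, interleaves them token by token, and runs a sequence model over the joint scan. An LSTM stands in for the selective state-space (Mamba) layer the paper uses, and the pairwise regrouping is a guess at how such a fusion could be wired, not the CMM's actual design.

```python
import torch
import torch.nn as nn

class InterlacedScanFusion(nn.Module):
    """Toy cross-modal interlaced scan: alternate RGB/event tokens in one sequence."""
    def __init__(self, c):
        super().__init__()
        self.seq = nn.LSTM(c, c, batch_first=True)   # placeholder for a Mamba block
        self.out = nn.Linear(2 * c, c)

    def forward(self, f_rgb, f_evt):
        B, C, H, W = f_rgb.shape
        r = f_rgb.flatten(2).transpose(1, 2)                       # (B, HW, C) raster-scan tokens
        e = f_evt.flatten(2).transpose(1, 2)
        mixed = torch.stack((r, e), dim=2).view(B, 2 * H * W, C)   # interleave: r1,e1,r2,e2,...
        y, _ = self.seq(mixed)                                     # scan across both modalities
        y = y.view(B, H * W, 2 * C)                                # regroup each (rgb, event) pair
        return self.out(y).transpose(1, 2).reshape(B, C, H, W)

f_rgb = torch.randn(1, 64, 8, 8)
f_evt = torch.randn(1, 64, 8, 8)
print(InterlacedScanFusion(64)(f_rgb, f_evt).shape)   # torch.Size([1, 64, 8, 8])
```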
Experiments on the DSEC-Det and PKU-DAVIS-SOD datasets demonstrate that MCFNet significantly outperforms existing methods in low-light and varying-exposure traffic scenarios. This work provides a framework for robust object detection in dynamic traffic scenes and a feasible direction for the subsequent development of intelligent transportation solutions.
The above research is published in Communications in Transportation Research (COMMTR), a fully open access journal co-published by Tsinghua University Press and Elsevier. COMMTR publishes peer-reviewed, high-quality research representing important advances of significance to emerging transport systems. COMMTR is also among the first transportation journals to make the Replication Package mandatory, helping researchers, practitioners, and the general public understand and advance existing knowledge. At its discretion, Tsinghua University Press will pay the open access fee for all papers published in 2025.
About Communications in Transportation Research
Communications in Transportation Research was launched in 2021, with academic support from Tsinghua University and the China Intelligent Transportation Systems Association. The Editors-in-Chief are Professor Xiaobo Qu, a member of the Academia Europaea, from Tsinghua University, and Professor Shuai’an Wang from Hong Kong Polytechnic University. The journal mainly publishes high-quality, original research and review articles of significant importance to emerging transportation systems, aiming to serve as an international platform for showcasing and exchanging innovative achievements in transportation and related fields and for fostering academic exchange and development between China and the global community.
It has been indexed in SCIE, SSCI, Ei Compendex, Scopus, CSTPCD, CSCD, OAJ, DOAJ, TRID, and other databases, and was selected as a Q1 Top Journal in the Engineering and Technology category of the Chinese Academy of Sciences (CAS) Journal Ranking List. In 2022, it was selected as a High-Starting-Point new journal project of the “China Science and Technology Journal Excellence Action Plan”. In 2024, it was selected for the “High-Level International Scientific and Technological Journals” development support project and, in the same year, as an English Journal Tier Project of the “China Science and Technology Journal Excellence Action Plan Phase II”. In 2024, it received its first impact factor (2023 IF) of 12.5, ranking Top 1 (1/58, Q1) among all journals in the “TRANSPORTATION” category. In 2025, its 2024 IF was announced as 14.5, maintaining the Top 1 position (1/61, Q1) in the same category.
Journal
Communications in Transportation Research
Article Title
Beyond Conventional Vision: RGB-Event Fusion for Robust Object Detection in Dynamic Traffic Scenarios
Article Publication Date
18-Aug-2025