Effective way to solve rotational inefficiency in autonomous traffic control—cutting data needs by 50%
By replacing absolute coordinates with a Relative Direction Layer, RDHNet converges up to 70% faster on cooperative navigation tasks, reaching top performance in as few as 2,000 training episodes.
Higher Education Press
Researchers at the National University of Defense Technology and Shanghai Jiao Tong University have developed RDHNet, a novel neural network architecture that significantly accelerates and stabilizes multi-agent reinforcement learning (for example, among teams of robots or automated vehicles). By automatically accounting for rotational symmetry, RDHNet eliminates redundant representations and improves decision-making in continuous-action tasks, addressing the inefficiency that arises when agents rely on absolute coordinates and must relearn the same behavior in every rotated version of a scenario.
Think of RDHNet like a GPS that automatically rotates the map so that “forward” always matches the agent’s direction, allowing it to learn a maneuver once and apply it from any orientation.
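The analogy can be made concrete in a few lines of code. The sketch below is our own minimal illustration, not the published implementation, and its function names are hypothetical: it measures a target's distance and angle in a frame centered on the agent and aligned with a reference neighbor, then checks that rotating the entire scene leaves that observation unchanged.

```python
# Minimal sketch (not the authors' code): an ego-centric, neighbor-aligned
# polar observation is unchanged when the whole scene is rotated.
import numpy as np

def relative_obs(agent_pos, reference_pos, target_pos):
    """Distance and angle of the target, measured in a frame centered on the
    agent and aligned with the direction toward a reference neighbor."""
    ref_vec = reference_pos - agent_pos      # defines the frame's "forward"
    tgt_vec = target_pos - agent_pos
    distance = np.linalg.norm(tgt_vec)
    angle = np.arctan2(tgt_vec[1], tgt_vec[0]) - np.arctan2(ref_vec[1], ref_vec[0])
    angle = (angle + np.pi) % (2 * np.pi) - np.pi   # wrap into [-pi, pi)
    return distance, angle

def rotate(point, theta):
    """Rotate a 2-D point about the origin by theta radians."""
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta),  np.cos(theta)]])
    return rot @ point

agent    = np.array([0.0, 0.0])
neighbor = np.array([1.0, 0.0])
prey     = np.array([2.0, 1.5])

theta = np.deg2rad(73.0)                    # arbitrary rotation of the whole scene
print(relative_obs(agent, neighbor, prey))
print(relative_obs(rotate(agent, theta), rotate(neighbor, theta), rotate(prey, theta)))
# Both lines print (2.5, 0.6435...): the relative observation does not change.
```

Because a policy trained on such relative quantities never sees the global orientation, a maneuver learned in one orientation transfers to every rotated copy of the scene.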
Symmetry-Aware Design Supercharges Multi-Agent Systems from Traffic Control to Power-Grid Management
Multi-agent systems power applications from autonomous traffic control and power grid management to cooperative robotics and team-based gaming. By compressing the model’s hypothesis space through symmetry-aware design, RDHNet reduces the amount of data and computation required to train agents, making it easier to deploy intelligent teams in real-world settings where absolute positioning may be unavailable or unreliable.
Up to 70% Faster Learning and Double the Score in Predator–Prey Tests
The study’s results demonstrate clear advantages of RDHNet across a range of cooperative and competitive scenarios:
- RDHNet converges 50%–70% faster than state-of-the-art baselines (e.g., COMIX, FACMAC, MADDPG) on cooperative navigation tasks, achieving top performance in as few as 2,000 training episodes.
- In predator–prey scenarios with up to nine agents, RDHNet achieves the highest average returns in four out of five settings, often doubling the score of competing methods.
- The architecture requires only relative observations—angle and distance between entities—and succeeds even when global coordinates or compass information are withheld.
“By embedding rotational and permutation symmetries directly into our network, we’ve slashed training time by up to 70% and doubled performance in complex multi-agent tasks—without any reliance on absolute positioning. This opens the door to truly resilient, sample-efficient coordination for everything from autonomous traffic grids to robotic swarms,” says Prof. Minglong Li.
Relative Direction Layer Replaces GPS: Lightweight Hypernetworks Deliver Rotation-Invariant Policies
RDHNet replaces absolute coordinate inputs with a Relative Direction Layer that computes each agent’s observations in a polar coordinate frame centered on itself and aligned to a reference neighbor. Direction-independent features (like speed or size) are encoded via a small multilayer perceptron, while distances use radial basis functions and angles use sine–cosine embeddings. These per-entity embeddings feed into lightweight hypernetworks and a symmetric aggregation module to produce rotation- and permutation-invariant representations. Finally, a value-decomposition backbone (COMIX) generates decentralized policies and action-value estimates with guaranteed monotonicity—all without relying on absolute directional data.
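To show the shape of this pipeline, the following PyTorch sketch illustrates the per-entity encoding and symmetric aggregation in simplified form. It is our own illustration under assumed layer sizes, not the released RDHNet code; it omits the hypernetworks and the COMIX value-decomposition backbone, and the class and parameter names (RelativeEntityEncoder, num_rbf, embed_dim) are hypothetical.

```python
# Simplified sketch of the encoding-and-aggregation idea: RBF features for
# distances, sine-cosine features for angles, a small MLP for
# direction-independent features, and a symmetric mean over entities,
# which makes the representation permutation-invariant.
import torch
import torch.nn as nn

class RelativeEntityEncoder(nn.Module):
    def __init__(self, feat_dim, num_rbf=16, embed_dim=64):
        super().__init__()
        # Fixed RBF centers spread over the expected range of distances.
        self.register_buffer("centers", torch.linspace(0.0, 5.0, num_rbf))
        self.gamma = 10.0
        self.feat_mlp = nn.Sequential(nn.Linear(feat_dim, embed_dim), nn.ReLU())
        # Inputs per entity: MLP features + RBF distance features + (sin, cos) of the angle.
        self.mix = nn.Sequential(nn.Linear(embed_dim + num_rbf + 2, embed_dim), nn.ReLU())

    def forward(self, dist, angle, feats):
        # dist, angle: (batch, n_entities); feats: (batch, n_entities, feat_dim)
        rbf = torch.exp(-self.gamma * (dist.unsqueeze(-1) - self.centers) ** 2)
        ang = torch.stack([torch.sin(angle), torch.cos(angle)], dim=-1)
        per_entity = self.mix(torch.cat([self.feat_mlp(feats), rbf, ang], dim=-1))
        # Mean over entities: any reordering of the inputs yields the same output.
        return per_entity.mean(dim=1)

encoder = RelativeEntityEncoder(feat_dim=4)
dist  = torch.rand(8, 5)                 # 8 agents, 5 observed entities each
angle = torch.rand(8, 5) * 2 * torch.pi  # relative angles
feats = torch.rand(8, 5, 4)              # direction-independent features (e.g., speed)
perm  = torch.randperm(5)

out = encoder(dist, angle, feats)
out_permuted = encoder(dist[:, perm], angle[:, perm], feats[:, perm])
print(torch.allclose(out, out_permuted, atol=1e-6))   # True: entity order does not matter
```

Because the mean pooling is symmetric in its inputs, reordering the observed entities cannot change an agent's representation; combined with the relative polar inputs, this is what removes both permutation and rotation redundancy from the hypothesis space.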
Breaking Symmetry Barriers: New Architecture Unlocks Sample-Efficient, Scalable Team-Based Reinforcement Learning
By formally addressing both permutation and continuous rotational symmetries in multi-agent reinforcement learning, RDHNet opens the door to more sample-efficient, robust, and scalable solutions for complex team-based tasks. The code is available at github.com/wang88256187/RDHNet. This research article was published in Frontiers of Computer Science in April 2025 (https://doi.org/10.1007/s11704-025-41250-2).