Encirclement control is a major topic in multi-robot systems research. It has wide civilian and military applications, including cooperative escort, capture of hostile targets, reconnaissance and surveillance, unmanned surface vessel patrols, and hunting.
The core challenge in these applications is controlling a multi-robot system that must allocate robots among multiple targets while simultaneously encircling those targets and avoiding collisions. This is particularly difficult for decentralized multi-robot systems.
Professor Pu Zhiqiang's team at the Institute of Automation, Chinese Academy of Sciences, presented a paper at ICRA 2022 proposing a deep reinforcement learning method based on relationship graphs. The method adapts well to the multi-target encirclement with collision avoidance (MECA) problem under a range of conditions.
Problem Formulation
The study defines the MECA task as follows: in an environment containing L static obstacles (black circles), a multi-robot system of N robots (green circles) must cooperatively encircle K (1 < K < N) stationary or moving targets (red circles).
All robots must autonomously form multiple groups to encircle all targets. Each group is required to form a circular formation to encircle an individual target while avoiding collisions. This involves addressing the following three sub-problems:
1) Dynamic multi-target allocation and grouping
2) Encirclement of targets by each group
3) Collision avoidance between groups
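As a rough illustration of these three sub-problems, the following Python sketch uses hand-written geometric rules: nearest-target grouping, evenly spaced slots on a circle around a target, and a pairwise distance check. This is not the paper's method, which learns such behavior with deep reinforcement learning. The function names, coordinates, encirclement radius, and safety distance are all assumptions chosen for illustration.

```python
import math

def assign_nearest_target(robots, targets):
    """Greedy allocation (illustrative only): each robot picks its closest target."""
    return [
        min(range(len(targets)), key=lambda k: math.dist(r, targets[k]))
        for r in robots
    ]

def circle_slots(center, radius, count):
    """Evenly spaced positions on a circle of the given radius around center."""
    cx, cy = center
    return [
        (cx + radius * math.cos(2 * math.pi * i / count),
         cy + radius * math.sin(2 * math.pi * i / count))
        for i in range(count)
    ]

def has_collision(points, min_dist):
    """True if any two points are closer than min_dist."""
    return any(
        math.dist(points[i], points[j]) < min_dist
        for i in range(len(points))
        for j in range(i + 1, len(points))
    )

# Hypothetical scenario: N = 6 robots, K = 2 targets (all coordinates assumed).
robots = [(0.0, 0.0), (1.0, 0.5), (0.5, -1.0), (9.0, 9.0), (10.0, 8.0), (8.5, 10.0)]
targets = [(1.0, 0.0), (9.0, 9.5)]

groups = assign_nearest_target(robots, targets)           # target index per robot
slots = circle_slots(targets[0], radius=1.0, count=groups.count(0))
safe = not has_collision(slots, min_dist=0.5)             # assumed safety distance
```

A fixed rule like this breaks down when targets move, when groups become unbalanced, or when obstacles block a slot, which is the gap a learned decentralized policy is meant to close.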
Illustration of MECA for a decentralized multi-robot system
Method
The MECA problem involves three types of entities: robots, targets, and obstacles. Each type influences a robot differently: obstacles must be avoided, targets must be encircled, and other robots must be cooperated with.
The study proposes a decentralized deep reinforcement learning (DRL) approach based on robot-level and target-level relationship graphs (RGs), called MECA-DRL-RG.
Specifically:
1. Use Graph Attention Networks (GATs) to model and learn robot-level RGs. These consist of three heterogeneous relationship graphs that link each robot to the other robots, to the targets, and to the obstacles.
2. Use a GAT to construct target-level RGs that capture the spatial relationships between the robots and each target. The target-level RG models target motion and is trained with supervised learning to predict target trajectories.
3. Define a knowledge-embedded composite reward function to address the multi-target problem in MECA, and train the policy network with an actor-critic algorithm under the centralized training and decentralized execution framework.
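The attention step behind a robot-level RG can be sketched as a single-head, GAT-style aggregation applied once per entity type. The sketch below is a minimal, standard-library illustration and not the paper's architecture: the feature layout (relative x, y positions), the weight matrix W, and the attention vector a are hand-picked stand-ins for learned parameters, and the same weights are shared across the three graphs only for brevity.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def softmax(scores):
    m = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_aggregate(h_self, h_neighbors, W, a):
    """Single-head GAT-style update for one node.

    score_j = LeakyReLU(a . [W h_self || W h_j]); weights = softmax over j;
    output = weighted sum of the transformed neighbor features.
    """
    z_self = matvec(W, h_self)
    z_nbrs = [matvec(W, h) for h in h_neighbors]
    scores = [
        leaky_relu(sum(ai * zi for ai, zi in zip(a, z_self + z)))
        for z in z_nbrs
    ]
    alphas = softmax(scores)
    out = [sum(al * z[d] for al, z in zip(alphas, z_nbrs))
           for d in range(len(z_self))]
    return out, alphas

# Assumed toy parameters: identity transform, attention that favors nearby entities.
W = [[1.0, 0.0], [0.0, 1.0]]
a = [0.0, 0.0, -0.5, -0.5]
h_robot = [0.0, 0.0]
neighbors = {                             # relative positions (hypothetical values)
    "robots": [[1.0, 0.0], [4.0, 3.0]],
    "targets": [[2.0, 2.0], [6.0, 1.0]],
    "obstacles": [[3.0, 0.0]],
}

embedding, weights = [], {}
for kind, feats in neighbors.items():
    out, alphas = attention_aggregate(h_robot, feats, W, a)
    embedding += out                      # concatenate the three per-type summaries
    weights[kind] = alphas
```

Because the attention weights are normalized over however many neighbors are present, the embedding keeps a fixed size as entities are added or removed, which is one plausible reason a graph-based encoder suits scenarios with varying numbers of robots, targets, and obstacles.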
Overall structure of MECA-DRL-RG
Experiment
The research team conducted both simulation experiments and a real-world experiment. In the real-world scenario, six robots encircled two moving targets in an environment with two obstacles. A NOKOV motion capture system recorded the robots' position and velocity data.
Snapshots of 6 robots encircling 2 targets in an environment with 2 obstacles
Both the simulations and the real-world experiment showed that, compared with other methods, MECA-DRL-RG enables robots to learn heterogeneous spatial relationship graphs from their surroundings and to predict target trajectories. This improves each robot's understanding and prediction of its environment and verifies the effectiveness of the method.
Moreover, MECA-DRL-RG continued to perform well as the number of robots, obstacles, or targets increased and as the targets moved faster, indicating broad adaptability.
Training curves of MECA-DRL-RG