Formation Control of Multi-agent Nonlinear Systems Using the State-Dependent Riccati Equation
Published online: 31 Mar 2025
Pages: 17 - 32
Received: 10 Jan 2024
Accepted: 23 Jul 2024
DOI: https://doi.org/10.14313/jamris-2025-003
© 2025 Saeed Rafee Nekoo, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
This paper presents research on controlling a highly populated multi-agent system within the framework of the state-dependent Riccati equation (SDRE), a nonlinear optimal closed-loop control policy. A multi-agent system consists of a leader and a population of followers, playing the role of agents. The agents in a swarm system show collective behavior, interacting to achieve a common goal [1]. Previous reports on swarm robotics have been quite diverse in subject matter: technological aspects of swarm robotics [2,3]; arrangement, formation, and behavior [4–6]; and bio-inspiration [7,8], among others.
The application of swarm robotics has been reported in many works, including micro-robotics and the interaction and physical contact between agents [9–11], nano-swarms for delivering medicine or performing operations in the human body [12], search and rescue [13–15], and monitoring and data collection [16,17].
Multiple decision-making agents, with autonomous operation, social interaction, reactive perception, and pro-active behavior, were motivated by inherently distributed complex systems to increase robustness. Multi-agent and swarm systems are similar in the sense that both require multiple agents that communicate and cooperate. Each agent in a multi-agent system can perform some meaningful part of a task; in a swarm system, however, each agent may be largely unaware of its environment and of what is going on around it. This emphasizes the existence of a large number of agents and promotes scalability, soft collisions, and physical contact within the swarm during migration. This might not be the case for swarms in nature, e.g., flocks of birds, which do not collide and also do not receive commands from a leader [18]. In the current work, the modeling is not based on bio-inspiration, since we are far from such sophisticated designs found in nature.
Multi-agent systems, or cooperative robots, can usually be fit into one of two categories: systems that consider complex dynamics for each agent, though the number of agents is small [19–24], and systems with simple dynamics or only kinematics for each agent when the number of agents is large [25–32]. For example, Yang et al. presented a formation control strategy based on a stable control law to guide the agents towards a constraint region and simulated it for a total of 155 agents [32]. Collision avoidance and obstacle avoidance between the agents are also two important characteristics of formation methods. Wang and Xin presented an optimal control approach for formation control taking obstacle avoidance into consideration for multiple unmanned aerial vehicle (UAV) systems [33]. Kumar et al. presented a velocity controller for a swarm of UAVs with obstacle avoidance capability [34]. Park and Yoo proposed a connectivity-based formation control with collision-avoidance properties for a swarm of unmanned surface vessels [35,36].
The application of the swarm regulation control is devoted to multi-copter UAV formation control for a large number of agents (1,089) to show the capabilities of the proposed approach. Swarm control using augmented methods such as graph theory and multilayer network design is a useful approach; however, in this work, the pure capacity of the SDRE is used to control a highly populated multi-agent system. This simplifies the design and limits the formulation to pure SDRE. The advantage of this point of view is the ability to control large-scale systems with complex dynamics. A fully coupled six-degree-of-freedom (DoF) nonlinear model of the system is considered (12 state variables). The second case study is a leader-follower system of 45 wheeled mobile robots (WMRs) with the non-holonomic and holonomic constraints of the wheels. A smaller number of robots was considered to highlight the effectiveness of obstacle and collision avoidance. SDRE has been used in both multi-agent systems [37–39] and leader-follower control [22,40–47]. The numbers of agents in works using SDRE are reported in Table 1. The majority of these studies focused on the control and behavior of the leader-follower system, first considering two agents and then sometimes increasing the number of agents to 5 or 10. The only highly populated system using SDRE control included 1,024 satellites; obstacle avoidance and collision avoidance between agents were not considered [39]. The large distance between the agents and the pattern of the satellites did not necessitate collision/obstacle avoidance. In this work, a more mature version of multi-agent SDRE [39] is presented to add those characteristics, since the agents work close together in both solved examples.
Table 1. A detailed report on the number of agents controlled using the SDRE in the previous literature, compared with this work.
Context | Ref. | No. of agents, type of agent |
---|---|---|
multi-agent | [37] | 5, single-integrator
consensus control | [22] | 10, crane
cooperative | [48] | 2, manipulator
multi-agent | this work | 1,089, quadrotor UAV; 1,050, WMR
The swarm formulation in control methods could be solved by graph theory for the formation design of the whole swarm; in that case, another controller must control the agents individually, checking the connections between the agents through multiple layers [53–55]. Designing the formation using graph theory and adapting a controller to handle the multi-agent system might pose difficulties in implementation, because the controller must include other layers of design through external methods.
The reported numbers of agents were 16 satellites [53], 6 agents [54], and 90 agents [55]. The use of the Kronecker delta function for the selection of information from a particular agent restricts the swarm multi-agent formulation to a small number of agents. In the SDRE method in particular, the solution to the Riccati equation might become overly complex if the SDRE is solved for the entire swarm. In this approach, the gain of the targeted agent would be collected from the overall matrix of the swarm. Again, this approach would limit the implementation of the formulation for a large number of agents. The reported numbers of SDRE-controlled agents, using consensus control and graph theory, were 10 [22] and 2 [38], respectively.
The focus of this work is large-scale multi-agent systems that need simplicity in their design. This simplicity does not imply a simple linear controller; on the contrary, the SDRE is a nonlinear sub-optimal controller with a relatively complex solution. However, the practical approach to handling the multi-agent system relies solely on the SDRE's own capabilities and uses the weighting matrix of the states to handle distance constraints and obstacle avoidance. As a result, there is no additional implementation complexity raised by the interaction of the SDRE with another layer or with graph theory. This can be viewed as independent SDRE controllers working with neighboring agents and their leader through a weighting matrix, which remains under the umbrella of the SDRE rather than another system augmented onto the multi-agent controller or other complementary methods. This is the key to large-scale modeling and simulation: simplicity in the communication of the multi-agent system and reliance on the capacities of the SDRE itself.
The main contribution of this work is the presentation of a control structure for the formation of a multi-agent system with complex dynamics, including obstacle and collision avoidance between the agents. Complex systems with a large number of agents are seldom reported; this work presents a suitable structure for them. This work adds collision and obstacle avoidance to SDRE multi-agent system control, motivated by the example of [39], which reported multi-agent SDRE control without contact prevention. For very crowded systems, contact between the agents has been both expected and considered in the literature [30,32], but this work proposes a method that enforces distance between the agents to reduce collisions during the control task.
The distance-based arrangement of agents, and the avoidance of collisions between them, is essentially the idea behind multi-agent control methods that use graph theory, the potential field method, multi-layer designs, and the like; here, the distribution as well as obstacle and collision avoidance are structured within the SDRE formulation itself, specifically through the weighting matrix of states.
Section 2 presents the control structure of the SDRE and the multi-agent system modeling. Section 3 describes the dynamics of the two case studies of this work, the UAV and the WMR. The simulation results are presented in Section 4, and the concluding remarks are summarized in Section 5.
Consider a nonlinear system in state-space form:
$$\dot{\mathbf{x}}(t) = \mathbf{f}(\mathbf{x}) + \mathbf{B}(\mathbf{x})\mathbf{u}(t), \qquad (1)$$
where $\mathbf{x}(t)\in\mathbb{R}^{n}$ is the state vector and $\mathbf{u}(t)\in\mathbb{R}^{m}$ is the input vector. Writing the drift term in state-dependent coefficient (SDC) form, $\mathbf{f}(\mathbf{x})=\mathbf{A}(\mathbf{x})\mathbf{x}$, gives
$$\dot{\mathbf{x}}(t) = \mathbf{A}(\mathbf{x})\mathbf{x}(t) + \mathbf{B}(\mathbf{x})\mathbf{u}(t). \qquad (2)$$
The SDC parameterization (2) is referred to as apparent linearization; however, it preserves the nonlinearity of the system. For example, consider an over-actuated single-degree-of-freedom system in state-space form with
The result of multiplication of
The control objective is to minimize the cost function:
$$J = \frac{1}{2}\int_{0}^{\infty}\left(\mathbf{x}^{\top}\mathbf{Q}(\mathbf{x})\mathbf{x} + \mathbf{u}^{\top}\mathbf{R}(\mathbf{x})\mathbf{u}\right)\mathrm{d}t,$$
where $\mathbf{Q}(\mathbf{x})\geq 0$ and $\mathbf{R}(\mathbf{x})>0$ are the state-dependent weighting matrices of the states and inputs, respectively.
The control law of the SDRE is [56]:
$$\mathbf{u}(t) = -\mathbf{R}^{-1}(\mathbf{x})\mathbf{B}^{\top}(\mathbf{x})\mathbf{K}(\mathbf{x})\mathbf{x}(t), \qquad (4)$$
where $\mathbf{K}(\mathbf{x})$ is the symmetric positive-definite solution of the state-dependent Riccati equation
$$\mathbf{K}(\mathbf{x})\mathbf{A}(\mathbf{x}) + \mathbf{A}^{\top}(\mathbf{x})\mathbf{K}(\mathbf{x}) - \mathbf{K}(\mathbf{x})\mathbf{B}(\mathbf{x})\mathbf{R}^{-1}(\mathbf{x})\mathbf{B}^{\top}(\mathbf{x})\mathbf{K}(\mathbf{x}) + \mathbf{Q}(\mathbf{x}) = \mathbf{0}.$$
Without loss of generality, the input law (4) is transformed into an error-based form for tracking a desired state $\mathbf{x}_{\mathrm{des}}(t)$:
$$\mathbf{u}(t) = -\mathbf{R}^{-1}(\mathbf{x})\mathbf{B}^{\top}(\mathbf{x})\mathbf{K}(\mathbf{x})\,\mathbf{e}(t), \qquad \mathbf{e}(t) = \mathbf{x}(t) - \mathbf{x}_{\mathrm{des}}(t).$$
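For implementation, the gain is computed point-wise: at each control step, the Riccati equation is solved for the frozen SDC pair evaluated at the current state. A minimal Python sketch of this point-wise solution is given below; the example SDC matrix, gains, and function names are illustrative assumptions, not the formulation of this paper.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def sdre_input(A, B, Q, R, e):
    """Point-wise SDRE input: solve the algebraic Riccati equation for the
    frozen SDC pair (A(x), B(x)) and apply the error-based feedback law."""
    K = solve_continuous_are(A, B, Q, R)      # K(x): stabilizing solution
    return -np.linalg.solve(R, B.T @ K @ e)   # u = -R^{-1} B^T K e

# Illustrative single-DoF example (assumed system, not from the paper):
# x1 = position, x2 = velocity, with a nonlinear stiffness term.
def A_of_x(x):
    return np.array([[0.0, 1.0],
                     [-1.0 - x[0]**2, -0.5]])  # one possible SDC choice

B = np.array([[0.0], [1.0]])
Q = np.diag([10.0, 1.0])
R = np.array([[0.1]])

x = np.array([1.0, 0.0])
x_des = np.zeros(2)
u = sdre_input(A_of_x(x), B, Q, R, x - x_des)
```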
The definition of formation control is introduced by a set of time-varying desired conditions for the agents to follow the leader of the group. The first agent is the leader of the group and tracks the desired trajectory of the formation.

Figure 1. A schematic view of the multi-agent system, moving from an initial, randomly distributed form to a final desired shape with arranged agents, in an environment with an obstacle. The 2D representation is intended to show more detail; the formation scheme is general and could be applied to 3D formations as well.
The rest of the agents follow the leader, keeping the predefined geometric formation during the task and settling exactly into it at the end of the task:
The weighting matrix for the states of the SDRE controller can include nonlinear functions of the state variables in its components. This property provides obstacle avoidance in regulation control based on the artificial potential field method [52,58,59]. The distance between an obstacle and an agent is:
The next step is to define the distance between two consecutive agents for collision avoidance. The distance between the $f$-th and $(f-1)$-th agents is:
The index is assigned to each agent randomly; as a result, two consecutive agents might not be neighbors at the initial condition and could even be far away from each other. This initialization might raise issues: the collision avoidance between neighboring agents (8) might not be effective, and collisions might occur between the $f$-th agent and agents other than its consecutive neighbor.
To include the collision avoidance between the f-th agent and all previous members, the following condition is introduced:
Equation (9) is updated at each time step of simulation when all the agents are moving from the initial to the final condition. In cases in which the
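A minimal Python sketch of the per-step bookkeeping implied by condition (9) is given below, assuming the agent positions are stored row-wise; the variable names are illustrative and the exact functional form of (9) is given in the omitted equation.

```python
import numpy as np

def min_distance_to_previous(positions, f):
    """Distance from agent f to the nearest agent with a smaller index,
    evaluated at the current time step (cf. the update of condition (9))."""
    diffs = positions[:f] - positions[f]          # previous agents only
    return np.min(np.linalg.norm(diffs, axis=1))

# positions: (N, 3) array of current agent positions, updated every step.
positions = np.random.uniform(0.0, 10.0, size=(45, 3))
d_min = min_distance_to_previous(positions, f=10)
```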
The state-dependent weighting matrices of the SDRE are one of the advantages of this method over the linear quadratic regulator (LQR). The weighting matrices of the LQR are constant and cannot include trajectory-dependent parameters, such as the distance terms introduced in Section 2.3, for obstacle avoidance, collision avoidance, or adaptive designs. The weighting matrices for the leader and the followers are defined differently in order to obtain different motion speeds: the leader must be slower and the followers faster, so that the followers track the leader properly. The following definition of the weighting matrix provides this condition.
The weighting matrix of states for the leader only includes obstacle avoidance:
The weighting matrix of states for the followers includes the obstacle-avoidance term as well as the collision-avoidance terms between the agents:
In this section, three different distance terms were introduced: the distance between an agent and the obstacle, the distance between two consecutive agents (8), and the distance between the $f$-th agent and all previous agents (9).
Each distance term plays its own role in the migration of the multi-agent system.
The definition of the rest of the components of the weighting matrix of states, as well as the inputs, could be done based on conventional tuning methods and rules [60].
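For illustration only, the Python sketch below shows one plausible way the three distance terms can enter a diagonal state-dependent weighting matrix for a follower. The gains, the functional forms, and the 12-state layout are assumptions, not the exact expressions used in this work.

```python
import numpy as np

def follower_Q(pos, obstacle, positions, f, q0=10.0, k_obs=5.0, k_col=5.0, eps=0.3):
    """Assumed potential-field-style weighting: the entries acting on the
    position states are written as nonlinear functions of the distance terms,
    growing as the agent approaches the obstacle or a lower-indexed agent."""
    d_obs = np.linalg.norm(pos - obstacle)                       # obstacle distance term
    d_col = np.min(np.linalg.norm(positions[:f] - pos, axis=1))  # collision distance term
    q_pos = q0 + k_obs / (d_obs + eps) + k_col / (d_col + eps)   # one assumed shaping function
    Q = q0 * np.eye(12)                                          # 12 states per quadrotor
    Q[:3, :3] = q_pos * np.eye(3)                                # position block
    return Q
```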
The dynamic equation of one quadrotor UAV is presented in this section. The schematic view of the system and the configuration of the rotors are illustrated in Fig. 2. The agents are identical and share the same dynamics. The generalized coordinates of one system are the position of the center of mass (CoM) and the Euler angles, $\mathbf{q}(t) = [x, y, z, \phi, \theta, \psi]^{\top}$.

Figure 2. The schematic view and axis definition of a quadrotor.
The kinematics relations
Therefore, the derivative of state-vector (12) generates
Assuming small changes in the Euler angles during flight control, the derivative of the kinematics equation (13):
shows that in a hovering state,
The quadrotor UAV is underactuated, and the control structure uses a cascade design, which controls the translational part of the dynamics (14) in the first layer and the orientation in the second layer. The translational dynamics includes the states $x$, $y$, $z$ and their time derivatives.
The component
In order to implement the cascade design and SDRE control, in the first step, the SDRE controller must be operated with the assumption of full actuation condition, which demands a
Assigning weighting matrices with proper dimensions and solving the SDRE for the translation part:
Note that gravity is compensated for in (16), and for this reason, the SDC matrix
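The mapping from the translational virtual input of the first SDRE layer to the total thrust and the desired roll and pitch of the second layer can be written in closed form. The sketch below uses a common ZYX Euler-angle convention; the signs may differ from the axis definition of Fig. 2, so it should be read as an assumed reproduction of the cascade step, not the paper's exact relations.

```python
import numpy as np

def attitude_from_virtual_input(u, psi, m, g=9.81):
    """Map the translational virtual accelerations (u_x, u_y, u_z) from the
    first SDRE layer to total thrust and desired roll/pitch for the second
    layer. Signs follow a common ZYX Euler convention (an assumption)."""
    ux, uy, uz = u
    F = m * np.sqrt(ux**2 + uy**2 + (uz + g)**2)   # total thrust, gravity compensated
    phi_des = np.arcsin(m * (ux * np.sin(psi) - uy * np.cos(psi)) / F)
    theta_des = np.arctan2(ux * np.cos(psi) + uy * np.sin(psi), uz + g)
    return F, phi_des, theta_des
```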
The SDRE for the orientation control is:
The input thrust and torque vector to the system is generated by four rotating propellers:
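As an illustration, a plus-configuration mixer can be inverted and saturated as sketched below; the layout and signs are assumptions, and the paper's exact matrix is equation (18), following the rotor configuration of Fig. 2.

```python
import numpy as np

def rotor_speeds(F, tau, k_f, k_m, l, w_min, w_max):
    """Invert an assumed plus-configuration mixer to obtain rotor angular
    velocities, saturated at the lower and upper bounds as in the simulations."""
    # Rows: total thrust, roll torque, pitch torque, yaw torque.
    M = np.array([[ k_f,    k_f,    k_f,    k_f  ],
                  [ 0.0,   -l*k_f,  0.0,    l*k_f],
                  [-l*k_f,  0.0,    l*k_f,  0.0  ],
                  [ k_m,   -k_m,    k_m,   -k_m  ]])
    w_sq = np.linalg.solve(M, np.hstack(([F], tau)))   # squared rotor speeds
    w_sq = np.clip(w_sq, w_min**2, w_max**2)           # enforce actuator limits
    return np.sqrt(w_sq)
```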
Consider

Figure 3. The schematic view and axis definition of a differential-wheel mobile robot.
The first auxiliary relation is found by combining the constraints on the left and right wheels,
The non-holonomic constraint, derived from the rolling condition of the wheels, is obtained [67]:
Equation (19) is the holonomic constraint of the system and equations
The dynamics equation of the mobile robot takes the common form of a second-order differential equation:
The angular velocities of the wheels are arranged in a vector
The state-vector of the system is chosen as
In the control scheme of WMR, the controlled output of the system is a point
Substituting (27) into
By defining a new state-vector for the output dynamics (29),
The output- and state-dependent coefficient parameterization of system (30) is expressed as [57]:
The necessary conditions of the output, controllability, and observability of the OSDRE, and condition on
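For reference, the standard differential-drive kinematics and the look-ahead output point (the look-ahead control parameter appears again in Section 4) can be sketched as below; the symbols r, b, and a, denoting the wheel radius, half the axle length, and the look-ahead distance, are assumed names.

```python
import numpy as np

def wmr_kinematics(state, w_right, w_left, r, b):
    """Standard differential-drive kinematics under the rolling (non-holonomic)
    constraint: returns (x_dot, y_dot, theta_dot) of the platform."""
    x, y, theta = state
    v = r * (w_right + w_left) / 2.0            # forward speed
    omega = r * (w_right - w_left) / (2.0 * b)  # yaw rate
    return np.array([v * np.cos(theta), v * np.sin(theta), omega])

def lookahead_output(state, a):
    """Controlled output: a point at distance a ahead of the axle centre,
    which removes the non-holonomic singularity in the output dynamics."""
    x, y, theta = state
    return np.array([x + a * np.cos(theta), y + a * np.sin(theta)])
```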
A multi-agent system of 1,089 quadrotor UAVs is considered in this case study.
The leader is the first agent of the system and guides the multi-agent system in point-to-point regulation from an arbitrary start to an endpoint. The 1,089 agents are initially scattered randomly inside a 10 × 10(m) square at the height of 10(m). The initial orientation angle and velocities of the translation and orientation states of the agents are zero. The three components of the pseudocode (
The position of the obstacle is (
The physical characteristics of the agents are identical, as follows. The distance between a motor and the CoM of the quadrotor is
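A Python sketch of the initialization described above is given below; the 33 × 33 grid with 1 m spacing used for the desired formation is an assumption for illustration, since the exact final pattern is shown only in Fig. 4.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1089                                           # 33 x 33 agents

# Random initial positions inside a 10 x 10 m square at 10 m altitude,
# with zero initial orientation and zero initial velocities.
p0 = np.column_stack((rng.uniform(0.0, 10.0, N),
                      rng.uniform(0.0, 10.0, N),
                      np.full(N, 10.0)))
x0 = np.hstack((p0, np.zeros((N, 9))))             # 12 states per quadrotor

# Assumed desired formation: a 33 x 33 grid with 1 m spacing (offsets from the leader).
gx, gy = np.meshgrid(np.arange(33), np.arange(33))
formation_offsets = np.column_stack((gx.ravel(), gy.ravel(), np.zeros(N))).astype(float)
```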
The motion of the multi-agent system is represented in Fig. 4. Because the agents are very close to each other at the beginning, and the collision avoidance scheme in tuning increases the

Figure 4. The trajectories of the multi-agent system of 1,089 quadrotor UAVs.

Figure 5. The error convergence of the multi-agent system of 1,089 quadrotor UAVs.
The weighting matrices of the leader are selected as:
The weighting matrices of the followers are:
Obstacle avoidance by embedding an artificial potential field into the SDRE has been reported and analyzed in the literature. It has been stated that this method does not guarantee absolute obstacle avoidance, though it reduces the chance of impact significantly [52]. Figure 6 shows that the minimum distance between some agents and the obstacle was 30(cm). The collision avoidance between the agents is reported in Fig. 7. A group of drones in Fig. 7 approached to within almost 30(cm) of each other; those are the agents at the beginning of rows, whose preceding agents are located at the end of the previous rows. For the rest of the agents, the collision avoidance term,

Figure 6. The distance between the drones and the obstacle for the multi-agent system of 1,089 quadrotor UAVs.

Figure 7. The collision distance between the drones for the multi-agent system of 1,089 quadrotor UAVs.
Checking the collisions between two consecutive agents does not necessarily reveal all possible collisions. Collisions involving other, non-consecutive agents might also occur; these are examined in Fig. 8.

The
Contrary to
The simulation of 1,089 agents presented in Fig. 4 might not reveal the complexity of the multi-agent system dynamics; therefore, the simulation data for the 5th agent are presented. The position and velocity states of the 5th drone are shown in Figs. 9 and 10, respectively. The mixer matrix equation (18) provides the input angular velocities of the rotors, limited to their lower and upper bounds, as shown in Fig. 11.

Figure 9. The position and orientation states for the 5th quadrotor UAV.

Figure 10. The linear and angular velocity states for the 5th quadrotor UAV of the multi-agent system.

Figure 11. The angular velocities of the rotors for the 5th quadrotor UAV.
A group of 45 WMRs in a leader-follower formation is considered in the second case study.
The position of the obstacle is set as (
The physical characteristics of the mobile robots were chosen as follows: the radius of the wheels is denoted by
The weighting matrices of the leader are set:
The weighting matrices of the followers are selected as:
The trajectories of the leader and follower agents are illustrated in Fig. 12. The leader and the followers avoided the obstacle, as represented in the trajectories. The error convergence of the multi-agent system is presented in Fig. 13. The obstacle and collision avoidance terms are plotted in Figs. 14 and 15, respectively. The input signals and state variables of one agent are presented in Figs. 16 and 17, respectively. The errors of the leader and of the 5th follower are 122.6741(mm) and 411.7172(mm), respectively. It is important to note that the error is a function of the look-ahead control parameter, which was set

Figure 12. The trajectories of the leader and 45 follower agents.

Figure 13. The errors of the leader and 45 follower agents.

Figure 14. The obstacle avoidance performance of the leader/follower system of mobile robots.

Figure 15. The collision avoidance performance of the leader/follower system of mobile robots.

Figure 16. The input signals of the wheels for one of the agents.

Figure 17. The state information for one of the agents.

Figure 18. The migration of 1,050 mobile robots in leader-follower formation.
Table 2. The errors of the leader and the 5th follower under different values of the look-ahead control parameter.
look-ahead parameter | error leader (mm) | error 5th follower (mm) |
---|---|---|
20 | 144.3912 | 911.3158
50 | 76.4762 | 345.8130
80 | 104.3379 | 388.0787
100 | 124.3148 | 427.7242
200 | 224.7471 | 630.6206
300 | 324.6205 | 827.9986
400 | 424.4897 | 1025.9861
500 | 524.4070 | 1223.8401
There are some general rules for the selection of weighting matrices of the SDRE.
Hence, the diagonal components of
This work presents an SDRE control design (without augmentation by other techniques) for the formation regulation of multi-agent robotic systems that takes into account the complex dynamics of the agents. The command to the multi-agent system is given to the leader, and the followers pursue the leader in accordance with the predefined pattern of the multi-agent system. Obstacle and collision avoidance among the agents were designed through the modification of the weighting matrices of states and the artificial potential field method. The application of the proposed method was devoted to the control of unmanned multi-copter systems and differential wheeled mobile robots. A highly populated multi-agent system of 1,089 agents was simulated for the UAV case, and further simulations were presented for the WMRs with 45 and then 1,050 agents; all were successfully regulated in the control task, preserving the expected formation shapes.