With various types of indoor mobile robots being widely used in human life and production, robot Simultaneous Localization and Mapping (SLAM) technology, as the basis for robots to complete service work, has gradually become a research hotspot [1]. Affected by the diversity and complexity of the indoor environment, the information that can be obtained by the simultaneous localization and mapping of the common single radar is limited, so it cannot construct complete three-dimensional map information [2]. Therefore, multi-sensor fusion has become a new method to solve the defects of indoor robot SLAM. At present, the mainstream sensors applied to solve SLAM problem are lidar and vision camera, each of which has different advantages [3]. Among them, two-dimensional lidar has the advantages of small error and high reliability, but it has no vertical direction information and is easy to cause waste of resources. The depth camera has the advantages of a wide field of view and can identify specific objects, but the accuracy and stability of mapping are poor, and the probability of deviation is high [4-5]. At the moment, many scholars have proposed many fusions SLAM schemes based on the complementarity of perception sensors.
In terms of the fusion of vision and pose data Measurement sensors, literature [6] uses the coupling method to fuse the camera with the Inertial Measurement Unit (IMU) and the odometer, so as to obtain more accurate pose data. However, because the use is not tightly coupled, the vision sensor is prone to photosensitivity and instability. Literature [7] transfers the data collected by the IMU as the initial pose to the vision sensor, which reduces the data pressure of the depth camera in the mapping to a certain extent, but it is prone to data disorder. If the data in the IMU unit is wrong, it will lead to problems in the vision part. In terms of wheeled odometer and laser sensor, literature [8] uses the Extended Kalman Filter (EKF) algorithm to allow the odometer to provide pose while fusing the point cloud data of lidar to make the obtained data more accurate and reliable. In literature [9], key frame data are identified by lidar and weighted and fused with odometer data by Iterative Closest Point (ICP) algorithm, so as to obtain more accurate pose information. However, these two algorithms will be too traditional due to the implementation method. It is easy to have problems such as low efficiency and poor data fusion effect. In the fusion of laser and attitude measurement sensor. Literature [10] tightly coupled the lidar and IMU fusion, which provided a higher frequency of real-time pose data for the robot, but there was no way to eliminate the invalid data resulting in data redundancy. Literature [11] tries to introduce odometer data to assist IMU for fusion, but there are still some errors. In the fusion of vision and laser sensor, literature [12] fuses laser and camera to realize loop closure detection and complete global map construction, but the stability is not good. Literature [13] uses the fusion of vision and laser based on static scenes to improve the accuracy of mapping, but it cannot achieve dynamic real-time performance. In literature [14], the overall fusion data of odometry, IMU, visual camera and lidar were analyzed, and the feasibility of multi-sensor fusion SLAM was verified, but there were problems such as accuracy decline and partial map missing.
Based on the above literature, it can be seen that SLAM with multi-sensor fusion can provide more accurate pose data, and can improve the accuracy of mapping and navigation. However, there are also many urgent problems to be solved, such as odometer slippage and map fusion defects. In this paper, a new fusion method is proposed, which uses two-dimensional lidar, depth camera, wheeled odometer and IMU unit to carry out multi-sensor fusion, and uses the improved Point- to-line ICP (PL-ICP) algorithm to intercept key frames. To make up for the shortcomings of the common ICP algorithm in data interception and fusion. EFK algorithm is directly applied to IMU to ensure the simplification and effectiveness of data acquisition. Finally, the Bayesian method is used to fuse the data of vision and laser sensors, and the fusion two-dimensional raster map is constructed to make up for the shortcomings of single sensor mapping. This paper aims to improve the accuracy of indoor robot pose information, establish a global raster map with more complete information elements and more complete data, and prove the effectiveness of the fusion in the actual environment.
The position and orientation information of the mobile robot can be calculated by wheeled odometry or inertial odometry, but the odometry information obtained by these two methods has large errors. In fact, the laser odometry calculation method uses multiple sub-images to complete the map. In fact, it is impossible to insert all the point clouds identified in each frame into the submap, so the laser odometry only selects the key frame data for insertion, and the unimportant frame data will be discarded. The data of the traditional wheel odometer is used as the initial iteration value for the laser odometer to solve the robot’s position and orientation information.
In the laser inter-frame matching solution, the Iterative Closest Point (ICP) algorithm can obtain a good matching effect by iterating the initial value without point cloud segmentation and feature extraction, so ICP algorithm has become one of the most studied and most mature algorithms [15].
The basic principle of ICP algorithm is as follows:
Let the set of spatial coordinates of the two-point cloud frames of the laser be: Starting frame
The error function is as follows.
Where
For any
When W has full rank:
In summary, each iteration of ICP algorithm will traverse every point in the origin set until it finds the closest point to the target point [16]. Therefore, the basic process of ICP algorithm is as follows: firstly, for the two laser point clouds that need to be matched, the nearest associated point of the point cloud is found. Usually, the initial corresponding point is obtained by using the data of the wheeled mileage meter. Then based on the corresponding points,
In the ICP algorithm in the previous section, in the process of finding corresponding points, the point with the closest Euclidean distance is considered to be the corresponding point of the point cloud. However, in the actual indoor environment, the distance between the laser point and the actual environmental surface is the best error scale, so the standard ICP algorithm will cause a certain number of wrong corresponding points [17]. To solve this problem, many improved versions of ICP algorithm have been derived. The iterative closest point from point to line improves the error equation of the standard ICP algorithm, and uses the distance between the laser point and the line of the nearest two points of the laser point cloud in the next frame to approximate the real scale relationship, which is more suitable for indoor scenes [18].
Therefore, the error equation of PL-ICP algorithm is as follows.
Where
Therefore, the basic process of PL-ICP algorithm is as follows: firstly, the initial rotation matrix is obtained according to the wheel odometer data, and then the current frame laser data is converted to the reference coordinate system. Then for each point of the current laser frame, the two closest points in the reference frame are found and the error is calculated. The error equation is constructed by removing the point with too large error. Finally,
PL-ICP is an algorithm [18] with high matching accuracy and robustness. Compared with ICP algorithm, PL-ICP algorithm has higher solution accuracy and is more suitable for indoor environment [20]. Therefore, the PL-ICP algorithm will also be used in the laser front-end of this paper to solve the inter-frame matching of the laser. However, it is also relatively more sensitive to the initial value and requires a higher accuracy initial value. Although the wheel odometer has the advantages of high frequency and low negative environmental impact, it also has the problems of low test accuracy and the wheel is prone to deformation and slip, which leads to large errors in the test. If only the data of the wheel odometer is used as the initial value of matching, the algorithm is easy to fall into a local cycle due to the large cumulative error of the wheel odometer.
Since the PL-ICP algorithm has higher initial value requirements than the ICP algorithm, and the wheel odometer will cause more and more cumulative errors with the movement of the robot and tire slip, if only the data of the wheel odometer is used as the initial value of the PL-ICP algorithm iteration, the PL-ICP algorithm may fall into a local loop. Although IMU has the problem of integral pose divergence, its instantaneous pose is accurate. Therefore, extended Kalman filter can be used to fuse wheel odometry and IMU to improve the accuracy of pose, and the fused pose data can be used as the initial value of PL-ICP algorithm to avoid falling into local circulation [21].
The extended Kalman filter realizes the linearization of the nonlinear function by performing the first-order Taylor expansion of the nonlinear function. The basic process is as follows:
The state equation is as follows.
The observation equation is:
Where
Because of
The observation equation is
Because
So the prediction stage of the extended Kalman filter is as follows:
The prior state estimate is:
The covariance matrix of the error is:
correction stage is as follows:
Kalman gain:
Posterior state estimation results:
Update the error covariance matrix:
Therefore, the fusion process of wheel odometer and IMU using extended Kalman filter is as follows: When the observation equation of the first sensor is updated, the state quantity and the covariance matrix of the system are obtained as the system predicted state quantity and the system predicted covariance matrix of the second sensor correction process, and then the updated system state estimation result and the covariance matrix are used as the fused output. And they are used in the prediction process for iteration at the next moment [22]. In order to ensure that the subsequent fusion experiment in the real environment can be carried out smoothly, the algorithm is first tested in the simulation software to verify whether the algorithm can correctly subscribe the information of IMU and odometer node and output the fused odometer data. Fig. 1 shows the simulation fusion calculation diagram of the algorithm under ROS:
Simulation fusion calculation diagram
As can be seen from the arrow pointing in the Fig. 1, the fusion node robot_pose_ekf subscribing the odom topic and imu_data topic of the chassis respectively, and then publishing the fusion result as odom_combined topic and the tf transformation of the robot. Finally, the odom_ekf node converts the odom_combined format and publishes it as odom_ekf topic. The result of the final fusion is shown in Fig. 2:
Extended Kalman filter fusion results
As can be seen from the results in Fig. 2, subscription is available in rviz odometry after successful fusion, where the white arrow represents the fused odometry information, and the red arrow represents the original chassis odometry information. In the simulation software, the movement of the simulation robot can be controlled by the keyboard control node, and the fusion node can output the fused odometry data in real time. The output test of odom_ekf with fused odometry data is shown in FIG. 3:
Integrating odometer data
The above experiments are simulation tests of the fusion algorithm. In order to further verify the effectiveness of the extended Kalman filter for the fusion of IMU and wheeled odometer data, the positioning accuracy of the robot is tested in the real scene as shown in Fig. 4. The starting position in the figure is the position where the robot is turned on and the origin of the world coordinate system. The tile size of the test scene is 65cm×65cm. For the convenience of recording, let the robot move from the starting position in the following figure to the end position in the figure along the red rectangle trajectory.
Experimental scenario
In order to facilitate the comparison of odometer data, the remaining three vertices in the rectangular trajectory except the starting point were selected as the target points during the test, and the odometer data was reserved for three decimal places, and the unit was m. The final test results are shown in Table 1 below:
Experimental Result
Starting point | Target point | Wheel odometry | Error | Fusion odometry | Error |
(0,0) | (1.3,0) | (1.347,0.035) | (0.047,0.035) | (1.324,0.029) | (0.024, 0.029) |
(1.3,-3.25) | (1.391,-3.373) | (0.091,0.123) | (1.386,-3.322) | (0.086, 0.072) | |
(0,-3.25) | (0.152,-3.387) | (0.152,0.137) | (0.085,-3.316) | (0.085, 0.066) | |
Mean error | (0.097,0.098) | (0.065, 0.056) |
It can be seen from the test data in the table above that in the rectangular area trajectory of 1.3m×3.25m, the positioning accuracy of the robot is improved by at least 33% compared with the positioning accuracy of the wheeled odometer by fusing the data of the wheeled odometer and IMU. Since the PL-ICP algorithm requires higher initial values than ICP, poor initial values may cause the iterative solution process to fall into a local cycle, so this paper uses the extended Kalman fusion wheel odometer and IMU data, and takes the relatively more accurate data after fusion as the initial value of the PL-ICP algorithm. To a certain extent, the problem that PL-ICP may fall into a local cycle is avoided.
Because the simple two-dimensional laser SLAM mapping can only scan the information of installation height and lack the information of vertical direction, there may be missing information when scanning and mapping the desk, stool and other items in the indoor environment,such as: When the height of the desktop is higher than the installation height of the lidar, the lidar can only scan the information of the table leg, and there is no obstacle in the middle of the table leg. If the actual height of the robot is higher than the height of the table, the subsequent navigation, because the laser radar can not scan the table, it is determined that there is no obstacle to pass through the middle of the table leg, this situation may lead to damage to the robot. At present, the visual SLAM algorithms based on feature points and direct methods are established in a static environment for measurement, but there are many variables in the real scene, which lead to a serious decline in the positioning accuracy and robustness of the SLAM system, and even lead to the failure of mapping. And pure visual SLAM, due to the limitation of the sensor itself, has low accuracy, is greatly affected by ambient light, and the error of the lack of texture environment is very large. The traditional visual SLAM algorithm does not have the ability to perceive the target in real time, so it is not suitable for simple visual mapping for navigation. However, visual SLAM can scan the vertical information, so the local grid map established by vision and the local grid map established by laser can be fused. On the one hand, it can make up for the lack of vertical information in laser mapping, and on the other hand, it can also improve the low accuracy of visual mapping. The schematic diagram of the scanning range of laser and vision is shown in Fig. 5:
Laser and visual scanning range
The red dotted line in the figure above represents the scanning range of the two-dimensional laser sensor installation position, which can only scan objects in the installation height plane, and the purple dotted line of the cone represents the scanning range of the vision sensor. The Xtion Pro live depth camera used in this paper has a vertical field of view of 45° and a horizontal field of view of 58°.
Since the final established is a two-dimensional raster map, and the depth point cloud information of the depth camera is three-dimensional, it is necessary to project the three-dimensional depth information into two-dimensional pseudo-laser data, and then construct a two-dimensional raster map. Therefore, the final fusion process of the local raster map established by laser and vision is as follows: after the point cloud information is obtained by two-dimensional lidar, the obstacle information obtained by laser is transformed into the raster map coordinate system according to the solved robot pose, and the local two-dimensional raster map of the environment obstacles is formed. At the same time, the data of the depth camera is projected into a pseudo-two-dimensional laser data, which is also transformed into a raster map coordinate system to form a local two-dimensional raster map. Then, the local 2D raster maps of the two are fused to supplement the vertical environmental obstacle information that is not obtained by the 2D laser radar, and a global raster map is formed.
For the fusion of local raster maps, according to the basic principle of raster maps, the Bayesian method is continued to be used for the fusion of raster maps, and the fusion formula [23]:
In the above equation,
2D excitation Optical radar | Depth camera | ||
Occupy | empty | Uncertain | |
Occupied | Occupy | Occupy | Occupy |
empty | Occupy | empty | empty |
Uncertain | Occupy | empty | Uncertain |
If the obtained probability of grid occupation
In the equation,
The robot used in this paper is Handsfree, so firstly, Gazebo 3D simulation software is used to create a simulation robot model with the same sensors as Handsfree platform, as shown in Fig. 6 after the creation is completed:
Robot model
The above figure is the overall model of the simulated robot, in which the red boxes are the simulated Xtion Pro Live depth camera and RPLidar A1 lidar model respectively. At the same time, in order to ensure better follow-up experiments, the same sensor parameters of each sensor of the simulated robot and the real robot are set respectively. Then, a simulation experiment environment is established for the experimental effects that need to be verified by the algorithm in this paper. Obstacles such as four-legged desks, T-shaped tables (the height of the desktop is higher than the installation height of the two-dimensional lidar), trucks (the height of the truck body is also higher than the installation height of the two-dimensional lidar) and solid spheres for convenient comparison are set in the simulation experiment environment. The setting of the truck model is to better show the fusion effect on the final created environment map, and the final created simulation experiment environment is shown in Fig. 7:
Simulation experimental environment
It can be seen from the above figure that the height of the table and the truck body are higher than the installation height of the 2D lidar, so the scanning point of the 2D lidar passes through the table and the truck. If it is a single 2D LIDAR SLAM, it will be considered that there are no obstacles here, which may lead to the collision between the robot and the table and the truck in the subsequent navigation. The effectiveness of the fusion is verified through the simulation experiment of the single two-dimensional laser sensor mapping and the multi-sensor fusion scheme in this paper, and the angular velocity and linear velocity of the robot motion are kept the same during the experiment. The results are shown in Fig. 8:
Comparison of simulation experiments
Through the mapping results of the above figure combined with the simulation experiment environment, it can be seen that the two-dimensional lidar can only scan the object with the installation height, and cannot scan the main body of the truck and the desktop with the installation height higher than the lidar in the simulation environment. For example, the blue range in the simulation environment in Fig. 7 is the scanning point of the lidar. As there is no laser information of the main truck and the desktop in the simulation environment, it will be considered that there are no obstacles here. The final mapping result is shown in Fig. 8 (a), and the complete obstacle map information cannot be established for the table and the truck in the simulation environment. For the fusion mapping, since the depth camera can scan the information of the main body of the truck and the table, and project these point cloud information into a fake two-dimensional laser data for mapping, it is very good to establish the complete information of the table and the truck on the final grid map, as shown in Fig.8 (b) , so as to avoid the risk of collision between the robot and the obstacles in the subsequent navigation.
In order to further verify the effect of fusion, this section will test the algorithm in the real environment of about 70m2 as shown in Fig. 9 below to verify the effectiveness of the algorithm in the actual environment.
Real experimental environment
FIG. 9 (a) shows the overall situation of the experimental environment, and (b) shows the front part of the experimental environment. In (b), there is a row of conference tables on the right, and the height of the gap left under the table top is higher than the lidar installation height of the mobile robot. This is shown in Fig. 10:
Installation height of LiDAR
The red circle in the figure above is the comparison between the height of the lidar and the gap under the conference table. Since the gap height of the conference table is larger than the installation height of the lidar, the information above the conference table will not be built into the final grid map of the environment when the single 2D lidar is used for mapping.
During the experiment, keep the linear speed of the robot at 0.2m/s and the angular speed of the steering at 30° per second. Start at the position shown in Fig. 11, circle around the experimental environment clockwise and return to the origin.
Robot departure position
As a comparison, the Gmapping algorithm based on a single two-dimensional lidar and the fusion algorithm in this paper are used for mapping tests, and the motion speed of the robot is kept unchanged during the experiment, and the results are shown in Fig. 12:
Real environment mapping results
Fig 12 (a) shows the mapping results of Gmapping. It can be seen that part of the information of the conference table is missing in the final raster map, and only the leg information of the conference table is available. In (b), the complete information of the conference table is added to the final raster map, and this information is obtained by fusing the local raster map established by the projection of the depth camera into a pseudo-two-dimensional laser data, so as to verify the effectiveness of the fusion mapping.
In addition, in order to facilitate the recording of the robot’s pose during operation, the timing control is used to release forward and turn commands to the mobile robot chassis to control the robot to reach the specified position. Five points at the same position in the two experiments were selected to analyze the accuracy of the pose, and the results are shown in Table 3 below.
Positioning results
Actual location (m) | Actual pose (°) | Gmapping | Vision + Laser | ||||||
---|---|---|---|---|---|---|---|---|---|
Estimated position (m) | Estimated pose (°) | Root mean square error of position (cm) | Attitude error (°) | Estimated position (m) | Estimated pose (°) | Root Mean square error of position (cm) | Pose Error (°) | ||
(6,0) | 30 | (6.085, 0.079) | 33.149 | 11.604 | 3.149 | (6.059, 0.037) | 31.092 | 5.9 | 1.092 |
(6,-3) | 90 | (6.094,-3.081) | 94.634 | 12.408 | 4.634 | (6.064,-3.051) | 92.043 | 8.184 | 2.043 |
(6,-6) | 90 | (6.103,-6.089) | 95.005 | 13.612 | 5.005 | (6.072,-6.063) | 92.729 | 9.567 | 2.729 |
(0,-6) | 180 | (0.135,-6.112) | 186.024 | 17.541 | 6.024 | (0.089,-6.075) | 184.007 | 11.639 | 4.007 |
(0,0) | 0 | (0.156, 0.127) | 7.678 | 20.116 | 7.678 | (0.091, 0.08) | 4.96 | 12.117 | 4.96 |
According to the results in the above table combined with the mapping results, the Gmapping algorithm based on a single two-dimensional lidar relies heavily on the wheeled odometer, and the cumulative error of the wheeled odometer information increases with the movement of the robot, so when the robot finally moves back to the origin, the established environment map has been misaligned. The root mean square error of the position reaches about 20cm. The fusion algorithm in this paper uses the laser odometer, and uses the extended Kalman filter to fuse the data of the wheel odometer and IMU as the iterative initial value of the laser odometer to obtain a more accurate robot pose, so the overall pose error is significantly smaller than Gmapping, and the root mean square error of the robot’s position is about 12cm when it returns to the origin. The positioning error of the robot is effectively reduced.
In this paper, SLAM based on single 2D LiDAR is analyzed and optimized. Extended Kalman filter is used to fuse wheel odometer and IMU data to provide initial iteration values for the laser interframe matching algorithm PL-ICP, and at the same time, the fusion visual projection is built into a local 2D raster map of pseudo-2D laser data. The results show that compared with the single two-dimensional laser SLAM, the multi-sensor fusion SLAM scheme in the paper can improve the pose accuracy of the robot and build a more complete global map of the environment details.