
Research on Motion Control for a Mobile Robot Using Learning Control Method



Introduction

The control system and mechanical parts of a spherical mobile robot are encapsulated in a round shell, and the robot moves by rolling this outer shell. The hard shell also provides mechanical protection for the electrical equipment and actuation parts. Compared with wheeled and tracked robots, spherical mobile robots have significant advantages such as low friction resistance, high motion flexibility and the capability for omnidirectional movement. It is also important that a spherical robot can maintain or recover stability when a collision occurs. Spherical robots are therefore well suited for use in rugged terrain and harsh environments [1].

Recently, many kinds of spherical mobile robots have been developed, and several practical applications are being studied by research institutes and universities [2,3,4,5,6,7,8]. The robot contacts the ground at a single point through its round shell and is driven by the inner actuation mechanism. Precise position control of this kind of robot is therefore a challenging problem for practical applications.

Based on learning the control process of a human operator, an approach to position control for the spherical mobile robot is presented. The contents and framework of the paper are as follows: In section 2, the mechanical structure of the robot is introduced. The theory and usage of the support vector machine (SVM) for robot control are introduced in section 3. Using the SVM learning control method, the human control strategy is parameterized for position control of the robot in section 4. Experiments are implemented to prove the validity of the learning controller in section 5. The last section draws the conclusions.

Description of the robot

The mobile robot mainly consists of a round shell, the actuation mechanism, an inner case, a flywheel and a telescopic camera boom. The shell is made of 10 mm thick organic glass and its diameter is about 600 mm. The actuation mechanism, motors, battery, controller and sensors are all encapsulated in the shell. The motion and standing modes of the robot are shown in Figures 1 and 2.

Fig. 1

Motion mode of the robot.

Fig. 2

Standing mode of the robot.

The inner actuation mechanism includes three separate motors: a flywheel motor, a long-axis servo motor and a short-axis servo motor. The main axes of the short-axis and long-axis motors are perpendicular. During motion, the forward and backward driving forces of the robot are generated by the long-axis motor, which swings the counterweight directly through the inner case. The angle of the counterweight relative to the axis of the shell is controlled by the short-axis motor to set the yaw angle of the robot. The flywheel motor spins the internal flywheel at a desired high speed to increase the angular momentum of the robot, based on the gyroscopic precession principle.

The mobile robot is powered by a +48 V lithium battery and is self-contained. The control algorithm is programmed and executed on a control board based on an ARM9 S3C2410 processor. The interface and communication between the ARM9 control board and the motor controllers follow the CANopen device and communication profiles. An inertial measurement unit (IMU) is installed on the inner case and provides the roll, yaw and pitch angles and angular velocities of the inner case with respect to the ground. The motion state of the robot, such as rolling velocity, yaw and pose angles, together with the data required for learning control, is provided by the IMU and the motor controllers. The operator commands are transmitted by a wireless data radio.

The theory and usage of SVM
Theory of SVM

SVM can perform effective learning tasks such as binary classification and real-valued function approximation. SVM learning control can be effectively used for mobile robots [9, 10].

The basic concepts of SVM used in this paper follow prior research. Using SVM, the n-dimensional input data x is mapped into a high-dimensional feature space by a nonlinear mapping Φ(x), and the regression function is given by Eq. (1):

$$f(x) = (\omega \cdot \Phi(x)) + b \qquad (1)$$

where ω is the weight vector and b is the threshold. By introducing an ε-insensitive loss function, SVM can be used to solve regression problems. The loss function defines a band around the true outputs; errors inside the band (ε > 0) are ignored, while errors outside the band are measured by the slack variables ξi and ξi*. The parameters ω and b are chosen by solving the optimization problem in Eq. (2):

$$\min_{\omega,\,\xi_i,\,\xi_i^*,\,b} \ \frac{1}{2}(\omega \cdot \omega^T) + C \cdot \frac{1}{l}\sum_{i=1}^{l}(\xi_i + \xi_i^*) \qquad (2)$$

subject to:

$$(\omega \cdot \Phi(x_i) + b) - y_i \le \varepsilon + \xi_i, \qquad y_i - (\omega \cdot \Phi(x_i) + b) \le \varepsilon + \xi_i^*, \qquad \xi_i,\ \xi_i^* \ge 0.$$

In Eq. (2), l is the number of training examples and the constant C is the penalty factor. Using the Lagrange multiplier method, Eq. (2) can be converted into its dual problem, Eq. (3):

$$\max_{\alpha_i,\,\alpha_i^*} \ \sum_{i=1}^{l}\left[\alpha_i^*(y_i - \varepsilon) - \alpha_i(y_i + \varepsilon)\right] - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l}(\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)K(x_i, x_j) \qquad (3)$$

subject to:

$$\sum_{i=1}^{l}(\alpha_i - \alpha_i^*) = 0, \qquad 0 \le \alpha_i,\ \alpha_i^* \le C/l, \qquad i = 1, \ldots, l.$$

The polynomial function is used as the kernel K(x_i, x) = Φ(x_i) · Φ(x) for SVR, which avoids computing the feature-space transformation explicitly. By solving the resulting quadratic programming problem, the solution of Eq. (3) can be obtained in the low-dimensional space in terms of the kernel function. The support values are the positive Lagrange multipliers ᾱ and ᾱ*, and the regression function becomes Eq. (4):

$$f(x) = (\omega \cdot \Phi(x)) + b = \sum_{SV}(\bar{\alpha} - \bar{\alpha}^*)K(x_i, x) + \bar{b} \qquad (4)$$

where

$$\bar{b} = y_i - \sum_{j}(\bar{\alpha} - \bar{\alpha}^*)K(x_i, x_j) - \varepsilon, \qquad \alpha_i \in (0, C/l),$$

$$\bar{b} = y_i - \sum_{j}(\bar{\alpha} - \bar{\alpha}^*)K(x_i, x_j) + \varepsilon, \qquad \alpha_i^* \in (0, C/l).$$
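To make the ε-SVR formulation above concrete, the following is a minimal sketch using scikit-learn's SVR with a polynomial kernel; the library choice, data and parameter values are illustrative assumptions, not the authors' implementation.

import numpy as np
from sklearn.svm import SVR

# Toy 1-D regression data standing in for the (x_i, y_i) training pairs.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))             # n-dimensional inputs (here n = 1)
y = 0.8 * X[:, 0] ** 2 + 0.1 * rng.normal(size=200)   # noisy target values

# Polynomial kernel ((x_i . x) + 1)^2, epsilon-insensitive band, penalty factor C.
svr = SVR(kernel="poly", degree=2, gamma=1.0, coef0=1.0, C=10.0, epsilon=0.01)
svr.fit(X, y)

# The fitted model corresponds to Eq. (4): a kernel expansion over the support vectors.
print("number of support vectors:", len(svr.support_))
print("prediction at x = 0.5:", svr.predict([[0.5]]))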

Usage of SVM

During continuous human operation, the control process is viewed as a discrete-time sampled process and can be described by the following equation:

$$x(t+1) = f_x(x(t), u_h(t)),$$

where x = [x_1, x_2, ..., x_n]^T is the state vector of the robot, u_h(t) is the control input vector and f_x = [f_1, f_2, ..., f_n]^T are unknown nonlinear functions.
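In practice, this sampled process yields one training pair per control period. The sketch below (illustrative Python; the function and variable names are assumptions) shows how a logged human-operated run can be turned into supervised pairs for learning f_x.

import numpy as np

def build_training_pairs(states, inputs):
    """states: (T, n) array of sampled robot states x(t);
    inputs: (T,) array of human commands u_h(t).
    Returns inputs X(t) = [x(t), u_h(t)] and targets x(t+1)."""
    X_t = np.column_stack([states[:-1], inputs[:-1]])   # [x(t), u_h(t)]
    X_next = states[1:]                                  # x(t+1)
    return X_t, X_next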

Using sample data collected during the human control process, the learning controller can be constructed by the support vector regression (SVR) method. Given the current state of the robot and the control input u_svm(t), the controller simulates the human control strategy and predicts the next control input for the robot, as shown in Figure 3.

Fig. 3

The SVM learning control diagram.

The difference equations in Eq. (5) approximately describe Figure 3 as follows:

$$\begin{cases} x(t+1) = f_x(x(t), u_{svm}(t)) \\ u_{svm}(t+1) = f_u(x(t), u_{svm}(t)) \end{cases} \qquad (5)$$

If X = [x^T, u_svm]^T and f = [f_x^T, f_u]^T, then:

$$X(t+1) = f(X(t)) \qquad (6)$$

Furthermore, if X̂ = [x̂^T, u_svm]^T and f̂ = [f̂_x^T, f̂_u]^T, then Eq. (7) is an estimation of Eq. (6):

$$\hat{X}(t+1) = \hat{f}(\hat{X}(t)) \qquad (7)$$
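The estimated model of Eq. (7) can be iterated step by step to predict how the learned controller and the robot state evolve together. The sketch below is a hedged illustration: f_x_hat and f_u_hat stand for the two learned regressors of Eq. (5) and are assumed to be available as callables.

def rollout(f_x_hat, f_u_hat, x0, u0, steps):
    """Propagate the learned state and control models for a number of steps."""
    x, u = x0, u0
    trajectory = [(x, u)]
    for _ in range(steps):
        x_next = f_x_hat(x, u)   # predicted next state, first row of Eq. (5)
        u_next = f_u_hat(x, u)   # predicted next control input, second row of Eq. (5)
        x, u = x_next, u_next
        trajectory.append((x, u))
    return trajectory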

Design of learning controller based on SVM

A human operator can be skilled at controlling the movement of a mobile robot. The learning control method in this paper models the human operator's control process: based on the current states of the robot, the most likely command is selected to represent the human operator's control behaviour [9, 10].

The human control strategy can be treated as a stochastic process and modelled as a mapping between the states of the robot and the operator's input commands. First, the human control output data and the current states of the robot are gathered; second, the SVM learning method is used to model the human control operation and the learned parameters are stored for the task; third, the learning controller is built by offline learning computation; finally, the learning controller is implemented on the central controller of the robot. This method can reproduce the control strategy of the human operator for the robot, as described in Figure 4.

Fig. 4

Diagram for the learning control strategy.

The SVM-based learning control procedure can be divided into three stages: the training sample gathering stage, the SVM offline learning stage and the control strategy realization stage. Each step is summarized in detail as follows (a minimal code sketch of this pipeline is given after the list), and the entire control diagram is described in Figure 5:

The human operator controls the robot to finish the assigned position motion; during this procedure, the human control input and the states of the robot, including the angular velocities of the shell and inner frame, the lean angle and so on, are collected.

The polynomial kernel and an SVM learning machine are chosen to characterize the human control strategy.

Using the SVR method described in section 3 and the state data gathered above, the support vectors are calculated and the learning controller function y(x) is obtained as in Eq. (5).

The control strategy is encoded and implemented on the central controller of the robot. The runtime of the controller is measured; if the runtime is too long, return to step 1 and re-learn.

Finally, experiments verify whether the position control task can be achieved by this learning controller; if not, return to step 1 and re-learn.
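The sketch below summarizes the three stages (sample gathering, offline SVR learning, and a runtime check before deployment) in illustrative Python; the helper function, threshold and parameter values are assumptions, not the authors' implementation.

import time
import numpy as np
from sklearn.svm import SVR

def train_learning_controller(states, inputs, max_runtime_s=0.01):
    # Stage 1: training samples gathered from the human-operated run.
    X_t = np.column_stack([states[:-1], inputs[:-1]])
    u_next = inputs[1:]

    # Stage 2: offline SVR learning with a polynomial kernel.
    controller = SVR(kernel="poly", degree=2, gamma=1.0, coef0=1.0, C=10.0, epsilon=0.01)
    controller.fit(X_t, u_next)

    # Stage 3: check that one prediction fits within the control period before deployment.
    start = time.perf_counter()
    controller.predict(X_t[:1])
    runtime = time.perf_counter() - start
    if runtime > max_runtime_s:
        raise RuntimeError("controller too slow; gather new samples and re-learn")
    return controller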

Fig. 5

Procedure of the learning control.

Experiment

The learning control algorithm is programmed on the ARM9 control board to test and verify the learning control strategy. The experiment is carried out on a flat plastic runway. First, the human operator uses a hand joystick to control the robot, and the sensor data required for SVM learning are obtained from the motor controllers and the IMU through the CAN bus. Figure 6 shows the experiment environment.

Fig. 6

Hand joystick and experiment environment.

In the experiment, the control period is 100 ms. First, the human operator controls the robot with the hand joystick to finish a linear displacement of 10 m, with zero initial and final velocities. During this process, the velocity of the shell ω_shell and the human control input u_h are collected as the training samples. The number of samples is about 1,050.

The angular displacement of the shell θ_shell is obtained by integrating ω_shell. We use (θ_shell, u_h) as the training input vector X(t) of the learning controller and (θ̂_shell, u_SVM) as the training output vector X(t + 1); Vapnik's polynomial kernel K(x_i, x) = ((x_i · x) + 1)^2 is used as the kernel function. After sampling and calculation, 38 and 32 support vectors, with the corresponding ᾱ and ᾱ*, are obtained for θ_shell and u_h, respectively.
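Under the stated settings (100 ms control period, Vapnik's kernel ((x_i · x) + 1)^2, training inputs (θ_shell, u_h)), the data preparation and training can be sketched as follows; the array names and the use of scikit-learn are illustrative assumptions.

import numpy as np
from sklearn.svm import SVR

DT = 0.1  # control period: 100 ms

def prepare_and_train(omega_shell, u_h):
    """omega_shell, u_h: 1-D arrays sampled every 100 ms during the human-driven run."""
    theta_shell = np.cumsum(omega_shell) * DT             # integrate shell velocity to angle
    X_t = np.column_stack([theta_shell[:-1], u_h[:-1]])   # training input X(t)
    targets = {"theta": theta_shell[1:], "u": u_h[1:]}     # training outputs X(t + 1)

    models = {}
    for name, y in targets.items():
        # ((x_i . x) + 1)^2 corresponds to degree=2, gamma=1, coef0=1 in sklearn's poly kernel.
        m = SVR(kernel="poly", degree=2, gamma=1.0, coef0=1.0, C=10.0, epsilon=0.005)
        m.fit(X_t, y)
        models[name] = m
    return models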

The error of the SVM-based learning Δu_SVM is shown in Figure 7, and the maximum error is not larger than 0.005.

Fig. 7

Error of the SVM-based learning.

After the learning controller is encoded on the central ARM9 processor of the robot, the robot achieves the linear displacement motion smoothly, driven by the motor drivers directed by the output of the learning controller.

The results of 10 repetitions of the experiment are 10.13, 9.75, 10.32, 10.27, 9.85, 10.15, 10.36, 10.21, 9.88 and 10.06 m. The maximum error of the linear displacement is about 36 cm; the error is mainly caused by changes in friction, the lean of the ground or other random disturbances, which may cause the robot to depart from the straight line.
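As a quick check of these figures, the snippet below (illustrative Python) recomputes the deviation of each trial from the 10 m target.

distances = [10.13, 9.75, 10.32, 10.27, 9.85, 10.15, 10.36, 10.21, 9.88, 10.06]
errors = [abs(d - 10.0) for d in distances]
print(max(errors))                 # largest deviation from the 10 m target (0.36 m)
print(sum(errors) / len(errors))   # mean absolute deviation over the 10 runs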

The experiment results show that the learning-based controller can achieve precise linear displacement control for the mobile robot.

Conclusions

This paper presented an SVM-based learning position control strategy for an omnidirectional rolling mobile robot. The mechanical structure and motion principle of the robot are described, and the theory and implementation of SVM for robot control are introduced. A learning controller that simulates the human control strategy is designed for position control of the robot. The feasibility of the learning control method is validated by several experiments on plastic ground, and the results imply the effectiveness of the position control method.

This learning position control strategy relies on an experience-based model, so computation errors can arise from uncertainty and changes in the environment. To realize field exploration by spherical robots, the proposed learning control method needs to be enhanced.
