Open Access

Analysis and synthesis of function data of human movement



Introduction

Human behaviour recognition enables a computer to understand and describe human behaviour. It is a critical technology in fields such as video surveillance, intelligent nursing care, and human-computer interaction. Periodic motion data sequences obtained through wearable motion capture systems often contain hundreds of data points. To reduce the amount of calculation and eliminate the interference of redundant information, window processing is usually used to cut the data. Human body movement is itself continuous, and this holds for any daily behaviour: the movement of the hands, feet or other body parts is a continuous process [1]. Most existing methods extract features directly from the discrete data collected by sensors and ignore the continuity of the movement data itself. This article attempts to convert the discrete data into continuous form and then to find, from the continuous data, implicit continuous feature information that is closer to the essence of the motion.

Human behaviours are divided into approximately periodic behaviours and non-periodic behaviours. Approximately periodic behaviours are movements that are repeated multiple times. Daily human behaviour often has periodic characteristics, and the data sequence of daily behaviour collected by sensors therefore also presents approximately periodic characteristics [2]. This article considers converting this discrete motion data sequence into a periodic continuous function, which provides a theoretical basis for studying the continuity characteristics of motion data and can accurately express its periodic characteristics. To this end, the article introduces a method for functionalising discrete data, namely functional data analysis (FDA). The basic idea is to treat the observed discrete data sequence as a whole, express it as a function, and analyse the sample using continuous data processing methods. At present, FDA methods are mainly used in economic data processing, although research on sports data also exists.

After feature extraction from the motion data, an appropriate classifier needs to be selected. The classifiers commonly used for human behaviour recognition include the support vector machine (SVM), decision tree (DT), artificial neural network (ANN), and sparse representation-based classification (SRC). Among them, SVM handles practical problems such as small samples, non-linearity and high dimensionality well. For the data types in this article, SVM is used as the classifier. The difference in the data characteristics within one period is used to classify and recognise the dynamic behaviour of the human body.

Theoretical analysis
FDA

FDA is a statistical method that has emerged in recent years as a development and extension of traditional statistical analysis methods, and it is well suited to the study of exercise data. For an observation data sequence $y = (y_1, y_2, \ldots, y_n)^T$, it is possible to establish the model

$$y_i = x(t_i) + \varepsilon_i, \quad i = 1, 2, \cdots, n \tag{1}$$

where $x(t_i)$ is the value at $t_i$ of the function underlying the observation data sequence and $\varepsilon_i$ is the noise. To estimate the value of $x(t_i)$ in Eq. (1), the basis function expansion method is used, i.e. a set of basis functions $\Phi(t) = (\phi_1(t), \phi_2(t), \ldots, \phi_K(t))$ is selected to transform the discrete data into a linear combination of basis functions, namely

$$x(t_i) = \sum_{k=1}^{K} c_k \phi_k(t_i), \quad i = 1, 2, \cdots, n \tag{2}$$

After the basis functions are determined, a set of values of the coefficient vector $c = (c_1, c_2, \ldots, c_K)^T$ uniquely determines a function. The most direct way to solve for the coefficient vector is the least squares method, which minimises the residual sum of squares

$$SMSS(y|c) = \sum_{i=1}^{n} \left[ y_i - \sum_{k=1}^{K} c_k \phi_k(t_i) \right]^2 \tag{3}$$

For periodic observation data, the Fourier series is often used as the basis, namely

$$\phi_1(t) = 1,\ \phi_2(t) = \sin(\omega t),\ \phi_3(t) = \cos(\omega t),\ \cdots,\ \phi_{2k}(t) = \sin(k\omega t),\ \phi_{2k+1}(t) = \cos(k\omega t) \tag{4}$$

Therefore, the Fourier expansion of the function $x(t)$ of the observation data sequence is

$$x(t) = c_1 + c_2 \sin(\omega t) + c_3 \cos(\omega t) + \cdots + c_{2k} \sin(k\omega t) + c_{2k+1} \cos(k\omega t) \tag{5}$$

where $c_i$ ($i = 1, 2, \ldots, 2k+1$) are the expansion coefficients.
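The least squares fit of Eqs. (3) to (5) can be sketched in a few lines. This is a minimal, dependency-free illustration rather than the authors' Matlab implementation: the function names (`fourier_basis`, `solve_linear`, `fit_fourier`, `evaluate`) and the synthetic signal are assumptions made for the example, and the coefficients are obtained from the normal equations with a small Gaussian elimination routine.

```python
import math

def fourier_basis(t, omega, order):
    """Return [1, sin(wt), cos(wt), ..., sin(k w t), cos(k w t)] as in Eq. (4)."""
    row = [1.0]
    for k in range(1, order + 1):
        row.append(math.sin(k * omega * t))
        row.append(math.cos(k * omega * t))
    return row

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_fourier(times, values, omega, order):
    """Coefficients c that minimise the residual sum of squares of Eq. (3)."""
    phi = [fourier_basis(t, omega, order) for t in times]
    size = 2 * order + 1
    # Normal equations: (Phi^T Phi) c = Phi^T y
    gram = [[sum(p[i] * p[j] for p in phi) for j in range(size)] for i in range(size)]
    rhs = [sum(p[i] * y for p, y in zip(phi, values)) for i in range(size)]
    return solve_linear(gram, rhs)

def evaluate(coeffs, t, omega):
    """Evaluate the fitted function x(t) of Eq. (5)."""
    order = (len(coeffs) - 1) // 2
    return sum(c * p for c, p in zip(coeffs, fourier_basis(t, omega, order)))

# Synthetic check (assumed data): 3 s sampled at 30 Hz with a 1.5 s period.
omega = 2 * math.pi / 1.5
times = [i / 30 for i in range(90)]
values = [0.5 + 2.0 * math.sin(omega * t) - 1.0 * math.cos(2 * omega * t) for t in times]
coeffs = fit_fourier(times, values, omega, order=4)
```

On this noise-free signal the fit recovers the constant term, the first sine coefficient and the second cosine coefficient, with the remaining coefficients close to zero. With a numerical library available, a dedicated least squares routine would normally replace the hand-written solver.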

Support Vector Machine

SVM is a machine learning method. In essence, it is a linear classifier that maximises the margin in the feature space. The learning strategy is to find the hyperplane in the feature space that maximises the margin between the samples, which amounts to solving a convex quadratic programming problem [3]. The core of SVM is the following optimisation problem:

$$\begin{aligned} \min_{\omega, b, \xi} \quad & \frac{1}{2} \left\| \omega \right\|^2 + C \sum_{i=1}^{n} \xi_i \\ \text{s.t.} \quad & y_i(\omega \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \cdots, n \end{aligned} \tag{6}$$

where $\xi_i$ is the slack variable and $C$ is the penalty parameter. The kernel function is one of the critical factors of SVM. This article adopts the SVM model with the radial basis function as the kernel function, in the following form:

$$K(x, x_1) = \exp \left[ - \frac{\left\| x - x_1 \right\|^2}{2\sigma^2} \right] \tag{7}$$

where $\sigma$ is the kernel width.
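As a small illustration of the radial basis kernel, the following sketch evaluates it for a few toy vectors and builds the corresponding kernel matrix. The function names and sample points are hypothetical, chosen only to show the formula at work.

```python
import math

def rbf_kernel(x, z, sigma):
    """K(x, z) = exp(-||x - z||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def kernel_matrix(samples, sigma):
    """Symmetric matrix of pairwise kernel values."""
    return [[rbf_kernel(a, b, sigma) for b in samples] for a in samples]

# Toy points (assumed): identical points give 1, distant points approach 0.
samples = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
gram = kernel_matrix(samples, sigma=1.0)
```

A small σ makes the kernel value fall off quickly with distance, so each training sample influences only its close neighbours, whereas a large σ gives a smoother decision boundary. This is why σ is tuned together with C.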

The algorithm description of this article
Fourier series fitting
Determination of the fitted model and the frequency of the fitted function

The algorithm in this paper is mainly aimed at the periodic behaviour data of the human body collected by the wearable motion capture system. Since the human body motion has continuity and periodicity, it is necessary to functionalise the discrete motion data to obtain the continuous characteristics of the human body motion.

We use k-order Fourier series fitting for this periodic discrete observation data, as shown in Eq. (5). In the equation, $\omega$ is the frequency of the fitting function, recorded as $\omega_{func}$. The frequency $\omega_{func}$ and period $T_{func}$ of the fitting function should be as close as possible to the frequency $\omega_{data}$ and period $T_{data}$ of the original data, while maintaining the individual differences of each fitting function. We estimate the fundamental frequency $\omega$ and obtain the fourth-order Fourier series fitting function of the data:

$$x(t) = c_1 + c_2 \sin(\omega t) + c_3 \cos(\omega t) + \cdots + c_8 \sin(4\omega t) + c_9 \cos(4\omega t) \tag{8}$$

Experiments justify the use of Fourier series fitting in the present study: it reflects the periodic and local characteristics of the original data, removes isolated and abnormal points, and smooths and denoises the original data sequence [4]. Compared with the discrete motion data, the functionalised result also better reflects the continuity of human body motion and makes it easier to select the starting point of a motion cycle precisely, thereby facilitating data alignment.
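The text does not state how the fundamental frequency ω is estimated. One common option, assumed here purely for illustration, is to take the lag that maximises the autocorrelation of the sequence within a plausible range of cycle durations and convert it to ω = 2π/T. The function name `estimate_period` and the 0.5 s to 2.5 s search range are assumptions, not values from the paper.

```python
import math

def estimate_period(values, sampling_rate, min_period=0.5, max_period=2.5):
    """Estimate the dominant period (s) as the lag maximising the autocorrelation.

    The search is limited to [min_period, max_period] so that multiples of the
    true period are not picked up by mistake.
    """
    n = len(values)
    mean = sum(values) / n
    centred = [v - mean for v in values]
    min_lag = max(1, int(round(min_period * sampling_rate)))
    max_lag = min(n - 1, int(round(max_period * sampling_rate)))
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        overlap = n - lag
        # Normalise by the overlap so that long lags are not penalised.
        score = sum(centred[i] * centred[i + lag] for i in range(overlap)) / overlap
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sampling_rate

# Synthetic sequence (assumed): 9 s of a 1.5 s cycle sampled at 30 Hz.
signal = [math.sin(2 * math.pi * (i / 30) / 1.5) for i in range(270)]
period = estimate_period(signal, sampling_rate=30)
omega = 2 * math.pi / period
```

The resulting ω can then serve as the frequency of the fitting function, keeping the period of the fitting function close to that of the data.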

Fourier series fitting applied to motion data

At time $t_i$ (i.e. at the $i$th sampling), a single sensor generates the vector

$$a(t_i) = (a_1(t_i), a_2(t_i), \cdots, a_M(t_i)) \tag{9}$$

where $M$ is the total number of measurement units of the sensor. Continuous sampling by a single sensor generates the observation matrix

$$s = [a(t_1), a(t_2), \cdots, a(t_n)]^T \tag{10}$$

where $a(t_i)$, $i = 1, 2, \ldots, n$, is the data of the $i$th sampling of the sensor. Therefore, Eq. (10) can be expressed as

$$s = \begin{bmatrix} a_1(t_1) & a_2(t_1) & \cdots & a_M(t_1) \\ a_1(t_2) & a_2(t_2) & \cdots & a_M(t_2) \\ \vdots & \vdots & \ddots & \vdots \\ a_1(t_n) & a_2(t_n) & \cdots & a_M(t_n) \end{bmatrix}_{n \times M} \tag{11}$$

Fourier series fitting is applied to each column of the observation matrix, so the data sequence of each measurement unit of a single sensor is converted into a function of the sampling time $t$. Eq. (11) is thereby transformed into the function vector

$$s^* = (f_1(t), f_2(t), \cdots, f_M(t)) \tag{12}$$

For a given sample of motion behaviour, the sample data are collected by multiple sensors at the same time, which gives the observation function matrix

$$S = \begin{bmatrix} f_1^1(t) & f_2^1(t) & \cdots & f_M^1(t) \\ f_1^2(t) & f_2^2(t) & \cdots & f_M^2(t) \\ \vdots & \vdots & \ddots & \vdots \\ f_1^N(t) & f_2^N(t) & \cdots & f_M^N(t) \end{bmatrix}_{N \times M} \tag{13}$$

where $N$ is the total number of sensors and $M$ is the number of measurement units of each sensor.

Cycle start point selection

Different types of behavioural data yield different sample function matrices, and the curves of the corresponding functions in these matrices differ in shape. This difference in morphology is the basis for classifying the sample set, and it is already visible within one exercise cycle, as shown in Figure 1. Therefore, extracting a single cycle of data is sufficient to distinguish different behaviours.

Fig. 1

Data curve during the walking exercise cycle.

Selecting the starting point of the period from an element function of the matrix in Eq. (13) is essential for extracting period data. The method adopted in this paper calibrates the starting point of a period using the derivative of the fitting function (as shown in Figure 2) and then finds a motion cycle. From the sample function matrix of Eq. (13), we select one of the element functions, abbreviated as f(t), and find its characteristic points, which are reasonable candidates for the starting point of the cycle. This paper uses the set of extreme points of f(t) as the feature point set; these are the zeros of the derivative function shown in Figure 2. The point in the feature point set at which f(t) attains its minimum is then selected as the starting point of the motion cycle. Finally, every element function in the sample function matrix uses this point as the starting point of the motion cycle.

Fig. 2

Derivative function graph.

We choose the minimum extreme point as the starting point of the cycle to achieve data alignment. From the perspective of the function, since the same function has multiple extreme points, selecting the minimum extreme point provides a unified standard, so that different functions are aligned at the same starting point, which makes it easier to compare the shapes and trends of the data within one period. In a practical sense, the minimum extreme point of the vertical acceleration function of the foot is the minimum acceleration value during the movement. This point in time often corresponds to the moment when the foot is raised to its highest point, and periodic behaviours based on foot movement include this movement process [5]. Therefore, we use this point as the starting point of the movement. All sample functions of the same action are aligned here, and periodic functions of different behaviours can also be aligned at this starting point.
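The selection rule described above can be sketched as follows, assuming the fitting function is a Fourier series with coefficients ordered as in Eq. (5). The derivative is computed analytically, its zeros within one period are located by a sign-change scan refined with bisection, and the zero at which the function is smallest is returned. The function names and the grid resolution are illustrative assumptions.

```python
import math

def fourier_value(coeffs, omega, t):
    """x(t) = c1 + sum_k [c_{2k} sin(k w t) + c_{2k+1} cos(k w t)]."""
    total = coeffs[0]
    for k in range(1, (len(coeffs) - 1) // 2 + 1):
        total += coeffs[2 * k - 1] * math.sin(k * omega * t)
        total += coeffs[2 * k] * math.cos(k * omega * t)
    return total

def fourier_derivative(coeffs, omega, t):
    """Analytic derivative x'(t) of the Fourier series."""
    total = 0.0
    for k in range(1, (len(coeffs) - 1) // 2 + 1):
        total += k * omega * coeffs[2 * k - 1] * math.cos(k * omega * t)
        total -= k * omega * coeffs[2 * k] * math.sin(k * omega * t)
    return total

def cycle_start(coeffs, omega, steps=2000):
    """Return the extreme point in [0, T) at which x(t) is smallest."""
    period = 2 * math.pi / omega
    grid = [period * i / steps for i in range(steps + 1)]
    extreme_points = []
    for a, b in zip(grid, grid[1:]):
        fa = fourier_derivative(coeffs, omega, a)
        fb = fourier_derivative(coeffs, omega, b)
        if fa == 0.0:
            extreme_points.append(a)
        elif fa * fb < 0:
            lo, hi = a, b
            for _ in range(60):  # bisection on the derivative
                mid = 0.5 * (lo + hi)
                if fourier_derivative(coeffs, omega, lo) * fourier_derivative(coeffs, omega, mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            extreme_points.append(0.5 * (lo + hi))
    return min(extreme_points, key=lambda t: fourier_value(coeffs, omega, t))

# Toy fourth-order series (assumed): x(t) = sin(w t) with a 1.5 s period.
omega = 2 * math.pi / 1.5
coeffs = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
t_star = cycle_start(coeffs, omega)
```

For the toy series the extreme points are the maximum at a quarter period and the minimum at three quarters of a period, so the latter is chosen. A grid that is too coarse could miss closely spaced extreme points, which is why the scan resolution is left as a parameter.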

Extraction of feature vectors

After the starting point of the period is calibrated, one period of data can be extracted from the fitting function. In this paper, the fitting function is resampled to obtain approximately one period of motion data, which expresses the curve characteristics of the function within one period. The sampling frequency, the time for an ordinary person to complete a motion cycle and the resampling interval directly determine the length of the resampled period sequence [6]. Assume that the sampling frequency of the sensor is $F$ and that the time for an ordinary person to complete a motion cycle is $T_{data}$ (which can be estimated). Each element function of the sample function matrix $S$ is discretised; that is, the function of each measurement unit of the sensor is resampled over one period beginning at the starting point $t^*$. The periodic data of the $k$th measurement unit can then be expressed as

$$\upsilon_k = (a_k(t^*), a_k(t^* + d), \cdots, a_k(t^* + l - 1)) \tag{14}$$

where $k = 1, 2, \ldots, M$, $d$ is the sampling interval, and $l$ is the period length (the number of resampling points) when the sampling interval is $d = 1$. The value of $l$ is determined by the sampling frequency $F$ and the motion cycle time $T_{data}$, i.e. $l = F \times T_{data}$. The feature vector is extracted for each behaviour sample. First, according to Eq. (14), a vector of the form of Eq. (15) is constructed for each sensor:

$$V^j = (\upsilon_1, \upsilon_2, \cdots, \upsilon_M), \quad j = 1, 2, \cdots, N \tag{15}$$

where $N$ is the total number of sensors and $M$ is the total number of measurement units of each sensor. The $V^j$ of all sensors can then be arranged into the row vector

$$V = (V^1, V^2, \cdots, V^N) \tag{16}$$

In Eq. (16), the periodic data sequences of all measurement units of all sensors in a sample are arranged into an $N \times M \times [l/d]$ dimensional vector, which is used as the feature vector of the sample for subsequent classification and recognition.
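The construction of the feature vector can be sketched as below. Time is measured in sample indices, so the resampling offsets are 0, d, 2d, ... below l. The fitted functions are replaced by toy sinusoids, and the names `resample_cycle`, `feature_vector` and `make_unit_function` are hypothetical helpers introduced for the example.

```python
import math

def resample_cycle(func, t_star, length, step):
    """One-cycle sequence of a single measurement unit, as in Eq. (14)."""
    return [func(t_star + offset) for offset in range(0, length, step)]

def feature_vector(function_matrix, t_star, length, step):
    """Concatenate the cycles of all M units of all N sensors, as in Eq. (16)."""
    features = []
    for sensor_functions in function_matrix:   # N sensors
        for func in sensor_functions:          # M measurement units
            features.extend(resample_cycle(func, t_star, length, step))
    return features

def make_unit_function(phase):
    """Toy stand-in for a fitted function with a 45-sample period (assumed)."""
    return lambda t: math.sin(2 * math.pi * t / 45 + phase)

# N = 5 sensors with M = 5 units each, l = 45 points, d = 1.
matrix = [[make_unit_function(0.1 * (5 * j + k)) for k in range(5)] for j in range(5)]
features = feature_vector(matrix, t_star=12.0, length=45, step=1)
print(len(features))  # 1125
```

Because the functions are continuous, the starting point does not have to coincide with an original sample, and the same routine works for any resampling interval.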

Classification based on feature vectors

This paper uses the SVM method with the radial basis function as the kernel function for classification. For the radial basis kernel function, the penalty parameter C in Eq. (6) and the kernel parameter σ in Eq. (7) have a significant influence on the SVM, but their values are often chosen from experience. This paper instead uses the K-fold cross-validation (K-CV) method to select the optimal parameters, which helps to avoid over-learning and under-learning and thus to obtain a better classification model.

Experimental design
Experimental environment and data sources

The experiments were run on a 64-bit Windows 7 operating system using Matlab. The data set is the wearable sensor behaviour recognition database (WARD) provided by the University of California, Berkeley, in the United States. Sensors are attached at five body locations: the left wrist, right wrist, waist, left ankle and right ankle [7]. Each sensor is composed of a three-axis accelerometer and a two-axis gyroscope, i.e. five measurement units. The sensor sampling frequency is 30 Hz. The data set covers 13 types of daily behaviour. During data collection, each of the 20 subjects (13 men and 7 women) performed each action five times, so the data set includes a total of 1,300 samples. This article selects 10 types of dynamic behaviour, giving 1,000 samples; see Table 1.

Table 1. Behaviour description in WARD

| No. | Behaviour category | Behaviour description |
|---|---|---|
| 1 | Walk normally | Walk forward for >10 s |
| 2 | Walk counterclockwise | Walk counterclockwise for >10 s |
| 3 | Walk clockwise | Walk clockwise for >10 s |
| 4 | Turn left | Turn left on the spot for >10 s |
| 5 | Turn right | Turn right on the spot for >10 s |
| 6 | Go up the stairs | Go up >10 stairs |
| 7 | Go down the stairs | Go down >10 stairs |
| 8 | Jogging | Jog for >10 s |
| 9 | Jump | Jump in place >5 times |
| 10 | Push wheelchair | Push the wheelchair for >10 s |

WARD, wearable sensor behaviour recognition database.

Data preprocessing and cycle extraction

WARD uses five sensors to collect data. Each sensor has five measurement units (a three-axis accelerometer and a two-axis gyroscope, which measure acceleration in three directions and angular velocity in two directions). Therefore, in Eq. (13), N = 5 and M = 5.

With the sampling frequency F = 30 Hz, good experimental results are obtained by taking 45 points of the sensor sequence; the corresponding time of 1.5 s is close to the time needed to complete one behavioural action. This article therefore takes the time for an ordinary person to complete a motion cycle as Tdata = 1.5 s. In Eq. (14), F = 30, d = 1 and l = F × Tdata = 45, so 45 points are used as the data length of one period.

Therefore, the length of the feature vector in Eq. (16) is N × M × [l/d] = 5 × 5 × 45 = 1125. To calibrate the starting point of the period, one of the element functions in the sample function matrix of Eq. (13) must be selected to determine the starting point, which is then used as the starting point for resampling every element function in the matrix. The acceleration function in the vertical direction better reflects the differences between behaviours and steps, and the starting point calibrated with this function is the moment in the movement at which the foot is raised to its highest point [8]. Figure 3 compares the left foot data with the waist data. It shows that the change in vertical acceleration at the foot is more evident than that at the waist. Therefore, this paper uses the vertical acceleration function of the left ankle sensor in the WARD data set to calibrate the starting point of the cycle, which is more accurate.

Fig. 3

Comparison of left foot data and waist data.

Figure 4 shows the result of selecting a sample of the collector walking normally, extracting the acceleration data in the vertical direction of the sensor at the left ankle (that is, the data of a single measurement unit), and performing periodic extraction. Figures 4(a) and 4(b) show that the algorithm in this paper achieves a relatively accurate periodic extraction of the motion data function.

Fig. 4

Period extraction results of a single measurement unit.

Figure 5 shows the result of periodic extraction applied to the vertical acceleration data of the left foot sensor for all samples of the normal walking and going-upstairs behaviours [9]. It can be seen from Figure 5 that the sample data are aligned at a uniform cycle starting point, that the data length is approximately one cycle, and that the periodic data of different behaviours differ in shape.

Fig. 5

Periodic extraction results.

Experimental results

In the simulation experiment, we adopted the K-CV method to select test samples and training samples. During the experiment, we divided the data set into K = 20 groups. Each time, 1 group was selected as the test sample, and the remaining 19 groups were used as the training sample, giving rise to a total of 20 experiments. Experiments show that the algorithm in this paper has a recognition rate of 97.5% for periodic human behaviours based on the WARD data set. The specific results are shown in Table 2.

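Table 2. Confusion matrix of 10 dynamic behaviour categories

| Category | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | Recognition rate (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 97 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 2 | 97 |
| 2 | 0 | 98 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 98 |
| 3 | 0 | 0 | 100 | 0 | 1 | 0 | 1 | 0 | 0 | 0 | 100 |
| 4 | 0 | 0 | 0 | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 100 |
| 5 | 0 | 0 | 0 | 0 | 98 | 0 | 0 | 0 | 0 | 0 | 98 |
| 6 | 0 | 0 | 0 | 0 | 0 | 98 | 1 | 1 | 1 | 0 | 98 |
| 7 | 0 | 0 | 0 | 0 | 0 | 2 | 94 | 1 | 0 | 3 | 94 |
| 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 99 | 0 | 0 | 99 |
| 9 | 0 | 0 | 0 | 0 | 0 | 1 | 5 | 0 | 100 | 0 | 100 |
| 10 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 100 | 100 |

Total recognition rate: 98.4%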
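The total recognition rate in Table 2 can be reproduced from the diagonal of the confusion matrix. The short check below assumes 100 samples per category, which follows from 20 subjects performing each action five times.

```python
# Diagonal of Table 2: correctly recognised samples for categories 1 to 10.
correct_per_class = [97, 98, 100, 100, 98, 98, 94, 99, 100, 100]
samples_per_class = 100  # assumed: 20 subjects x 5 repetitions per behaviour

per_class_rate = [100.0 * c / samples_per_class for c in correct_per_class]
total_rate = 100.0 * sum(correct_per_class) / (samples_per_class * len(correct_per_class))
print(total_rate)  # 98.4
```

The per-category rates obtained this way match the last column of the table.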
Algorithm comparison

Table 3 and Figure 6 show the results of simulation experiments with different algorithms in the same environment using the WARD data set. In terms of recognition rate, the algorithm in this paper is higher than the traditional algorithms listed. In terms of average classification time, it is similar to the traditional algorithms for which a time is reported [10]. It is therefore comparable with traditional classification methods in terms of algorithm performance.

Table 3. Comparison of algorithm results based on WARD

| Algorithm | Window size | Feature extraction algorithm | Classifier | Average time spent on classification (s) | Recognition rate (%) |
|---|---|---|---|---|---|
| Traditional algorithm 1 | 200 | Time-frequency domain characteristics | DT | <1 | 92.6 |
| Traditional algorithm 2 | 45 | LPP | DSC | N/A | 85.1 |
| Traditional algorithm 3 | 40 | RP | SRC | N/A | 81.6 |
| Algorithm in this paper | 45 | Periodic extraction of calibration starting point | SVM | <1 | 98.9 |

DT, decision tree; LPP, locality preserving projection; RP, random projection; SRC, sparse representation-based classification; SVM, support vector machine; WARD, wearable sensor behaviour recognition database.

Fig. 6

Comparison of algorithm results based on WARD. LPP, locality preserving projection; RP, random projection; WARD, wearable sensor behaviour recognition database.

The choice of classifier is significant for behaviour classification. Table 4 shows the results of classification using different classifiers (SVM, DT, naive Bayes, K-nearest neighbour) after extracting the feature vector according to the algorithm of this paper. It can be seen from Table 4 that, with the feature vectors extracted by the algorithm in this paper, the SVM method achieves the highest recognition rate of the four.

Table 4. Comparison of classification results of different classifiers

| Classifier | Recognition rate (%) |
|---|---|
| SVM | 98.9 |
| DT | 89.5 |
| Naive Bayes | 86.9 |
| K-nearest neighbour | 93.9 |

DT, decision tree; SVM, support vector machine.

Experiments based on the self-collected database
Introduction to the self-collected database

To further verify the rationality and scalability of the algorithm, this paper uses a motion capture system to record data under actual conditions and applies the algorithm to the resulting data set. Corresponding to the WARD data set, the database collects data from eight subjects (aged 17–50 years; four men and four women) performing eight daily behaviours (going upstairs, going downstairs, regular walking, clockwise walking, counterclockwise walking, jogging, turning left and turning right), and the sensor data of the left wrist, right wrist, waist, left ankle and right ankle are extracted [11]. A data sample from the self-collected database is shown in Figure 7. The sampling frequency of the WARD data set is low (30 Hz), while that of the motion capture system used in this article is high (90 Hz). It can be seen from Eq. (14) that if the resampling interval d of the fitting function were kept at 1, the period data would be too long and the feature vector dimension too large. To solve this problem, when the algorithm is applied to the self-collected database, the sampling interval d in Eq. (14) is set to 3 (the sampling frequency of this database is three times that of WARD). The fitting function is then resampled and the feature vector extracted. This adjustable resampling shows that the algorithm is applicable to existing high-frequency motion capture systems and reflects an advantage of functional data.
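The choice d = 3 follows from simple arithmetic, sketched below. The helper `resampling_plan` is hypothetical, and the 1.5 s cycle time is carried over from the WARD experiment as an assumption, since the text does not restate it for the self-collected data.

```python
def resampling_plan(capture_rate_hz, reference_rate_hz, cycle_time_s):
    """Return (d, l, points per cycle) so that the feature length matches the reference."""
    if capture_rate_hz % reference_rate_hz != 0:
        raise ValueError("capture rate must be an integer multiple of the reference rate")
    step = capture_rate_hz // reference_rate_hz             # d
    period_length = round(capture_rate_hz * cycle_time_s)   # l = F x Tdata
    points = len(range(0, period_length, step))             # samples kept per cycle
    return step, period_length, points

print(resampling_plan(90, 30, 1.5))  # (3, 135, 45)
```

With d = 3, each measurement unit again contributes 45 points per cycle, so the feature vector keeps the same dimension as in the WARD experiment.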

Fig. 7

Single sample data sequence.

Experimental results

As with the WARD data set, we used the K-CV method to establish the training and test sample sets. The self-collected data include the data of eight experimenters [12], so K = 8 here, i.e. 8-fold cross-validation. Finally, we used the SVM for classification and recognition, and the recognition rate reached 98.75%.
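Since K is chosen to equal the number of experimenters, one plausible reading is that each fold holds out all samples of one experimenter. The sketch below illustrates that reading only; the helper `leave_one_group_out` and the sample layout (one sample per experimenter and behaviour) are assumptions, not details given in the text.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for each distinct group."""
    for held_out in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == held_out]
        train = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train, test

# Hypothetical layout: 8 experimenters x 8 behaviours, one sample each.
subjects = [s for s in range(8) for _ in range(8)]
folds = list(leave_one_group_out(subjects))
print(len(folds))  # 8
```

Holding out whole experimenters tests how well the classifier generalises to a person it has not seen, which is a stricter check than a random split of the samples.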

Conclusion

Compared with the traditional discrete motion data processing method, this paper uses the FDA method to functionalise the discrete data and thereby express the continuity characteristics of human motion data. It calibrates the starting point of the motion cycle according to the fitted function of the motion data and then approximately extracts one periodic sequence of the motion data from that starting point, which has certain advantages over the traditional window processing method. The periodic data sequences extracted in this way are classified with an SVM to recognise the dynamic behaviour of the human body, and a higher recognition rate than that of the traditional methods compared is obtained. The effectiveness of the algorithm has been verified on the public data set WARD and on the self-collected database.

eISSN:
2444-8656
Language:
English
Publication timeframe:
Volume Open
Journal Subjects:
Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics