Open Access

Research on aerobics training posture motion capture based on mathematical similarity matching statistical analysis



Introduction

With the rapid development of human motion simulation and virtual reality, motion capture technology has been widely used in film and television production, medical rehabilitation, animation, games and other fields, combining the three-dimensional virtual world with reality. In recent years, researchers have conducted extensive research on its applications. By combining motion capture technology with the Chinese puppet show, a digital production solution for puppet theatre was proposed. A motion capture system was used to study the hip and torso movement of golfers during the swing, and the relationship between torso and hip movement and the swing was calculated, providing theoretical support for scientific golf training. A virtual basketball training system was developed using motion capture technology, with which users can practise free throws by themselves [1,2,3]. An animation production method and process based on motion capture technology have been proposed, describing techniques for animation composition and for eliminating slippage during production. Based on the normal range of motion of human joints, the motion postures of the main joint points of the human body have been studied, and a method of human motion posture simulation proposed; this method can use human motion data to drive a virtual human body model. Motion capture technology can help trainers correct trainees' movements by analysing and processing motion data in real time. The traditional comparison method generally uses the Euclidean distance between two corresponding moving nodes to determine the similarity of the two data channels. However, this method requires action time series of equal length, and coordinate displacements caused by differences in trainees' height and build can shift the calculation result considerably [4].

In this paper, an optical motion capture system is used to track and detect the dancer's movements in segments, and a real-time analysis method for moving human poses based on similarity matching between feature planes is proposed. The method not only improves the efficiency of posture analysis but also improves teaching quality through real-time feedback, playing an important role in the digitisation and scientific development of dance teaching.

3D data acquisition for motion capture

A motion capture system places trackers on key parts of a moving object, records the object's movement and then processes the recordings on a computer to obtain three-dimensional spatial data [5]. Motion capture can be divided into four types: mechanical, acoustic, electromagnetic and optical. An optical motion capture system gives the performer enough space to perform freely, without being restricted by mechanical equipment, and can capture high-speed motions or objects. This paper uses an optical motion capture system to obtain 3D motion data, which can be used to build models and a skeleton database and fully implements high-precision real-time 3D motion data capture. The process is shown in Figure 1. The 3D space coordinates of the key points are then used to obtain the motion characteristics of the model and finally realise the analysis of human posture.

Fig. 1

3D motion data capture process.

Near-infrared high-sensitivity digital camera captures human motion

The experimental research mainly uses an optical motion capture system, in which dedicated near-infrared high-sensitivity digital cameras distributed in space capture the movement of the performer in real time and transmit the captured motion to a computer to generate a virtual object. In the laboratory scenario, as shown in Figure 2, this article uses eight high-precision cameras connected to the DIMS controller with cables. The camera resolution is 5.03 megapixels, the maximum acquisition frequency is 120 fps and the commonly used acquisition frequency is 60 fps. The grey level is 10-bit parallel, each camera is equipped with a standard 3-18 mm variable focal length lens, marker balls with a diameter of 25 mm are used for human limb capture, and the human motion range is 3.0 m × 3.0 m × 3.5 m [6].

Fig. 2

Laboratory scene and main equipment.

3D motion data capture

Acquire data using the optical motion capture system. First, performers are required to wear monochromatic clothing with 21 markers attached to key parts of the body, stand in a preset motion space, start the high-precision 3D motion capture software, set a specified time and perform the specified dance action as required, while the cameras capture and track the movement of the 21 marker points, as shown in Figure 3.

Fig. 3

An actor in a monochrome costume with a marker.

Then save the captured 3D motion data, output the animation file in data editing software and save it as a TRC file. Finally, use MotionBuilder to match the captured and edited motion data of the 21 marker points with the actor model, activate the 21 marker point data to complete the matching with the actor and save the model in FBX format, completing the human motion posture database, as shown in Figure 4.

Fig. 4

Data and actor binding.

Human motion pose analysis based on feature vector matching

Human motion analysis is an important topic in the field of computer vision research and a frontier of researchers' attention. The simulation of the actions of real performers needs to be supported by data. Human motion analysis captures, tracks and analyses the movement of the human body to obtain the relevant motion parameters. Applying motion analysis to teaching can not only establish a unique and personalised teaching system but also decompose the performers' movements in detail and demonstrate each dance step by step, which is convenient for quantitative analysis and provides scientific support for sports teaching [7].

This paper proposes a human motion pose analysis method based on similarity matching between feature planes. The specific process is shown in Figure 5. The main steps of the analysis process include the following:

Get skeleton data in real time. The dance motion sequence is acquired in real time by means of optical motion capture, and the coordinates of each landmark point of the human body model in the world coordinate system are stored.

Posture analysis. Determine 7 feature planes based on feature points, extract 13 feature vectors and calculate human poses based on the motion characteristics of key parts of the dance action feature correlation coefficient.

Analysis of feature attitude difference. Use the calculated feature correlation coefficients to analyse the differences and accuracy of students’ dance moves.

Fig. 5

Method flowchart.

Get skeleton data in real time

Human motion posture analysis based on a motion capture system estimates the human motion posture from one or more viewpoints. In order to analyse the posture data of dance movements, this paper uses feature points to calibrate the joint points of the human body and treats the human skeleton as a multi-rigid-body connection model. The connection between feature points represents a rigid body, so the segment between adjacent joint points does not deform. The main action postures are the movements of the head, torso, hips and limbs, as shown in Figure 6 [8].

Fig. 6

Dance gesture.

The movement of the human body is a complicated process. Without considering the muscles and nervous system, human movement can be abstracted into a simple chain system composed of connected rigid bodies, as shown in Figure 7. Coloured joints represent the absolute position coordinates of the feature points in the world coordinate system. The upper limb is composed of two rigid bodies, the upper and lower arm, connected at the elbow joint; the lower limb is composed of two rigid bodies, the thigh and the calf, connected at the knee joint and attached to the torso at the hip joint; the head, torso and hips are likewise represented as rigid bodies by lines of joint points [9].

Fig. 7

Human skeleton and model.

Through the rigid body structure of the human body described above, the spatial coordinates of the landmark points are used as the data basis for calculating the vector, which is used as the basis for the subsequent similarity matching analysis of the feature plane.

3D human motion similarity matching algorithm

In pattern recognition and computer vision, similarity measurement is used to quantify the difference or similarity between different objects and between the movements at marker points.

In a motion trajectory, the data curves of the marker points are independent of each other, so the two sets of motion data are compared directly. The traditional Euclidean distance measure used to calculate the similarity between marker points cannot truly reflect the accuracy of the comparison [10].

Similarity matching of traditional 3D models

Based on the Euclidean distance comparison method, the distance formulas for two- and three-dimensional space are as follows:
$$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \quad (1)$$
$$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2} \quad (2)$$
where X1 = (X11, X12, X13, ⋯, X1n) and X2 = (X21, X22, X23, ⋯, X2n) are n-dimensional data. The distance between every pair of corresponding landmarks can be calculated with the Euclidean formula. If the difference is less than a threshold (set by the coach), the two landmarks are considered similar; if the difference is greater than this threshold, the two landmarks are considered different. Figures 8 and 9 show the comparison of the trajectory of the same marker point between the standard action and the action to be measured.

Fig. 8

Comparison of motion trajectories in a single direction between individual landmarks.

Fig. 9

Marker data based on Euclidean distance.

The direct comparison method based on Euclidean distance compares the data of two actions, obtaining a distance value for each frame of the action sequence; according to a preset threshold, a percentage of data matching can then be obtained. However, this method is computationally expensive. In addition, it takes the absolute displacement of the landmarks as its reference standard, which is of little significance in comparative training [11].
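The direct comparison just described can be sketched as follows. This is a minimal illustration, not the paper's implementation: the trajectory data and the threshold value are assumptions chosen for the example.

```python
import numpy as np

def euclidean_match_rate(standard, test, threshold):
    """Frame-by-frame Euclidean comparison of one marker's 3D trajectory.

    standard, test: (n_frames, 3) arrays of x/y/z coordinates for the same
    marker. The sequences must be of equal length, which is one of the
    method's limitations noted in the text.
    threshold: coach-set distance below which two frames count as similar.
    Returns the percentage of frames whose distance is below the threshold.
    """
    d = np.linalg.norm(standard - test, axis=1)  # Eq. (2) per frame
    return 100.0 * np.mean(d < threshold)

# Illustrative data: a straight-line trajectory and a globally shifted copy.
std = np.array([[t, 0.0, 1.0] for t in np.linspace(0.0, 1.0, 60)])
tst = std + np.array([0.02, 0.0, 0.0])           # 2 cm offset on every frame
rate = euclidean_match_rate(std, tst, threshold=0.05)
```

Note how a constant whole-body offset (a taller or shorter performer standing slightly elsewhere) shifts every per-frame distance, illustrating why absolute displacement is a poor reference standard.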

Similarity calculation based on feature plane matching

The similarity between the dance performer's movements and the standard movements in this paper can be attributed to the similarity of geometric shapes between objects. Three feature points determine a feature plane, and the skeleton of the human pose is composed mainly of seven feature planes, shown shaded in Figure 10. In this paper, the angle relationships between the vectors at the joint points are used to evaluate the difference between the dancer's movement and the standard movement. According to the principles of ergonomics, the human body takes the spine as its main axis: the spine is the z-axis of the spatial rectangular coordinate system, and the x-axis and y-axis lie in the horizontal ground plane of the motion capture device. The comparison is carried out through the similarity of the edge vectors of a plane and the similarity of the normal vectors between planes.

Fig. 10

Plan view of human bone features.

The main gestures of the human body include the following:

Movement of the limbs. Taking the left arm as an example, the swing amplitude of the elbow can be evaluated through the similarity between the side vector VLLarm and the side vector VLFarm of the left arm plane P1, and through the correlation of VLFarm with the vertical direction; the rotation of the arm changes the inner product of the left arm plane normal vector V1 and the vertical direction Vstand, as shown in Table 1.

Head movements. The normal vector V5 of the plane P5 is compared with the vertical standing direction Vstand. When the human body looks straight ahead, V5 is parallel to the Vstand direction.

Chest movement. The main reason is to compare the transformation angle between the spine direction vector V6 and the vertical standing direction Vstand when doing a turning movement.

Hip movement. When the human body stands vertically, the hip plane P7 remains horizontal, and its plane normal vector V7 is parallel to the vertical direction Vstand, as shown in Table 2 [12].

Table 1. Characteristic vectors of limbs.

Limb movement posture Feature vector

Stand upright Vstand = Positive Z axis direction in world coordinate system
Left upper arm VLFarm = CoorLElbow − CoorLShoulder
Left lower arm VLLarm = CoorLElbow − CoorLHand
Right upper arm VRFarm = CoorR Elbow − CoorRShoulder
Right lower arm VRLarm = CoorR Elbow − CoorRHand
Left thigh VLThigh = CoorL Knee − CoorLHip
Left calf VLCrus = CoorL Knee − CoorLAnkle
Right thigh VRThigh = CoorR Knee − CoorRHip
Right calf VRCrus = CoorR Knee − CoorRAnkle
Left arm feature plane V1 = VLLarm × VLFarm
Right arm feature plane V2 = VRLarm × VRFarm
Left leg feature plane V3 = VLThigh × VLCrus
Right leg feature plane V4 = VRThigh × VRCrus

Table 2. Torso pose feature vectors.

Torso movement posture Feature vector

Head direction V5 = VLHead × VRHead
Chest direction V6 = Vchest
Hip direction V7 = (Coornavel − CoorLHip) × (Coornavel − CoorRHip)
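The feature plane normals in Tables 1 and 2 reduce to a cross product of the two edge vectors. A minimal sketch with NumPy, using the left arm plane of Table 1; the joint coordinates are placeholder assumptions, not captured data:

```python
import numpy as np

# Illustrative joint coordinates in the world frame (metres); these
# values are assumptions for the sketch, not motion capture output.
coor = {
    'LShoulder': np.array([0.20, 0.00, 1.45]),
    'LElbow':    np.array([0.45, 0.00, 1.40]),
    'LHand':     np.array([0.60, 0.20, 1.35]),
}

# Edge vectors of the left-arm feature plane P1, following Table 1.
v_lfarm = coor['LElbow'] - coor['LShoulder']   # left upper arm
v_llarm = coor['LElbow'] - coor['LHand']       # left lower arm

# Plane normal V1 = VLLarm x VLFarm (Table 1, left arm feature plane).
v1 = np.cross(v_llarm, v_lfarm)
```

By construction the normal V1 is perpendicular to both edge vectors, so its orientation relative to Vstand tracks the rotation of the whole arm plane rather than any single joint.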

To sum up the above calculation process, this paper uses cosine similarity as the similarity function. By measuring the cosine of the angle between two vectors in space (via their inner product), the size of the difference between them can be obtained. Compared with the Euclidean distance metric, cosine similarity pays more attention to the difference in direction between the two vectors. The calculation method is as follows:
$$similarity(\theta_i) = \frac{\sum_{t=1}^{n} A_t \times B_t}{\sqrt{\sum_{t=1}^{n} (A_t)^2} \times \sqrt{\sum_{t=1}^{n} (B_t)^2}} \quad (3)$$
where θi is the joint angle, and At and Bt are the corresponding feature plane edge vectors. The calculated cosine value lies in [0,1] when the dancers perform the movements in the same direction. A value close to 1 indicates that the movement of the dancer to be tested is consistent with the standard movement and dance norms; a value close to 0 indicates that the deviation of the movement is too large.

Cosine similarity measures not only the degree of difference between vectors but also the similarity and difference between angles. Although there are individual differences between human bodies, such as height, weight and arm length, the proportions of the human body are roughly constant. Therefore, the similarity of the angles can also be used to measure whether the motion amplitude of a limb meets the standard. The calculation method is as follows [13]:
$$Corr(A, B) = 1 - \frac{\arccos\left(similarity(\theta_t)\right)}{\pi} \quad (4)$$
According to Eqs. (3) and (4), the correlation parameters of the main motion postures at the arms are calculated, and the key motion correlation parameters of the test object are compared with the standard motion.
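Eqs. (3) and (4) can be sketched directly; the two test vectors below are illustrative, chosen to sit exactly 30 degrees apart:

```python
import numpy as np

def similarity(a, b):
    """Cosine similarity between two feature vectors, Eq. (3)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def corr(a, b):
    """Angle-based correlation, Eq. (4): 1 - arccos(similarity)/pi."""
    s = np.clip(similarity(a, b), -1.0, 1.0)  # guard against float drift
    return 1.0 - float(np.arccos(s)) / np.pi

# Two arm edge vectors 30 degrees apart (illustrative values).
a = np.array([1.0, 0.0, 0.0])
b = np.array([np.cos(np.pi / 6), np.sin(np.pi / 6), 0.0])
```

Because Eq. (4) depends only on the angle between the vectors, performers of different heights or arm lengths produce the same Corr value for the same limb posture, which is the point of switching from Euclidean distance.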

Table 3. Posture correlation parameters of the left arm standard movement.

Timing Sim (V2, Vstand) Corr (VLLarm, VLFarm) Corr (VLFarm, Vstand)

0–1 s 0.6442 0.9365 0.9586
1–2 s 0.7004 0.9623 0.9956
2–3 s 0.7257 0.9726 0.9659
3–4 s 0.7126 0.9759 0.9546
4–5 s 0.6897 0.9588 0.9849
5–6 s 0.6416 0.9659 0.9546
6–7 s 0.5659 0.9588 0.9659
7–8 s 0.7258 0.9649 0.9785
8–9 s 0.6896 0.9876 0.9416
9–10 s 0.6585 0.9659 0.9755

Table 4. Posture correlation parameters of the left arm to be measured.

Timing Sim (V2, Vstand) Corr (VLLarm, VLFarm) Corr (VLFarm, Vstand)

0–1 s 0.6357 0.9433 0.9576
1–2 s 0.6958 0.9613 0.9879
2–3 s 0.7253 0.9715 0.9646
3–4 s 0.7110 0.9698 0.9559
4–5 s 0.6906 0.8399 0.9848
5–6 s 0.6506 0.9648 0.9547
6–7 s 0.5998 0.9498 0.9646
7–8 s 0.7199 0.9650 0.9689
8–9 s 0.6826 0.9798 0.9398
9–10 s 0.6595 0.9698 0.9698
Calculation of differences between feature poses

This experiment uses dance music at the same rhythm to ensure that the two groups complete the same specified dance movements in the same time period, so that time alignment need not be considered. First, the two groups complete an action at the same time, and the correlation coefficient corr of each part of the action is calculated. Second, the correlation coefficient error is calculated using the relative error method:
$$\Delta corr_t = \frac{\left| corr_i - corr_j \right|}{corr_i} \times 100\% \quad (5)$$
where t is the time point of the dance action, corri is the correlation coefficient of the standard dance action and corrj is the correlation coefficient of the dance action to be measured. The basis for judging whether an action is standard is the relative error of the correlation coefficient; the condition for error convergence is
$$\Delta corr_t \le C \quad (6)$$
where C is the selected error threshold, which can be modified according to the teacher's requirements for the dance action. The relative error calculated according to Eq. (5) indicates the deviation of each object's dance movements from the standard movements. The calculation results are shown in Table 5.
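A minimal sketch of Eqs. (5) and (6); the two correlation values are taken from the 4-5 s rows of the left-arm tables above, and the threshold default is the one used in the experiment:

```python
def relative_error(corr_std, corr_test):
    """Relative error of Eq. (5), expressed as a percentage."""
    return abs(corr_std - corr_test) / corr_std * 100.0

def meets_standard(corr_std, corr_test, c=2.0):
    """Convergence condition of Eq. (6): error within threshold C (percent)."""
    return relative_error(corr_std, corr_test) <= c

# Corr(VLLarm, VLFarm) at 4-5 s: standard 0.9588, test subject 0.8399.
err = relative_error(0.9588, 0.8399)   # roughly 12%, well above threshold
ok = meets_standard(0.9588, 0.8399)    # False: elbow bend flagged
```

This reproduces the kind of screening used in the experiment: the 4-5 s elbow deviation is flagged because its relative error far exceeds the coach-settable threshold C.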

Table 5. Relative error of the main movement attitude of the test object (%).

Timing Sim (V2, Vstand) Corr (VLLarm, VLFarm) Corr (VLFarm, Vstand)

0–1 s 1.32 0.72 0.11
1–2 s 0.65 0.10 0.77
2–3 s 0.05 0.11 0.13
3–4 s 0.22 0.63 0.14
4–5 s 0.13 12.54 0.01
5–6 s 1.41 0.11 0.01
6–7 s 5.65 0.95 0.13
7–8 s 0.82 0.01 0.99
8–9 s 1.02 0.79 0.19
9–10 s 0.15 0.40 0.58
Analysis of experimental results

The experimental platform is a PC with a Core i5-3470 3.2 GHz CPU and 4 GB of memory, with MATLAB as the development environment. The created motion database contains 18 sets of dance action fragments, each of about 600 frames. The experimental subjects are randomly selected college students with a basic dance foundation. First, each subject is required to imitate the standard movements of the dance teacher, making the corresponding dance movements under the optical motion capture system; the motion characteristics of the joint points of the subject's left arm are extracted, and a local motion sequence captured in real time is taken as an example. The subject's main movement changes are analysed for differences from the standard movements. This article mainly takes the final movement of the left arm in a single dance movement (within 0-10 s) for experimental comparison [14].

In the experiment, the error threshold is set to C = 2%, and the feature pose differences are compared to select the parts that do not meet the threshold. The degree of bending of the elbow in the 4-5 s interval and the swing amplitude of the left arm in the 6-7 s interval differ significantly from the standard movement. In this paper, the feature plane is used as the basic calculation plane, and three discriminative parameters are calculated. Figure 11 shows the comparison of the left arm movement posture differences.
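The threshold screening described above can be sketched as follows, using the Sim(V2, Vstand) column of the relative-error table; the dictionary literal simply transcribes those published values:

```python
# Relative errors per 1 s interval for Sim(V2, Vstand), in percent,
# transcribed from the relative-error table above.
errors = {
    '0-1 s': 1.32, '1-2 s': 0.65, '2-3 s': 0.05, '3-4 s': 0.22,
    '4-5 s': 0.13, '5-6 s': 1.41, '6-7 s': 5.65, '7-8 s': 0.82,
    '8-9 s': 1.02, '9-10 s': 0.15,
}

C = 2.0  # error threshold in percent, as set in the experiment
flagged = [t for t, e in errors.items() if e > C]
# For this parameter only the 6-7 s interval exceeds the threshold,
# matching the left-arm swing-amplitude deviation reported in the text.
```

The same one-line filter applied to the Corr(VLLarm, VLFarm) column flags the 4-5 s elbow deviation, so the two reported problem intervals fall out directly from Eq. (6).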

Fig. 11

Timing chart of left arm motion correlation parameters.

The traditional three-dimensional model similarity comparison method based on Euclidean distance directly compares the two sets of data. For dance moves with large amplitudes, its posture analysis results are often not accurate enough. People of different heights and builds, with different body proportions, introduce deviations in the displacement of the model itself, as shown in Figure 12. Without specifying the position of the person to be tested, the spatial displacement deviation is often too large, as shown in Figure 13.

Fig. 12

x-axis distance difference of a single feature point.

Fig. 13

Euclidean distance difference of a single feature point.

The experimental results verify that the motion pose analysis method based on the similarity matching of feature planes can clearly and efficiently detect differences from the norm in moving objects, has high robustness and provides scientific theoretical support for scientific dance training [15].

Conclusion

This paper proposes a human pose analysis method based on similarity matching between feature planes. First, an optical motion capture system is used to collect the movement sequences of dance performers in real time, obtain human skeleton data and establish a human motion model; then, feature plane similarity matching is used to calculate the correlation degree of the motion data; finally, the method is applied to analyse the posture of the moving human body in real time. Its application in dance teaching, comparing the form and consistency of teaching movements, is conducive to a computer-aided dance teaching mode and improves teaching quality [16]. The experimental data verify that the method has good accuracy for real-time analysis of human posture, is of great significance for the digital teaching of dance and lays a good foundation for the next step of retrieval of dance movement data.

eISSN:
2444-8656
Language:
English
Publication timeframe:
Volume Open
Journal Subjects:
Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics