Optimization of Wushu Sanshou Technical Movement Recognition and Training Based on Machine Learning Algorithm

Wushu Sanda, also known as Sanshou, is a major branch of Chinese Wushu, one of the valuable legacies of traditional Chinese culture, and embodies its national characteristics. The most important quality of Chinese martial arts is practicality, and Sanda is the ultimate embodiment of that practicality [1]. Wushu Sanda is a modern competitive sport in which two people confront each other unarmed, using kicking, striking, and wrestling as combat techniques, guided by the rules of competition and aimed at improving technical combat ability; it is a main component of Chinese Wushu. Since Wushu Sanda was first trialed as a sport, it has continually drawn on the essence of Chinese Wushu and borrowed and absorbed technical movements from the world's best combat sports, which has promoted its development. Its rapid development is mainly due to its strong vitality and appeal, which stem from its athletic, confrontational, and national character, and this has also laid the foundation for promoting Wushu Sanda curricula in colleges [2–4]. In 2008, Wushu Sanda became an ad hoc competition event of the Beijing Olympic Games, and its development accelerated further. This promoted the continuous improvement of Sanda techniques, tactics, and rules, and made Wushu Sanda an indispensable part of world combat sports. Wushu Sanda competitions are divided by segment level and weight level [5], but at every level the technical movements are the key to scoring. The fighting stance, footwork, punching, kicking, and wrestling are the basic skills of Wushu Sanda in both competition and regular training [6].
Sanda training skills can be summarized as eight words: fast, heavy, long, stable, alive, skillful, no, turn, accurate [7]. The diversity of technical movements in Wushu Sanda is a focus of attention in training, competition, and teaching, so it is important to recognize the movements precisely and to optimize how they are trained.

In this paper, the wavelet transform is used to denoise the captured action images. Modules for extracting static and dynamic human body features are designed. Building on existing skeletal point action recognition models, attention mechanisms (a self-attention mechanism and the CBAM attention module) are added and a continuous attention (CA) structure is designed to improve the model's recognition of global and local features. The experimental environment is then configured and a Wushu Sanshou action dataset is built following the dataset creation steps. The improved model is evaluated against comparison methods. Combining the key points of Wushu Sanshou movements with the principles of posture recognition, a core muscle stability training method for Sanshou movements is proposed. Finally, a teaching experiment is designed and the effect of the core muscle group stability training method on Sanshou movements is analyzed.

2

Overview

The use of technology in the Olympic Games has helped judges recognize and score athletes' movements. For example, artificial intelligence is used to recognize and evaluate wushu movements [8] and to score wushu sparring automatically, efficiently, and accurately [9]. Current methods for martial arts movement recognition include an IBS distance measurement CCNN model [10], a model combining bone point features with joint kinetic energy [11], an LSTM recurrent neural network algorithm [12], and multi-sensor data fusion for IoT environments [13]. In addition, Ref. [14] analyzed leg movements in sparring using a 3D image detection method, which can provide a reference for training. Similarly, Ref. [15] used 3D images to reconstruct the technical movement features of sparring to improve the level of training. Ref. [16] built a feature extraction model of Wushu sparring athletes' movements that directly analyzes their body features and movement characteristics to assist training. Training in martial arts sparring is thus becoming more scientific: computer vision, which integrates computer science, digital image processing, machine learning, physics, biology, and other fields, has been applied to martial arts movement recognition, posture estimation, technical movement analysis, and scientific training methods [17].

3

Machine Learning Based Sparring Posture Recognition

3.1

Image Preprocessing

Because the original images obtained by the acquisition system contain noise, extracting features directly from them reduces the accuracy of the subsequent recognition algorithm to some extent. The images therefore need to be denoised before feature extraction. In this paper, the relatively simple wavelet transform method is used to remove the Gaussian noise contained in the images [18–19]. Let the noisy original image signal be: $y_{i} = f(t_{i}) + e_{i}, \quad i = 1, 2, \dots, n$ (1) where $e_{i}$ is the noise value and $f(t_{i})$ is the original uncontaminated signal. Eliminating the noise requires computing an estimate $\hat{f}(\cdot)$ of $f$, so let $c_{0} = y_{i}$ be the initial sequence of the noisy signal. The multilevel decomposition of $c_{0}$ by the orthogonal wavelet transform proceeds as follows: $\begin{aligned} c_{j+1} &= D_{e} H c_{j} \\ d_{j+1} &= D_{e} G c_{j} \end{aligned}$ (2) where $c_{j+1}$ is the approximation signal, $d_{j+1}$ is the detail signal, $H$ is the low-pass filter, $G$ is the high-pass filter, and $D_{e}$ is the sampling operator. After decomposition, the approximation signal $c_{j}$ and the detail signals $d_{1}, d_{2}, \cdots, d_{j}$ are obtained, and the detail signals are estimated as follows: $\hat{d}_{j} = \begin{cases} \bar{d}_{j}, & 1 \leq j \leq j_{0} \\ d_{j}, & j_{0} < j \leq J + 1 \end{cases}$ (3) where $j_{0}$ denotes the truncation parameter of the low-resolution image, and the value of $\bar{d}_{j}$ is obtained by wavelet thresholding. With $Q$ denoting the threshold, $d_{j}$ can be thresholded in two ways, hard thresholding and soft thresholding. Hard thresholding is: $\bar{d}_{j} = \begin{cases} d_{j}, & |d_{j}| \geq Q \\ 0, & \text{otherwise} \end{cases}$ (4)

Soft thresholding can be expressed as follows: $\bar{d}_{j} = \begin{cases} \operatorname{sgn}(d_{j}) (|d_{j}| - Q), & |d_{j}| \geq Q \\ 0, & \text{otherwise} \end{cases}$ (5)

The magnitude of the threshold $Q$ depends on the standard deviation $\sigma$ of the noise, which is estimated from the first-level detail signal as follows: $\sigma = \operatorname{median}(|d_{1}|) / 0.6745$ (6)

By reconstructing the signal from the approximation signal and the thresholded detail signals, an estimate of the original signal is obtained, which completes the denoising of the image.
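The decomposition, thresholding, and reconstruction steps can be illustrated with a minimal one-dimensional sketch. It is not the implementation used in this paper: it assumes a single-level Haar wavelet, an even-length input, and the universal threshold $Q = \sigma \sqrt{2 \ln n}$, none of which are specified in the text. The function names are hypothetical.

```python
import math
import statistics


def haar_decompose(c):
    """One level of the orthogonal Haar transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(c[2 * k] + c[2 * k + 1]) / s for k in range(len(c) // 2)]
    detail = [(c[2 * k] - c[2 * k + 1]) / s for k in range(len(c) // 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def hard_threshold(d, q):
    """Eq. (4): keep coefficients with |d| >= Q, zero the rest."""
    return [x if abs(x) >= q else 0.0 for x in d]


def soft_threshold(d, q):
    """Eq. (5): shrink surviving coefficients toward zero by Q."""
    return [math.copysign(abs(x) - q, x) if abs(x) >= q else 0.0 for x in d]


def denoise(signal, mode="soft"):
    """Single-level wavelet denoising of an even-length 1D signal."""
    approx, detail = haar_decompose(signal)
    # Eq. (6): noise estimate from the first-level detail coefficients.
    sigma = statistics.median([abs(x) for x in detail]) / 0.6745
    # Assumption: universal threshold; the paper only says Q depends on sigma.
    q = sigma * math.sqrt(2.0 * math.log(len(signal)))
    shrink = soft_threshold if mode == "soft" else hard_threshold
    return haar_reconstruct(approx, shrink(detail, q))
```

For an image, the same thresholding would be applied to the detail sub-bands of a 2D transform over several levels; a wavelet library would normally replace the hand-written Haar step.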

3.2

Human body static feature extraction

The human skeleton carries information about both the shape of the limbs and their structural topology. In this paper, the movement of the knee joint, hip joint, elbow joint, and torso is studied. The degrees of freedom of the hip and shoulder joints are set to 3, and the degrees of freedom of the elbow and knee joints are set to 1. Each shoulder joint can be treated as the center of an enclosing sphere; because its range of motion is limited by the structure of the human body, the two shoulder joints can be represented in local spherical coordinates. Spherical coordinates contain three variables: elevation angle, azimuth angle, and radial distance. However, because the human body is scale invariant during movement, i.e., the feature values are affected by neither the position held nor the radial distance, only the two angles are retained as features.

In the local spherical coordinate system, the vector of the frontal direction of the human body is defined as Or and the vector of the limb pointing direction as Ov. Their projections onto the plane L are Or′ and Ov′, respectively. The azimuth angle θ is the angle formed by rotating Or′ counterclockwise to Ov′, and ϕ is the angle between Ov and ON. From this, the angles of motion of the hip joints and the shoulder joints are: $\mathrm{Shoulder/Hip} = (\theta, \phi)$ (7)

Compared with the other joints, the knee and elbow joints have fewer degrees of freedom, and it is only necessary to give the position of the plane in which their joint angles lie. The angles of the knee and elbow joints are defined as follows: $\mathrm{Elbow/Knee} = \{\alpha\}$ (8)

In addition to the limbs, the movement of the torso during the performance of martial arts movements is also defined. The rotation angle of the trunk is described by the frontal orientation of the body. For bending and side rotation, however, the angles between the torso and the vertical axis in the coronal and sagittal planes must be obtained accurately to express the feature. The two angles are given by Eq. (9): $Sa = \{\beta\}, \quad Co = \{\gamma\}$ (9)

where Sa is the torso angle in the sagittal plane and Co is the torso angle in the coronal plane. After the joints of the torso and of each limb have been defined, the joint angles of the human body in a static posture can be characterized as follows: $\mathrm{StaticFeature} = \{RS, LS, RE, LE, RH, LH, RK, LK, Sa, Co\}$ (10)

where L and R represent left and right respectively, S is the shoulder joint, E is the elbow joint, H is the hip joint, and K is the knee joint. At this point, StaticFeature is defined as a 16-dimensional feature vector, but a suitable combination of features can also be chosen according to the actual research objectives.
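As a rough illustration of how such angle features might be computed from 3D joint coordinates, the sketch below derives a one-degree-of-freedom joint angle α from three joint positions and a (θ, ϕ) pair from a limb direction vector. The local frame is an assumption made for the example: the x axis is taken as the frontal direction Or and the z axis as ON. The function names and sample coordinates are hypothetical.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b between segments b->a and b->c,
    e.g. shoulder-elbow-wrist for the elbow angle alpha."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    cos_angle = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
    return math.degrees(math.acos(cos_angle))


def limb_direction(v):
    """(theta, phi) in radians for limb vector v in the assumed local frame."""
    x, y, z = v
    theta = math.atan2(y, x) % (2.0 * math.pi)  # counterclockwise from the x axis
    phi = math.acos(z / math.sqrt(x * x + y * y + z * z))  # angle to the z axis
    return theta, phi


# Hypothetical joint positions in metres.
shoulder, elbow, wrist = (0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (0.3, 0.25, 0.0)
alpha = joint_angle(shoulder, elbow, wrist)
```

Both functions depend only on directions, so scaling all coordinates by a common factor leaves the angles unchanged, which matches the scale invariance noted above.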

3.3

Human body dynamic feature extraction

The edge feature segmentation method is used to express the visual features of the martial arts sparring whip-leg action, and an edge rotation feature analysis model of the visual image of the whip-leg action is established. The visual distribution function of the whip-leg action is obtained as: $g = k \otimes f + n$ (11)

where $\otimes$ denotes the morphological filter operator. The collected visual images of the martial arts sparring whip-leg action are fused to establish the feature distribution set of the action, and the connected distribution set of the whip-leg action is obtained as: $s_{PPM}(t) = \sum_{i = -\infty}^{\infty} \sum_{j = 0}^{N_{r} - 1} p(t - i T_{s} - j T_{p} - c_{j} T_{c} - a_{t} \varepsilon)$ (12) $s_{PAM}(t) = \sum_{j = -\infty}^{\infty} d_{j} p(t - j T_{n})$ (13)

Here $T_{s}$ in Eq. (12) is the amount of regional edge localization features. The watershed image segmentation method is used to reconstruct the Wushu sparring whip-leg action, and the dynamic feature decomposition model is established as: $x(t) = \sum_{m = 1}^{M} \sum_{k = 1}^{K(m)} w_{mk} s(t - T_{m} - \tau_{mk}) + v(t)$ (14) where $w_{mk}$ is the edge feature component of the martial arts sparring whip-leg action. Using the mathematical morphology method, the three-dimensional modal output of the visualization of the whip-leg action is obtained as: $\begin{cases} x = R \sin \eta \cos \phi, & 0 \leq \phi \leq 2\pi \\ y = R \sin \eta \sin \phi, & 0 \leq \eta \leq \pi \\ z = R \cos \eta, & R = D / 2 \end{cases}$ (15)

where $\eta$ denotes the visual segmentation function of the martial arts sparring whip-leg action and $\phi$ denotes its visual segmentation coefficient. The high-resolution reconstruction of the visualization of the whip-leg action is output using a suitable fusion rule: $\begin{aligned} D_{(i+1)} &= B_{(i+1)} C_{(i+1)} = B_{(i)} C_{(i)} - \beta_{i+1}^{-1} (B_{(i)} C_{(i)} w_{i+1}) \frac{w_{i+1}^{T} \lambda_{i}^{-1} C_{(i)}}{\lambda_{i}^{-1} \beta_{i+1}^{-1} w_{i+1}^{T} C_{(i)} w_{i+1} + 1} \\ &\quad + \beta_{i+1}^{-1} x_{i+1} (\lambda_{i}^{-1} w_{i+1}^{T} C_{(i)}) - \beta_{i+1}^{-1} x_{i+1} \frac{\beta_{i+1}^{-1} \lambda_{i}^{-1} w_{i+1}^{T} C_{(i)} w_{i+1}^{-1}}{\beta_{i+1}^{-1} \lambda_{i}^{-1} w_{i+1}^{T} C_{(i)}^{T} w_{i+1}^{T} + 1} w_{i+1}^{T} \lambda_{i}^{-1} \end{aligned}$ (16)

The empirical wavelet transform method is used to represent the movement of the martial arts Sanda whip leg, and the image region of the whip-leg action is divided into M × N sub-blocks $G_{m,n}$ of size 2 × 2. The modular-area pixel reorganization method is then combined to produce the visual representation of the Wushu Sanshou whip-leg movement.

3.4

Wushu sparring action recognition model based on dual attention mechanism

3.4.1

Action Recognition Model Improvement Ideas

Skeleton action recognition models are increasingly used in real-life scenarios, especially in competitive sparring, where a wrong or missed judgment may affect the final result of a match. This is unfair to the athletes, so higher requirements are placed on the accuracy of the action recognition model. This paper focuses on improving the action recognition network in the skeletal point action recognition model. To improve accuracy, an action recognition model based on a dual attention mechanism is designed and a corresponding action recognition network is proposed.

3.4.2

Design of the general framework of the model

The overall framework of the dual-attention based action recognition model for martial arts sparring proposed in this paper consists of a human body detector, a human body pose estimator, a heat map generation operation, a heat map stacking operation, and an action recognition network based on the ResNet50 architecture.

The human body detector is Faster-RCNN and the human pose estimator is DWPose. The coordinate triplets storing the skeletal point information are converted into Gaussian maps that form the heat maps. The heat maps are then stacked along the time dimension. The action recognition network is the improved network proposed in this paper, which combines self-attention and CBAM attention modules on the ResNet50 network.

The model first labels the regions of the human body using the Faster-RCNN detector, then extracts the data of 17 skeletal keypoints with the pose estimator DWPose and saves them as coordinate triplets. The heat maps are obtained by Gaussian mapping: the K joint-centered Gaussian maps are combined to obtain the joint heat map J, as shown in Eq. (17): $J_{kij} = e^{- \frac{(i - x_{k})^{2} + (j - y_{k})^{2}}{2 \sigma^{2}}} \cdot c_{k}$ (17)

Here $\sigma$ controls the variance of the Gaussian map, while $(x_{k}, y_{k})$ and $c_{k}$ are the horizontal and vertical coordinates and the confidence score of the k-th joint, respectively. Similarly to the joint heat map J, a limb heat map L is created as shown in Eq. (18): $L_{kij} = e^{- \frac{D((i, j), seg[a_{k}, b_{k}])^{2}}{2 \sigma^{2}}} \cdot \min(c_{a_{k}}, c_{b_{k}})$ (18)

$L_{kij}$ represents limb k between the two joints $a_{k}$ and $b_{k}$. The function D computes the distance from the point $(i, j)$ to the segment $[(x_{a_{k}}, y_{a_{k}}), (x_{b_{k}}, y_{b_{k}})]$. Finally, the joint heat map and the limb heat map are combined (L + J).

To feed the heat maps into the action recognition network, they are stacked into a 3D heat map volume, so that spatial and temporal relations are fully utilized.
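The joint and limb heat maps of Eqs. (17) and (18) can be sketched in a few lines, with the limb running between joints $a_{k}$ and $b_{k}$. This is an illustrative sketch only: the grid size, σ, coordinates, and confidence values are hypothetical, and maps are stored as nested lists indexed `[j][i]` (row j is the vertical coordinate, column i the horizontal one).

```python
import math


def joint_heatmap(height, width, x_k, y_k, c_k, sigma=1.0):
    """Gaussian centred on joint (x_k, y_k), scaled by its confidence c_k."""
    return [[math.exp(-((i - x_k) ** 2 + (j - y_k) ** 2) / (2.0 * sigma ** 2)) * c_k
             for i in range(width)]
            for j in range(height)]


def point_segment_distance(p, a, b):
    """Distance D from point p to the segment with end points a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:  # degenerate limb: both joints coincide
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def limb_heatmap(height, width, a, b, c_a, c_b, sigma=1.0):
    """Gaussian of the distance to limb a-b, scaled by the lower confidence."""
    conf = min(c_a, c_b)
    return [[math.exp(-point_segment_distance((i, j), a, b) ** 2
                      / (2.0 * sigma ** 2)) * conf
             for i in range(width)]
            for j in range(height)]


J = joint_heatmap(5, 5, x_k=2, y_k=3, c_k=0.9)
L = limb_heatmap(5, 5, a=(1, 1), b=(3, 1), c_a=0.8, c_b=0.6)
```

Stacking one such map per joint (or limb) for every frame along a time axis gives the 3D heat map volume passed to the recognition network.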

3.4.3

Sparring action recognition network

1)

Introduction of self-attention mechanism

Two self-attention layers are introduced in the proposed network, placed after the CBAM1 and CBAM2 modules. They compute the attention weights between the elements of a sequence in order to capture the dependencies between elements.

The multi-head self-attention mechanism represents features in multiple subspaces by using different parameters in each head, which allows it to capture richer feature information. Define the input of the multi-head self-attention mechanism as $X \in R^{n \times d}$, where d denotes the encoding dimension. The whole process can be represented as: $Q_{i} = X W_{q} \in R^{n \times d}, \quad K_{i} = X W_{k} \in R^{n \times d}, \quad V_{i} = X W_{v} \in R^{n \times d}$ (19) $head_{i} = \operatorname{softmax}\left(\frac{Q_{i} K_{i}^{T}}{\sqrt{d}}\right) V_{i}$ (20) $MSA(X) = \operatorname{concat}(head_{1}, head_{2}, \dots, head_{n}) W_{o}$ (21)

where $Q_{i}$, $K_{i}$, and $V_{i}$ denote the results of the linear transformations of the input by the i-th head. $W_{q}$, $W_{k}$, and $W_{v}$ are the weight parameters of the Query, Key, and Value mappings, respectively, which map the input to the d-dimensional output; concat denotes the concatenation operation, and $W_{o}$ is the weight matrix of the final linear transformation.
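A single attention head can be written out in plain Python to make the computation concrete. The sketch uses the standard scaled dot-product form, softmax(QKᵀ/√d)V, on tiny hand-written matrices; the helper names and the identity weight matrices are assumptions for illustration, not the trained parameters of the proposed network.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax_rows(m):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    out = []
    for row in m:
        mx = max(row)
        exps = [math.exp(v - mx) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def attention_head(x, w_q, w_k, w_v):
    """One head: softmax(Q K^T / sqrt(d)) V with Q = X W_q, K = X W_k, V = X W_v."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, k_t)]
    return matmul(softmax_rows(scores), v)


identity = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical weights for the example
tokens = [[1.0, 0.0], [0.0, 1.0]]    # n = 2 elements, d = 2
head = attention_head(tokens, identity, identity, identity)
```

A multi-head layer would run several such heads with separate weights, concatenate their outputs along the feature axis, and multiply by $W_{o}$.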

The feedforward neural network (FFN) performs a nonlinear transformation of the output of the self-attention mechanism. It acts independently at each position and introduces nonlinearity through an activation function (usually ReLU): $FFN(MSA(X)) = \operatorname{ReLU}(MSA(X) \cdot W_{1} + b_{1}) \cdot W_{2} + b_{2}$ (22) where $W_{1}$, $W_{2}$, $b_{1}$, and $b_{2}$ denote the weights and biases of the feedforward neural network.

Each sub-layer (self-attention mechanism and feedforward neural network) is followed by a residual connection with layer normalization, which enhances the stability of model training. The whole process is as follows: $F = \operatorname{LayerNorm}(FFN(\operatorname{LayerNorm}(MSA(X) + X)) + X)$ (23)

The introduction of the self-attention mechanism can help the model automatically learn which information is relevant and which is redundant, and reduce redundant information, thus improving the generalization ability of the model and reducing the risk of overfitting.

2)

Introduction of CBAM attention mechanism

In the proposed network structure CBAM is introduced after Layer2, Layer3, Layer4 and the global maximum pooling layer. CBAM helps the model to be able to extract and utilize the information more efficiently during deep learning by weighting the output feature maps of each layer channel-wise and spatially respectively [20].

3)

Continuous Attention Structure Design

The self-attention mechanism and the CBAM module, introduced to improve the accuracy of the model, are not simply stacked in the network. Instead, a structure similar to residual connections is designed, named the Continuous Attention (CA) structure.

The proposed continuous attention mechanism is inspired by the spatial attention module of the Convolutional Block Attention Module (CBAM), which uses the max-pooled output and the average-pooled output to compute the spatial attention map. The CA module can be defined as: $Attn^{i} = M^{i}(F_{conv}^{i}, Attn^{i-1}) = \sigma(f_{1 \times 1}[P_{Mc} F_{conv}^{i} : P_{Ac} F_{conv}^{i}]) \otimes P_{M} Attn^{i-1}$ (24) where $M^{i}$ is the i-th CA module, $F_{conv}^{i}$ denotes the features extracted from the first two convolutional layers of layer i, which are the inputs to the CA module, $Attn^{i-1}$ is the attention map of layer (i – 1), $P_{Mc} F_{conv}^{i}$ and $P_{Ac} F_{conv}^{i}$ are the max-pooled and average-pooled features, respectively, and $P_{M}$ denotes the max pooling operation applied to the attention map of layer (i – 1) so that it matches the size of the current layer's attention map.

4

Wushu Sanshou action recognition performance analysis

4.1

Experimental environment configuration

To keep the experiments fair and valid, the training environment and the testing environment must be consistent; only when all methods are deployed in the same environment can the comparison with other algorithms be meaningful. In this classification experiment, the training dataset and the testing dataset are therefore processed in the same environment, whose configuration is shown in Table 1.

Table 1.

Experimental environment configuration

Item	Configuration
CPU	Intel® Core™ i7-9750H @ 2.60 GHz
Memory	64.0 GB
Python	3.6
PyTorch	1.9.1

This experiment does not use a GPU to accelerate training, so CUDA and cuDNN are not needed. The experiments also use open-source libraries such as dataloader, tensorboard, and pylab.

4.2

Data set acquisition

This work targets real-time recognition of martial arts sparring movements. Current research provides no martial-arts-related dataset, so a martial arts movement dataset has to be collected manually.

Wushu actions are a specific kind of action that does not occur in daily life, so a dedicated martial arts action dataset must be established. A Wushu Sanshou routine contains repeated movements, which are split by movement decomposition. The source videos for the dataset are standard instructional videos of martial arts sparring movements, in which the martial arts teacher gives a standardized demonstration of each sparring movement and explains its decomposition.

The detailed steps of dataset collection are as follows: STEP1:

Each standard teaching video is cut into short clips of 1-3 seconds according to the martial arts sparring movements, at a frame rate of 60 FPS. Different martial arts teachers, different viewpoints, and different movement durations are included to ensure the diversity of the dataset.

STEP2:

The extracted short videos are fed into the human posture recognition network for joint point recognition, and the joint point data of the wushu teacher in each frame are obtained and saved. The data are then scaled and standardized to give the processed standard coordinate data, in which joint points that cannot be recognized because they are occluded are filled with 0.

STEP3:

The data are screened and saved to obtain the final martial arts movement training set. The validation set is obtained in the same way. In total, more than 1800 sparring action samples are collected for the training dataset.
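The scaling, standardization, and zero-filling in STEP2 could look roughly like the sketch below. The exact normalization used in this paper is not specified, so the choices here are assumptions: coordinates are divided by the frame size, standardized per frame with the mean and standard deviation of the recognized joints, and unrecognized joints (represented as `None`) are set to (0, 0). The function name and sample values are hypothetical.

```python
import statistics


def normalize_keypoints(keypoints, width, height):
    """Scale, standardize, and zero-fill the joints of one frame.

    keypoints: list of (x, y) pixel coordinates, or None where the joint
    could not be recognized (for example because it is occluded).
    """
    scaled = [None if p is None else (p[0] / width, p[1] / height)
              for p in keypoints]
    seen = [p for p in scaled if p is not None]
    if not seen:  # nothing recognized in this frame
        return [(0.0, 0.0)] * len(keypoints)
    xs = [p[0] for p in seen]
    ys = [p[1] for p in seen]
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    std_x = statistics.pstdev(xs) or 1.0  # avoid division by zero
    std_y = statistics.pstdev(ys) or 1.0
    return [(0.0, 0.0) if p is None
            else ((p[0] - mean_x) / std_x, (p[1] - mean_y) / std_y)
            for p in scaled]


frame = [(100, 200), None, (300, 400)]  # second joint not recognized
features = normalize_keypoints(frame, width=400, height=800)
```

Because recognized joints can also land on 0 after standardization, a separate confidence or mask channel may be worth keeping alongside the coordinates.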

4.3

Training strategies

In order to make the action recognition time as short as possible, the method of classifying the skeletal point data is used for Wushu sparring action recognition.

The collected dataset is therefore used to evaluate the dual-attention Wushu Sanshou action recognition model proposed in this paper alongside three kinds of neural networks: DNN, CNN, and RNN. No pre-trained weights are used in this experiment; training starts from scratch. The initial parameter settings of the martial arts sparring action recognition model are shown in Table 2.

Table 2.

Initial parameter settings of the martial arts action recognition model

Parameter	Set value
Learning strategy	Step
Initial learning rate	0.00001
Batch size	6
Optimizer	Adam
Iteration number	500

To ensure the fairness of the experiment, the initial parameter settings of the four models are identical. After each training run, the initial learning rate is adjusted and training is repeated until the best experimental results are obtained.
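The settings in Table 2 can be collected in a small configuration, and the "Step" learning strategy can be illustrated as a piecewise-constant decay. Table 2 does not give the step interval or the decay factor, so `step_size=100` and `gamma=0.1` below are placeholder assumptions, and the function name is hypothetical.

```python
# Values taken from Table 2.
CONFIG = {
    "learning_strategy": "step",
    "initial_lr": 1e-5,
    "batch_size": 6,
    "optimizer": "Adam",
    "epochs": 500,
}


def step_learning_rate(epoch, initial_lr=CONFIG["initial_lr"],
                       step_size=100, gamma=0.1):
    """Learning rate at a given epoch under step decay.

    step_size and gamma are assumed values, not settings reported in the paper.
    """
    return initial_lr * gamma ** (epoch // step_size)


schedule = [step_learning_rate(e) for e in range(CONFIG["epochs"])]
```

In PyTorch the same behaviour is available through `torch.optim.lr_scheduler.StepLR`, which takes `step_size` and `gamma` arguments.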

4.4

Experimental results and analysis

Based on the given training strategy, a total of 500 epochs are trained with the same parameters except for the difference in the training network, and the training results obtained are shown in Figure 1.

The results show that accuracy on the test dataset is high. However, although the learning rate, batch size, and other parameters were adjusted repeatedly, the loss fluctuated considerably during training. Analysis indicates that the main cause is overfitting: the dataset used in this classification experiment was collected manually by the authors and contains few samples. Because the particular nature of martial arts means that few related datasets exist, later experiments can enlarge the dataset.

After training the training dataset, the weights obtained from the training are used to recognize the martial arts moves. The input videos are martial arts videos of individual students provided by a school.

The FLOPs metric is often used to evaluate recognition efficiency in video action recognition. FLOPs is the number of floating-point operations, an important measure of a model's computational complexity and of the amount of computation it requires. One GFLOP is one billion floating-point operations. In general, the higher the GFLOPs, the greater the computational complexity of the model and the more computational resources and time are needed for inference. In applications with limited resources or strict real-time requirements, a lower GFLOPs value is therefore preferable, and a more computationally efficient model can be chosen. In this paper, the average GFLOPs needed to obtain a recognition result per video, denoted GFLOPs/V, is used as the metric for the model's efficiency in recognizing video actions.
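The GFLOPs/V metric is a simple average, shown here as a short sketch. The per-video operation counts are made-up numbers for illustration, and the function name is hypothetical.

```python
def gflops_per_video(flops_per_video):
    """Average floating-point operations per video, expressed in GFLOPs."""
    return sum(flops_per_video) / len(flops_per_video) / 1e9


# Hypothetical operation counts for two videos.
example = gflops_per_video([2e9, 4e9])
```

Per-video counts differ when inference can stop early on some videos, which is why the metric is averaged over videos rather than reported once per model.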

In order to verify the superiority of the action recognition method with the dual attention mechanism proposed in this paper, it was compared with mainstream human action classification methods in the same experimental environment. The classification accuracy as well as the GFLOPs/V index values in the school martial arts sparring sports dataset are reported respectively, and the statistics of the experimental results are shown in Table 3.

Table 3.

Experimental results

Action classification method	mAP	GFLOPs/V
ST-GCN	87.63%	55.17
AGCN	89.01%	54.32
PoseC3D	93.54%	36.85
MS-G3D	90.37%	42.01
OURS	95.22%	29.35

Owing to the dual-attention mechanism, the network design eliminates the effect of redundant information on the action recognition results, and it uses a recognition termination strategy network to decide dynamically whether the model has already reached an action classification result, at which point inference is terminated and recognition efficiency improves. Compared with mainstream models of the same type on the school-provided Wushu Sanshou sports dataset, the model in this paper achieves a mean average precision (mAP) of 95.22% and a GFLOPs/V value of 29.35. This indicates that introducing the CBAM attention mechanism substantially improves recognition efficiency while also achieving better recognition accuracy.

5

Action training based on sparring action recognition models

5.1

Core Muscle Stability Training Model

Human balance refers to the ability of the human body to maintain its own stability, including the ability to hold a given posture and the ability to regulate the body to stay balanced when subjected to external forces; it is one of the important physiological functions of the human body. Balance depends not only on whether the body structure is complete and symmetrical, but also on the coordination of many factors, including the vestibular organs, the visual organs, the body's receptors, the brain's sense of balance, and body tension. Balance therefore reflects the interplay between the body's internal physiological organs and its limbs, as well as the overall control of the body. Flexibility and suppleness also affect balance ability and the stability of the body.

Core stability is a comprehensive quality that is particularly prominent in sports, especially in events with difficult movements. The development of modern competitive wushu places increasingly high demands on athletes' performance, and hence on their core stability, so that errors in difficult movements during competition are reduced and the chance of winning increases.

Balance ability is used as a selection index in skill-dominant events that emphasize difficulty and aesthetics, such as gymnastics and diving. In wushu, balance ability is likewise a major concern for coaches, and its level has a great impact on athletes' performance in competition.

To sum up, Wushu is a skill-driven sport whose routine movements are highly variable, requiring constant changes in the center of gravity and direction of movement as well as a certain degree of imbalance and difficulty. In addition to basic athletic qualities, Wushu requires good stability, balance, flexibility, explosive power, and the ability to recover quickly. The core muscles play an important role in developing these qualities. Drawing on the analysis results of the dual-attention Wushu Sanshou movement recognition model proposed in the previous section, a Wushu Sanshou training method focusing on core training is proposed here.

5.2

Training design

This experiment used a group comparison design to compare and analyze the test results of the two groups of subjects before and after the experiment. During the 10-week training experiment, the two groups followed different training programs, and the relevant index data of both groups were recorded before and after the experiment.

1)

Experiment time and location

The experiment ran from May 6, 2024 to July 13, 2024, 10 weeks in total, with two sessions per week of 40-50 minutes each. The first and last weeks were used to collect the pre-test and post-test data, respectively.

2)

Selection of experimental subjects

Fifty-six students specializing in martial arts for the physical education college entrance examination (35 males and 21 females) were selected as experimental subjects. Before the experiment, all subjects underwent a physical health examination to ensure that they had had no sports injuries in the past month, that all psychological and physiological indexes met the requirements of the experiment, and that they voluntarily accepted the experimental intervention in a normal physical condition.

The 56 students were randomly divided into a control class and an experimental class of 28 students each. The basic information of the two groups and their pre-test indicators were tested for differences; no significant difference was found between the two groups, so the experiment could proceed.

3)

Experimental equipment and apparatus

Swiss ball, also known as “fitness ball”, is increasingly widely used in physical exercise and sports rehabilitation. In training, the Swiss ball exercise can create a kind of unstable movement state, which plays a great role in improving the muscle movement ability and improving the stability of joints. It can exercise the muscle groups of the chest, abdomen, back, buttocks, legs and other parts of the body, which is very helpful in maintaining body balance, improving posture and preventing sports injuries. At present, it is widely used in competitive sports and plays a great role in improving the strength and stability of athletes.

4)

Experimental design

Both groups trained twice a week for 40-50 minutes per session and followed the same training content. The experimental group performed core training on the Swiss ball, whereas the control group performed core training on a smooth, flat surface.

Before and after the experiment, both groups were tested for core stability and for static and dynamic balance ability. The test results were processed statistically to analyze the differences between the two groups and between the pre-test and post-test, in order to study the effect of unstable-surface training on the core stability and balance ability of students in the wushu specialty class for the physical education college entrance examination.
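The source does not state which statistical test produced the p-values reported in the results. As a minimal sketch, assuming an unequal-variance (Welch) two-sample t-test for a between-group comparison, the test statistic and its degrees of freedom can be computed from each group's mean, standard deviation and size. The function name `welch_t` and the input numbers are hypothetical and are not taken from the study.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical summary statistics for two groups of 28 subjects.
t_stat, dof = welch_t(10.0, 2.0, 28, 11.5, 2.5, 28)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

The p-value follows from the t distribution with `dof` degrees of freedom; the standard library has no t-distribution CDF, so a statistics package would be needed for that last step.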

5.3

Training results and analysis

5.3.1

Dynamic stabilization component

The between-group comparison of time to stabilization in the vertical direction (TTS Fz, in msec) is shown in Table 4. Before training, TTS Fz did not differ significantly between the experimental and control groups (p>0.05), indicating that the two groups started from comparable initial values. After training, the difference between the two groups was significant (p<0.05), indicating that, after core muscle stability training for Wushu Sanda movements, the students in the experimental group became more stable when performing those movements.

Table 4.

Between-group comparison of time to stabilization in the vertical direction, TTS Fz (msec)

		Experimental group	Control group	p
Intergroup comparison	Pre-training	796.65 ± 163.21	623.51 ± 255.18	0.14
Intergroup comparison	Post-training	701.52 ± 211.63	773.09 ± 304.21	0.037
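The source does not define how time to stabilization is extracted from the force signal. One common convention, assumed here purely for illustration, takes it as the time after landing from which the force stays inside a tolerance band around its resting value. The function name `time_to_stabilization`, the 5% band and the sample trace below are all hypothetical.

```python
def time_to_stabilization(force, sample_rate_hz, target, tolerance):
    """Return the time (msec) from which |force - target| stays <= tolerance.

    Returns None if the signal never settles inside the band.
    """
    last_outside = -1
    for i, value in enumerate(force):
        if abs(value - target) > tolerance:
            last_outside = i  # remember the latest sample outside the band
    settle_index = last_outside + 1
    if settle_index >= len(force):
        return None  # the final sample is still outside the band
    return settle_index * 1000.0 / sample_rate_hz


# Hypothetical vertical force trace (N) sampled at 1000 Hz after landing.
body_weight = 700.0
trace = [900, 500, 760, 690, 705, 698, 701, 700]
tts = time_to_stabilization(trace, 1000, body_weight, 0.05 * body_weight)
print(tts)
```

Under this convention a shorter time means the subject regained a stable stance sooner.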

The within-group comparison of TTS Fz is shown in Table 5. In the experimental group, TTS Fz differed significantly before and after training (p=0.003), whereas the change in the control group was not significant (p=0.109).

Table 5.

Within-group comparison of time to stabilization in the vertical direction, TTS Fz (msec)

		Pre-training	Post-training	p
Group comparison	Experimental group	796.65 ± 163.21	701.52 ± 211.63	0.003
Group comparison	Control group	623.51 ± 255.18	773.09 ± 304.21	0.109
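A within-group comparison uses the same subjects before and after training, so the measurements are matched. The source does not name the test used; as a minimal sketch, assuming a paired-samples t-test, the statistic is the mean of the per-subject differences divided by its standard error. The function name `paired_t` and the five pre/post values are hypothetical, not study data.

```python
import math
import statistics


def paired_t(pre, post):
    """Return (t, df) for a paired-samples t-test on matched pre/post scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [after - before for before, after in zip(pre, post)]
    sd_diff = statistics.stdev(diffs)  # sample SD of the differences
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = statistics.mean(diffs) / (sd_diff / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical times (msec) for five subjects before and after training.
pre_scores = [800, 760, 820, 790, 805]
post_scores = [720, 700, 790, 730, 750]
t_stat, dof = paired_t(pre_scores, post_scores)
print(f"t = {t_stat:.3f}, df = {dof}")
```

Because the test works on per-subject differences, it cannot be reproduced from group means and standard deviations alone; the raw paired measurements are required.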

The between-group comparison of time to stabilization in the anterior-posterior direction (TTS Fx) is shown in Table 6. Before training, TTS Fx did not differ significantly between the experimental and control groups (p>0.05), indicating that the initial performance of the two groups was comparable. After core muscle stability training, the difference between the groups was significant (p<0.05), indicating that the training had a significant effect on the experimental group's TTS Fx during single-leg balance after landing from an aerial flying kick.

Table 6.

Between-group comparison of time to stabilization in the anterior-posterior direction, TTS Fx (msec)

		Experimental group	Control group	p
Intergroup comparison	Pre-training	1262.32 ± 502.31	1265.07 ± 412.25	0.212
Intergroup comparison	Post-training	1426.78 ± 336.29	1323.16 ± 362.68	0.007

The within-group comparison of TTS Fx is shown in Table 7. In the experimental group, TTS Fx differed significantly before and after the experiment (p=0.012), whereas the change in the control group was not significant (p=0.053).

Table 7.

Within-group comparison of time to stabilization in the anterior-posterior direction, TTS Fx (msec)

		Pre-training	Post-training	p
Group comparison	Experimental group	1262.32 ± 502.31	1426.78 ± 336.29	0.012
Group comparison	Control group	1265.07 ± 412.25	1323.16 ± 362.68	0.053
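As a worked example of the arithmetic behind Table 7, the relative change in each group's mean can be computed directly from the tabulated values. This is a descriptive calculation only and says nothing about statistical significance; the helper name `percent_change` is an assumption made for illustration.

```python
def percent_change(pre_mean, post_mean):
    """Relative change from pre-test to post-test mean, in percent."""
    return (post_mean - pre_mean) / pre_mean * 100.0


# Mean TTS Fx values (msec) as listed in Table 7: (before, after training).
table7_means = {
    "Experimental group": (1262.32, 1426.78),
    "Control group": (1265.07, 1323.16),
}

changes = {
    group: percent_change(pre, post)
    for group, (pre, post) in table7_means.items()
}
for group, pct in changes.items():
    print(f"{group}: {pct:+.2f}%")
```

The same helper applies unchanged to the mean values in the other within-group tables.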

5.3.2

Static stabilization section

The between-group comparison of total center of pressure (COP) offset (in cm) is shown in Table 8. Before training, the total COP offset did not differ significantly between the two groups (p>0.05), indicating comparable initial performance. After training, the difference between the two groups was significant (p<0.05), showing that their performance diverged after the adjustment based on Wushu Sanshou movement recognition and the core muscle stability training. The experimental group showed a significant change (p<0.05) after training in total COP offset during single-leg balance.

Table 8.

Between-group comparison of total center of pressure offset (cm)

		Experimental group	Control group	p
Intergroup comparison	Pre-training	143.06 ± 12.21	141.08 ± 22.67	0.064
Intergroup comparison	Post-training	154.17 ± 25.04	141.93 ± 25.81	0.012

The within-group comparison of total COP offset is shown in Table 9. The experimental group showed a significant difference in total COP offset before and after the experiment (p=0.038), while the control group did not (p=0.535).

Table 9.

Within-group comparison of total center of pressure offset (cm)

		Pre-training	Post-training	p
Group comparison	Experimental group	143.06 ± 12.21	154.17 ± 25.04	0.038
Group comparison	Control group	141.08 ± 22.67	141.93 ± 25.81	0.535

The maximal excursion of the COP in the anterior-posterior direction (Maximal excursion X) is shown in Table 10. Before training, there was no significant difference between the experimental and control groups (p>0.05), indicating that the two groups did not differ in their initial anterior-posterior maximal excursion.

Table 10.

Between-group comparison of maximal COP excursion in the anterior-posterior direction (cm)

		Experimental group	Control group	p
Intergroup comparison	Pre-training	3.26 ± 0.75	3.34 ± 1.23	0.211
Intergroup comparison	Post-training	3.75 ± 0.31	3.52 ± 0.68	0.034

After the core muscle stability training for Wushu Sanda movements, by contrast, the difference in anterior-posterior maximal COP excursion was significant (p<0.05). This indicates that the stability training optimized the anterior-posterior maximal COP excursion of the experimental group, as reflected in its performance when landing from the aerial flying kick.

The within-group comparison of maximal COP excursion in the anterior-posterior direction is shown in Table 11. The change in the experimental group was significant (p=0.018), whereas the change in the control group was not (p=0.175).

Table 11.

Within-group comparison of maximal COP excursion in the anterior-posterior direction (cm)

		Pre-training	Post-training	p
Group comparison	Experimental group	3.26 ± 0.75	3.75 ± 0.31	0.018
Group comparison	Control group	3.34 ± 1.23	3.52 ± 0.68	0.175

The maximal excursion of the COP in the left-right direction (Maximal excursion Y) is shown in Table 12.

Table 12.

Between-group comparison of maximal COP excursion in the left-right direction (cm)

		Experimental group	Control group	p
Intergroup comparison	Pre-training	10.21 ± 5.63	11.04 ± 4.57	0.334
Intergroup comparison	Post-training	13.58 ± 3.17	12.55 ± 3.24	0.025

The table shows no significant difference (p>0.05) in left-right maximal COP excursion between the experimental and control groups before training, so the initial performance of the two groups when landing from the aerial flying kick was comparable on this measure. After training, the difference was significant (p<0.05), indicating that the Wushu Sanshou movement stability training had an effect on the experimental group.

The within-group comparison of maximal COP excursion in the left-right direction is shown in Table 13. Here too, the experimental group showed a significant difference before and after the experiment (p=0.012), whereas the control group did not (p=0.163).

Table 13.

Within-group comparison of maximal COP excursion in the left-right direction (cm)

		Pre-training	Post-training	p
Group comparison	Experimental group	10.21 ± 5.63	13.58 ± 3.17	0.012
Group comparison	Control group	11.04 ± 4.57	12.55 ± 3.24	0.163
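The static measures reported in this section (total COP offset and maximal excursion along each axis) are not defined computationally in the source. A minimal sketch, assuming that the total offset is the path length of the COP trace and that the maximal excursion along an axis is the range of the trace on that axis, is given below. The function name `cop_metrics` and the four-sample trace are hypothetical; X is taken as the anterior-posterior axis and Y as the left-right axis, following the text.

```python
import math


def cop_metrics(x_cm, y_cm):
    """Return path length and per-axis excursion range of a COP trace (cm)."""
    if len(x_cm) != len(y_cm) or len(x_cm) < 2:
        raise ValueError("need two equally long traces with at least 2 samples")
    # Sum of straight-line distances between consecutive COP samples.
    total = sum(
        math.hypot(x_cm[i + 1] - x_cm[i], y_cm[i + 1] - y_cm[i])
        for i in range(len(x_cm) - 1)
    )
    return {
        "total_offset": total,
        "max_excursion_x": max(x_cm) - min(x_cm),  # anterior-posterior range
        "max_excursion_y": max(y_cm) - min(y_cm),  # left-right range
    }


# Hypothetical COP samples (cm) chosen so the result is easy to check by hand.
metrics = cop_metrics([0.0, 3.0, 3.0, 0.0], [0.0, 4.0, 0.0, 0.0])
print(metrics)
```

Other conventions exist, for example measuring excursion as the largest displacement from the mean COP position, and they would give different numbers for the same trace.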

6

Conclusion

In this paper, we design a Wushu Sanshou action recognition network improved for skeletal-point action recognition, adding a self-attention mechanism and a CBAM attention module to improve the network's extraction of local features. A Sanda action dataset is constructed and used to analyze the performance of the action recognition model. Based on the skeletal points and force generation points of Sanda movements, a movement training model focusing on core muscle stability is proposed. The dual-attention recognition model designed in this paper and the classical neural network models (DNN, CNN, RNN) are each trained on the Sanda action dataset. The accuracy on the test set is high, and the dataset construction method used in this paper can be applied further. Compared with mainstream human action classification methods, the GFLOPs/V value of the proposed model is 29.35, a clear advantage. After ten weeks of core muscle stability training for Sanda movements, the experimental group showed significant differences from the pre-experiment values in both the dynamic and static stabilization measures, which supports the scientific validity of movement training based on the skeletal points of Sanda movements.

Language: English

Publication schedule: 1 time per year
Journal subjects: Life Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other

Optimization of Wushu Sanshou Technical Movement Recognition and Training Based on Machine Learning Algorithm

Yao Shang

Published online: 19 Mar 2025

Received: 13 Nov 2024

Accepted: 15 Feb 2025

DOI: https://doi.org/10.2478/amns-2025-0508

Keywords: Skeletal point movement recognition, Attentional mechanism, Sustained attention structure, Wushu sporadic fighting

© 2025 Yao Shang, published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.