Data mining, also known as knowledge discovery in data, employs mathematical algorithms to search large amounts of data for hidden information and to analyse the underlying patterns and laws. The National Basketball Association (NBA) is the highest-level professional basketball league in the world, and many events in an NBA game are recorded for statistical analysis. In this paper, data mining technology was applied to event statistics to quantify the ability of basketball players and teams, with the aim of predicting basketball results. Using NBA competition data from the 2013–2018 seasons, a quantitative evaluation method was first used to establish a player ability evaluation model, and feature variable selection together with a weighting method for historical game data was used to construct a team and player ability evaluation feature system. Secondly, machine learning algorithms such as linear regression,
Keywords
- Data Mining
- Linear Regression
- Neural Network
- Prediction
The National Basketball Association (NBA) is a professional basketball league in North America. Its enormous influence attracts many fans from all over the world. Because of the significant amount of money and the vast number of fans involved in the NBA, many studies have attempted to predict the outcome of NBA games by simulating the winning team and analysing players’ abilities so as to assist the coach in team organisation. NBA games have accumulated a large body of historical game data and statistical analysis data. Even so, analysing and predicting these games remains very complicated [1].
Statistics have always been essential for basketball player evaluation, from simple field-goal averages to overall efficiency indicators such as the attack score introduced by Oliver (2004) [2]. Generally, a professional sports network is staffed with a professional analysis team, which is responsible for collecting and interpreting data from each game and for establishing statistical indicators, based on player performance in actual play, to measure the performance level of players and teams. In predicting the performance of individual athletes, only a few statistical indicators can be used. In order to simplify game analysis and predict game results accurately from data, related technologies such as machine learning have been applied to predict the outcomes of NBA games.
In this paper, machine learning was used to predict the performance of players before they play in a regular game, using fantasy points (FPTS) collected from professional sports sites together with the players’ NBA statistical indicators, and thereby to predict the outcomes of NBA games. The prediction identifies only the winning team and does not consider the final score. Moreover, the effects of the feature variables of basketball matches on game prediction were analysed, and feature selection was performed. The machine learning models adopted in this paper included the linear regression, extreme gradient boosting (
The paper uses a machine learning algorithm to predict the player performance based on past player data, and to predict team performance and the outcome of basketball games. Specifically, the data adopted, the feature variables constructed and the predicted evaluation objectives were analysed. First, the statistics of player games and player salary and position information during three regular seasons of NBA from 2015 to 2018 were used. For missing values in player data, the mean value or
Fig. 1
The influence weights of three weighting modes on matches, for the past 10 matches

Second, indicators were defined to quantify a player's ability, including total points (
There were other variables affecting player performance, such as home-court advantage, rest days, team sheets, player positions and salary, which were collected from the DraftKings’ algorithm or coaches’ decision-making system [8]. A player's value variable was constructed as FPTS divided by the salary in thousands of dollars (salary/1,000), which was treated as a heuristic. A value higher than 5 indicates that the player is in good form, with a higher ability evaluation [9]. Moreover, the high dimensionality resulting from the larger number of feature variables is addressed by removing highly correlated variables and selecting important features. In this paper, considering the predictive effectiveness of variables for players, the following models were established to represent the quantitative relationship between variables and players’ ability values. The ability value
Finally, this paper aims to predict players’ ability value
Considering the prediction goal of this paper, it is obvious that the prediction is closely related to the regression. Therefore, the regression algorithm was adopted. The model comparison in this paper applies relevant machine learning regression algorithms, including linear regression,
For the linear regression, if a linear relationship exists between independent variables and dependent variables, it will meet the following equation:
There are
The least squares method is used to estimate the parameters. Training the model yields the optimised parameters, which are then used to predict outcomes by regression.
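The least squares estimation described above can be illustrated with a minimal sketch. It assumes an ordinary linear model with an intercept and solves the normal equations directly. The function names `fit_least_squares` and `predict` and the toy data are hypothetical and not taken from the paper.

```python
def fit_least_squares(X, y):
    """Return [intercept, coef_1, ..., coef_p] minimising the squared error."""
    # Prepend a column of ones so the first coefficient is the intercept.
    A = [[1.0] + [float(v) for v in row] for row in X]
    n = len(A[0])
    # Augmented normal equations: (A^T A | A^T y).
    M = [[sum(r[i] * r[j] for r in A) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(A, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("collinear features: normal equations are singular")
        M[col], M[pivot] = M[pivot], M[col]
        pivot_value = M[col][col]
        M[col] = [v / pivot_value for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def predict(beta, row):
    """Apply the fitted coefficients to one feature row."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Toy data (illustrative only) generated from y = 1 + 2*x1 + 3*x2.
X = [[1, 2], [2, 1], [3, 4], [4, 3]]
y = [9, 8, 19, 18]
beta = fit_least_squares(X, y)
```

In practice a numerical library would normally be used for this step. The explicit solver is shown only to make the estimation visible, and the singularity check is a reminder of why highly correlated features are a concern for this model.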
Boosting is a method that transforms weak learners into a strong learner by superposing (summing) their function models. To be specific:
The objective function is a feature of
Optimising the solution:
When the objective function is determined, it moves to the training process. For each iteration process, the training of the objective function of a tree can be written as follows:
The input is the predicted value after the (
After a series of calculations, the smallest objective function
After putting it into the original equation, the minimum value obtained is:
So, the finally obtained
The neural network algorithms are widely used in all subdomains of artificial intelligence. They are briefly introduced in the literature, including the explained version.
Data were pre-processed, including filling the missing value, making the variable name uniform and carrying out variable standardisation. The mean value was used to fill in the missing data. The data from different sources were uniformly processed and the player data were standardised for
Except for the variables directly used to calculate FPTS, other variables were selected for models to quantify the player ability by using more advanced statistical methods. The details are discussed in part 2.
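The pre-processing steps mentioned above (filling missing values with the mean, then standardising) can be sketched as follows. This is a minimal illustration that assumes missing entries are represented as `None` and that standardisation means z-scoring with the population standard deviation. Both are assumptions, since the paper does not specify them, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    column_mean = mean(observed)
    return [column_mean if v is None else v for v in values]


def standardise(values):
    """Z-score a column: subtract the mean and divide by the standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


# Toy column (illustrative only) with one missing entry.
filled = fill_missing_with_mean([10, None, 20])
scaled = standardise(filled)
```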
Using data from recent games allows a more objective and accurate prediction of players’ abilities. Therefore, each variable was computed as the weighted mean over the past 10 games. The relevant theory shows that the mean weight of games increases linearly with the number of games. In this paper, square-root, linear and square weighting modes were used for quantitative evaluation. The weights must be normalised so that they sum to 1. As shown in Figure 1, the square weighting mode is the better weighting method, and it was therefore adopted in this paper.
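The three weighting modes can be sketched as below. The paper does not give the exact formulas, so this sketch assumes that the k-th game in the window (k = 1 for the oldest, k = 10 for the most recent) receives a raw weight of sqrt(k), k or k², which is then normalised so that the weights sum to 1. The names `game_weights` and `weighted_mean` are hypothetical.

```python
import math


def game_weights(n_games=10, mode="square"):
    """Normalised weights for the past n_games, ordered oldest to most recent."""
    transforms = {
        "sqrt": math.sqrt,
        "linear": float,
        "square": lambda k: float(k * k),
    }
    raw = [transforms[mode](k) for k in range(1, n_games + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_mean(values, mode="square"):
    """Weighted mean of a statistic over a window, ordered oldest to most recent."""
    weights = game_weights(len(values), mode)
    return sum(w * v for w, v in zip(weights, values))


square_weights = game_weights(10, "square")
linear_weights = game_weights(10, "linear")
```

Under these assumptions the square mode concentrates the most weight on the latest games: the most recent game receives 100/385 of the total weight, compared with 10/55 in the linear mode.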
For the consistency of feature variables, the standard deviation of the FPTS variable over the last 10 games was defined as FPTS_std, while salary information from DraftKings was also defined as the Salary variable. Before a game, the participation of a player in the game cannot be determined from the model. Thus, it is necessary to calculate the value of the model's feature variable according to the published player sheet [14] before the game.
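As a small illustration of the FPTS_std feature, the sketch below computes the standard deviation of a player's FPTS over the most recent games. It assumes the sample standard deviation and a history ordered oldest to most recent. Neither detail is stated in the paper, and the function name `fpts_std` is hypothetical.

```python
from statistics import stdev


def fpts_std(fpts_history, window=10):
    """Sample standard deviation of FPTS over the last `window` games."""
    recent = fpts_history[-window:]
    if len(recent) < 2:
        # Not enough games to estimate a spread.
        return 0.0
    return stdev(recent)
```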
Fig. 2
Feature importance ranking

Because features differ in their predictive ability for game outcomes and are correlated with one another, the features should be filtered and ranked. Taking field goals (FG) as an example, it is highly correlated with effective field-goal percentage (eFG%) because the latter considers far fewer free-throw points. Furthermore, some variables exhibit multicollinearity, such as three-pointers made (3P), three-point attempts (3PA) and three-point percentage (3P%). In this paper, the Pearson correlation coefficient was used for screening features. With a correlation threshold of 0.90, the following six features were screened out: three-point attempts (3PA), defensive rebounds (DRB), field-goal attempts (FGA), field-goal percentage (FG%) and offensive rating (ORtg). In addition, variables without predictive ability are directly ignored in the models. The gradient descent method was used in the models to evaluate and quantify the feature significance of 34 variables. Using the feature ranking, features such as the position dummy variables (SG, F, C), three-pointers made (3P) and triple-doubles (TD) were excluded. Finally, the remaining 29 variables were used as selected features for regression, gradient boosting and deep learning.
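The correlation-based screening can be sketched as follows. This minimal example assumes a greedy rule: features are visited in order, and a feature is dropped if its absolute Pearson correlation with any already-kept feature exceeds the threshold. The paper does not say which member of a correlated pair was removed, so that rule, the function names and the toy values are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(features, threshold=0.90):
    """Keep a feature only if |r| <= threshold against every feature kept so far."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy columns (illustrative only): FGA tracks FG almost perfectly, AST does not.
toy = {
    "FG": [1, 2, 3, 4, 5],
    "FGA": [2, 4, 6, 8, 10.5],
    "AST": [5, 1, 4, 2, 3],
}
selected = drop_correlated(toy, threshold=0.90)
```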
Fig. 3
Correlation matrix

The results of linear regression for all variables, using three different datasets, are shown in Table 1. A 5-fold cross validation was implemented when the linear regression model was trained. According to the linear model regression prediction, the minimum value of
Table 1
Prediction effect of linear model
MAE | 7.2526 | 7.2124 | 7.1963 |
RMSE | 9.5823 | 9.5437 | 9.5356 |
MAE, mean absolute error;
RMSE, root mean square error.
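The two error measures reported here follow their standard definitions and can be computed as in this short sketch. The function names and the toy numbers are illustrative only.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: the square root of the mean squared difference."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))
```

RMSE penalises large individual errors more heavily than MAE, so it is never smaller than MAE on the same predictions.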
For the
max_depth: 5, n_estimators: 354, min_child_weight: 0, gamma: 0.8, learning_rate: 0.0152. Tuning the model with these parameters results in better performance (
In terms of the neural network model, Figure 4 shows the learning process of the model. A total of 20% of the training data was retained as validation data. It can be seen that the model soon starts to overfit, with the validation loss diverging from the training loss. To prevent overfitting, a dropout layer was added, which randomly drops 40% of the units before feeding forward to the last layer. In addition, the EarlyStopping callback from keras.callbacks was applied: if the validation loss does not improve for five epochs, training is terminated. This improved the model performance, with the RMSE reduced from 9.0678 in the original model to 9.0387.
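The early-stopping rule can be illustrated without any deep learning library. The sketch below is not the Keras implementation. It only mimics the patience logic under the assumption that "improvement" means a strictly lower validation loss than the best seen so far. The function name `early_stopping_epoch` and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the 1-based epoch at which training stops, or None if it never does."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            # New best validation loss: reset the patience counter.
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Toy validation curve (illustrative only): the best loss occurs at epoch 3.
stop_at = early_stopping_epoch([10, 9, 8, 8.1, 8.2, 8.3, 8.4, 8.5, 7])
```

With a patience of five, training on this toy curve stops at epoch 8, five epochs after the best validation loss, so the later dip at epoch 9 is never reached. This is the usual trade-off of the method.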
Fig. 4
Neural network model training effect

Finally, the performance of these three models was compared, as shown in Table 2, where the models in the first and third rows calculated the mean value by using the defined linear combination of coefficients. In the fourth and fifth lines, the
Table 2
Comparison of three machine learning models
Model | RMSE | MAE |
Simple Average | 9.9434 | 7.4285 |
Weighted Average | 9.7475 | 7.2059 |
Linear Regression | 9.2558 | 7.0478 |
XGBoost | 8.9581 | 6.8486 |
Neural Network | 9.0387 | 6.8805 |
MAE, mean absolute error;
RMSE, root mean square error;
XGBoost, extreme gradient boosting.
After the prediction calculation by models, and 5-fold cross validation, the consistent RMSEs (8.9910, 9.0522, 9.0148, 8.9351, and 9.0831) were obtained. In order to further improve the robustness of the model, the effect of small changes in input data on the performance of the model was also studied. First, the Gaussian noise was created by using mean 0 and variance 1, which was then scaled to the range of [0, 0.2] and added to the continuous variable in the original input. When the 5-fold cross validation was performed on the independent variable evaluated by using the noise, the losses seem to have barely budged, with
For the model,
Assuming that
The statistical significance was 0.1%, with four degrees of freedom and a critical value of 15.54. Therefore, the weighted mean, feature engineering and
In this paper, the steps and processes of solving application problems by using machine learning methods are discussed, i.e., analysing the problem (predicting NBA games), collecting and processing data, feature selection, model training, evaluation and optimisation. The indicators from DraftKings were used to predict how NBA players perform in regular games. The model is trained to minimise the RMSE between the predicted value and the actual FPTS statistics. The work started from a baseline linear model, in which averages of the seasons’ past statistical data were used, together with weighting methods, to extract important features from the selected features. The key feature variables for the model to predict a player's ability included salary information, team, the player sheet provided by DraftKings, and other important statistical factors such as total rebounds and individual points scored. After feature selection and data normalisation, the
Moreover, the predictions of the models were compared with the actual FPTS statistics for relevant games selected from the training data. The algorithm was applied to five games broadcast on 10 March 2018 and produced the following eight-player lineup, with an expected FPTS of 242.2643 and a total salary of US$ 49,900. In Figure 5, the blue, orange and green bars stand for the actual FPTS, the predictions of the final model and the predictions of the baseline linear model, respectively. Player positions are shown below the player names. The final model predicted the actual FPTS much better than the baseline linear model for certain players such as Dillon Brooks (SF) and Dwight Powell (C). However, it tended to overestimate the FPTS of players such as Tomas Satoransky (G) and Kobi Simmons (F). Overall, the predictions for these games were very accurate, with losses of 6.2836 (MAE) and 7.6538 (RMSE).
Throughout the data mining exercise, the feature importance ranking and the feature correlation matrix were essential to understanding how statistical indicators affect the predicted outcome of a competition. Most importantly, data processing and feature extraction have a great impact on the predicted results and deserve particular attention. In particular, unifying names and handling missing values during data processing, and ranking and combining important features during feature selection, greatly affect the final model training.
Fig. 5
Comparison between predicted and actual values

While the models used in this paper generally followed the performance comparisons of the algorithms themselves, the final improvement in RMSE was less than 10%. If the opponents’ defensive data, such as a team's defensive rating and the positions of opposing players, were considered, the prediction accuracy of the model might be improved. Furthermore, another important factor is that the coach determines when a player enters the game. When simulated players are used, it becomes possible to observe how the number of minutes varies under different game managements and, especially, how the formation tactics change during games. These factors can be modelled for quantitative evaluation and included within the models. DraftKings also provides the views of news outlets and professional reviewers, which can be processed with natural language processing and used for performance prediction and formation optimisation. In conclusion, machine learning algorithms are highly valuable in basketball game prediction and in quantitative evaluation systems for player ability, and their use is worthy of further research.