Journal details
Format: Journal
eISSN: 2444-8656
First published: 01 Jan 2016
Publication frequency: 2 times per year
Language: English
Open access

Mathematical Modeling and Forecasting of Economic Variables Based on Linear Regression Statistics

Published online: 15 Jul 2022
Volume & Issue: AHEAD OF PRINT
Pages: -
Received: 19 Jan 2022
Accepted: 14 Mar 2022
Abstract

Many economic variables are interdependent and constrain and influence one another. Finding the law of change between economic variables and their influencing factors, and expressing this law mathematically, greatly simplifies forecasting. Regression analysis is a statistical method that uses mathematical equations to determine the quantitative relationship between two or more variables; it is commonly used to estimate and predict the value of a dependent variable. This article analyzes data from the National Bureau of Statistics website and uses multiple linear regression to fit economic indicators. Finally, the forecast data are analyzed in detail, and the modeling method and the credibility of the predictions are evaluated from a practical standpoint.

Introduction

At present, the principal contradiction in Chinese society is manifested as unbalanced and inadequate development. The most prominent shortcoming in China's economic and social development is the imbalance between regions, so the problem of uncoordinated development needs to be addressed [1]. The aim is to narrow the regional differences in China's economic development and to have the regional economies develop in a coordinated manner.

In recent years, many scholars have examined China's overall economic development from different perspectives. The economic development of the Pearl River Delta in particular has been studied in depth, and that region has developed rapidly. This article focuses instead on the western region: we establish multiple regression models of the influence of different indicators on its economic development [2]. The analysis identifies the main factors that strongly affect the economic development of the western region in the new period. This provides development ideas for future economic construction in the west and for China's Silk Road Economic Belt.

Principles of Multiple Linear Regression Analysis

Regression analysis is essentially a mathematical process. It quantitatively describes the correlation between variables through a mathematical expression. The mathematical model is:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_n X_n + \varepsilon \qquad (1)$$

Formula (1) represents a typical linear regression model with n explanatory variables. Among the regression coefficients, $\beta_0$ is the regression constant and $\beta_1, \ldots, \beta_n$ are the partial regression coefficients. Multiple linear regression analysis has certain prerequisites. In particular, the explanatory variables must not be multicollinear [3], so the model needs to be diagnosed for multicollinearity. For the model to achieve the desired effect, each independent variable must also have a significant impact on, and a close correlation with, the dependent variable.
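For reference, model (1) is often written in matrix form, and multicollinearity is commonly screened with the variance inflation factor (VIF). The notation below ($\mathbf{X}$ for the design matrix with a leading column of ones, $\hat{\boldsymbol{\beta}}$ for the least-squares estimate, $R_j^2$ for the coefficient of determination obtained by regressing $X_j$ on the other explanatory variables) is standard textbook notation and is not taken from the article:

```latex
\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\varepsilon},
\qquad
\hat{\boldsymbol{\beta}} = \left(\mathbf{X}^{\top}\mathbf{X}\right)^{-1}\mathbf{X}^{\top}\mathbf{y},
\qquad
\mathrm{VIF}_j = \frac{1}{1 - R_j^{2}}
```

The estimate exists only when $\mathbf{X}^{\top}\mathbf{X}$ is invertible, which fails under exact collinearity. A large $\mathrm{VIF}_j$ (a threshold of about 10 is a common rule of thumb, assumed here rather than stated by the authors) signals that $X_j$ is nearly a linear combination of the other variables.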

F test method

Because the observed values of the dependent variable Y fluctuate, the fluctuation of each observation can be represented by its deviation $Y - \bar Y$. The total fluctuation of the n observations is generally represented by the sum of squared deviations $\sum (Y - \bar Y)^2$. The F-test is based on the decomposition of this sum. The deviation of each observation can be decomposed into two parts, $Y - \bar Y = (Y - Y^*) + (Y^* - \bar Y)$, where $Y^*$ is the fitted value, $(Y - Y^*)$ is called the residual deviation, and $(Y^* - \bar Y)$ is called the regression deviation. Using $L_{yy}$ to denote the sum of squared deviations, we have:

$$L_{yy} = \sum (Y - \bar Y)^2 = \sum \left[ (Y - Y^*) + (Y^* - \bar Y) \right]^2$$

When the square is expanded, the cross term vanishes, $\sum (Y - Y^*)(Y^* - \bar Y) = 0$, so $L_{yy} = \sum (Y - Y^*)^2 + \sum (Y^* - \bar Y)^2$. The first part is called the residual sum of squares and is denoted Q [4]. The second part, the change in the dependent variable Y caused by the linear relationship between X and Y, is called the regression sum of squares and is denoted U. The decomposition of the sum of squared deviations can therefore be expressed as $L_{yy} = Q + U$, where:

$$L_{yy} = \sum (Y - \bar Y)^2 = \sum Y^2 - \frac{1}{n}\left(\sum Y\right)^2, \qquad U = L_{yy} - Q = b L_{xy}$$

$$Q = \sum (Y - Y^*)^2 = L_{yy} - b L_{xy}$$

Here b is the estimated regression coefficient and $L_{xy} = \sum (X - \bar X)(Y - \bar Y)$.

Since Y has n observations in total, the total sum of squared deviations has n − 1 degrees of freedom. The degrees of freedom of the regression sum of squares equal the number of independent variables X. A simple (one-variable) linear regression equation has only one independent variable, so U has 1 degree of freedom and Q has the remaining n − 2. For a given total $L_{yy}$, a large Q means a small U, and the smaller U is, the weaker the linear correlation between the variables. U attains its minimum, zero, if and only if b = 0. Therefore, to test whether the two variables are truly linearly related in the population, we can test whether the population regression coefficient b equals zero. This leads to the following hypothesis test.

Null hypothesis $H_0: b = 0$, alternative hypothesis $H_1: b \neq 0$. The test statistic is:

$$F = \frac{U}{Q/(n-2)} \sim F(1, n-2)$$
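The decomposition and the F statistic can be checked numerically. The sketch below is only an illustration: it assumes a one-variable regression of per capita GDP on the urban population share, two rows taken from Table 1, which is not the model the article finally fits. The critical value is the tabulated $F_{0.05}(1, 4)$.

```python
# One-variable regression F-test on two rows of Table 1.
# Assumption: X = urban population share (%), Y = per capita GDP (yuan);
# this pairing is chosen only to illustrate the formulas.
x = [39.72, 40.45, 41.03, 42.01, 42.68, 43.52]
y = [30211, 33865, 36702, 38369, 40773, 43922]
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products of deviations
l_xx = sum((xi - x_bar) ** 2 for xi in x)
l_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
l_yy = sum((yi - y_bar) ** 2 for yi in y)

b = l_xy / l_xx          # slope estimate
a = y_bar - b * x_bar    # intercept estimate

u = b * l_xy                                               # regression sum of squares U
q = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # residual sum of squares Q

f_stat = u / (q / (n - 2))

F_CRIT_05_1_4 = 7.71  # tabulated critical value F_0.05(1, 4)
print(f"L_yy = {l_yy:.1f}, U = {u:.1f}, Q = {q:.1f}")
print(f"F = {f_stat:.2f}, reject H0 at the 5% level: {f_stat > F_CRIT_05_1_4}")
```

Q is computed directly from the residuals rather than as $L_{yy} - U$, so comparing `l_yy` with `u + q` verifies the decomposition instead of assuming it.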

t test method

The t-test evaluates the explanatory power of a single independent variable: it tests whether each independent variable has a significant influence on the dependent variable [5]. If an independent variable $X_i$ has no significant effect on the dependent variable, it should be excluded from the data used to determine the regression equation.

The test statistic is calculated as:

$$t = \frac{b}{\sqrt{\dfrac{\sum (Y - Y^*)^2}{(n-2)\sum (X - \bar X)^2}}} \sim t(n-2)$$

$H_0$ is rejected at significance level $\alpha$ when $|t| > t_{\alpha/2}(n-2)$.
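As a hedged illustration of the t statistic, the sketch below regresses per capita GDP on fixed asset investment (both rows from Table 1). The choice of this single predictor is an assumption made for the example, and the critical value is the tabulated $t_{0.025}(4)$.

```python
import math

# Assumption: X = investment in fixed assets, Y = per capita GDP (yuan),
# both taken from Table 1 purely for illustration.
x = [1483.32, 2054.77, 2533.33, 3120.31, 3298.27, 3597.47]
y = [30211, 33865, 36702, 38369, 40773, 43922]
n = len(y)

x_bar = sum(x) / n
y_bar = sum(y) / n
l_xx = sum((xi - x_bar) ** 2 for xi in x)
l_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = l_xy / l_xx
a = y_bar - b * x_bar

# Residual sum of squares and the standard error of the slope
q = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
se_b = math.sqrt(q / ((n - 2) * l_xx))

t_stat = b / se_b
T_CRIT_025_4 = 2.776  # tabulated two-sided 5% critical value t_0.025(4)
print(f"b = {b:.4f}, se(b) = {se_b:.4f}, t = {t_stat:.2f}")
print(f"significant at the 5% level: {abs(t_stat) > T_CRIT_025_4}")

# In a one-variable regression the squared t statistic equals the F statistic.
f_stat = (b * l_xy) / (q / (n - 2))
print(f"t^2 = {t_stat ** 2:.2f}, F = {f_stat:.2f}")
```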

Selection of Indicators for Economic Development Research in the Western Regions

We use per capita GDP (yuan), Y, as the explained variable to analyze the factors affecting the economic development of the western region. On this basis, reasonable suggestions are made for the region's future economic development [6].

This article uses the economic development data of the western region from 2016 to 2021 as a sample and applies empirical analysis to identify the factors that affect the gross regional product. Multiple linear regression models are established and analyzed with SPSS software [7]. Table 1 shows the original data for the explained variable and the explanatory variables. The data in Table 1 show that the western region's economy developed rapidly over the six years from 2016 to 2021. Per capita GDP grew continuously, from 30,211 yuan to 43,922 yuan, an average annual increase of about 2,700 yuan.
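The growth figure can be reproduced from the per capita GDP row of Table 1. The following minimal sketch computes the average annual increase in yuan and, as an additional quantity not reported in the article, the compound annual growth rate.

```python
# Per capita GDP (yuan) by year, from Table 1
gdp_per_capita = {
    2016: 30211, 2017: 33865, 2018: 36702,
    2019: 38369, 2020: 40773, 2021: 43922,
}

years = sorted(gdp_per_capita)
first, last = years[0], years[-1]
span = last - first  # number of year-to-year steps

# Average absolute increase per year
avg_increase = (gdp_per_capita[last] - gdp_per_capita[first]) / span

# Compound annual growth rate (not reported in the article)
cagr = (gdp_per_capita[last] / gdp_per_capita[first]) ** (1 / span) - 1

print(f"average annual increase: {avg_increase:.1f} yuan")  # 2742.2 yuan
print(f"compound annual growth rate: {cagr:.2%}")
```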

Table 1. Western Economic Development Indicators

| Indicator | 2016 | 2017 | 2018 | 2019 | 2020 | 2021 |
|---|---|---|---|---|---|---|
| GDP per capita (yuan) | 30211 | 33865 | 36702 | 38369 | 40773 | 43922 |
| Added value of primary industry (100 million yuan) | 888.74 | 921.67 | 960.8 | 1017.49 | 1119.28 | 1172.86 |
| Added value of secondary industry (100 million yuan) | 1867.59 | 2181.74 | 2447.25 | 2490.43 | 2584.25 | 2674.7 |
| Added value of tertiary industry (100 million yuan) | 1917.84 | 2180.54 | 2357.82 | 2553.32 | 2770.85 | 3174.84 |
| Port cargo throughput (100 million yuan) | 21087 | 22431 | 24640 | 26860 | 30509 | 33434 |
| Investment in fixed assets (100 million yuan) | 1483.32 | 2054.77 | 2533.33 | 3120.31 | 3298.27 | 3597.47 |
| Public budget expenditure as a proportion of regional GDP (%) | 10.98 | 11.32 | 11.56 | 15.34 | 14.36 | 14.49 |
| Proportion of urban population to permanent population (%) | 39.72 | 40.45 | 41.03 | 42.01 | 42.68 | 43.52 |

Model building and data analysis

According to the selected indicators, the per capita GDP of the western region is taken as the explained variable Y. The added value of the primary industry, the added value of the secondary industry, the added value of the tertiary industry, port cargo throughput, fixed asset investment, the proportion of public budget expenditure to regional GDP, and the proportion of urban population to permanent population are the explanatory variables $X_1, X_2, X_3, X_4, X_5, X_6, X_7$, respectively. The multiple linear regression model of Y on these variables is:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \beta_5 X_5 + \beta_6 X_6 + \beta_7 X_7 + \varepsilon$$
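To make the least-squares fit concrete, the sketch below estimates a reduced model with only two of the seven explanatory variables (secondary industry added value and urban population share). This subset is an assumption chosen to keep the example small; it is not the model the article selects, and the article's own estimates come from SPSS.

```python
# Rows of Table 1 used in this illustrative two-predictor fit
y = [30211, 33865, 36702, 38369, 40773, 43922]                 # per capita GDP (yuan)
x2 = [1867.59, 2181.74, 2447.25, 2490.43, 2584.25, 2674.7]     # secondary industry added value
x7 = [39.72, 40.45, 41.03, 42.01, 42.68, 43.52]                # urban population share (%)


def mean(values):
    return sum(values) / len(values)


def centered(values):
    m = mean(values)
    return [v - m for v in values]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


yc, c2, c7 = centered(y), centered(x2), centered(x7)

# Normal equations of the centered model form a 2x2 system,
# solved here with Cramer's rule.
s22, s77, s27 = dot(c2, c2), dot(c7, c7), dot(c2, c7)
s2y, s7y = dot(c2, yc), dot(c7, yc)
det = s22 * s77 - s27 ** 2

b2 = (s2y * s77 - s27 * s7y) / det
b7 = (s22 * s7y - s27 * s2y) / det
b0 = mean(y) - b2 * mean(x2) - b7 * mean(x7)  # intercept from the means

fitted = [b0 + b2 * u + b7 * v for u, v in zip(x2, x7)]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
r2 = 1 - dot(residuals, residuals) / dot(yc, yc)

print(f"Y = {b0:.3f} + {b2:.3f} * X2 + {b7:.3f} * X7")
print(f"R^2 = {r2:.4f}")
```

Centering the variables first keeps the system small and better conditioned than solving for the intercept together with the slopes.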

Correlation analysis

Correlation analysis measures how closely the variables are related. The data are analyzed with the correlation procedure in the SPSS statistical software, which yields the correlation coefficient table used to examine the correlation between each explanatory variable and the explained variable [8]. The correlation coefficients between per capita GDP and four independent variables (tertiary industry added value, port cargo throughput, fixed asset investment, and the proportion of urban population to permanent population) are all greater than 0.9. The corresponding significance values are close to 0.000, below 0.01. Whether a quantitative linear relationship exists between per capita GDP and these four main factors is verified further by establishing an appropriate mathematical model. We use stepwise regression to perform multiple linear regression analysis on the data for these four variables.
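The correlation coefficients described above can be recomputed from Table 1. The sketch below is a plain-Python Pearson correlation, offered as a cross-check rather than as a reproduction of the SPSS output; the helper name `pearson` is introduced here for illustration.

```python
import math

y = [30211, 33865, 36702, 38369, 40773, 43922]  # per capita GDP (yuan)

# The four indicators named in the text, copied from Table 1
indicators = {
    "tertiary industry added value": [1917.84, 2180.54, 2357.82, 2553.32, 2770.85, 3174.84],
    "port cargo throughput": [21087, 22431, 24640, 26860, 30509, 33434],
    "fixed asset investment": [1483.32, 2054.77, 2533.33, 3120.31, 3298.27, 3597.47],
    "urban population share": [39.72, 40.45, 41.03, 42.01, 42.68, 43.52],
}


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(u)
    u_bar, v_bar = sum(u) / n, sum(v) / n
    s_uv = sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))
    s_uu = sum((a - u_bar) ** 2 for a in u)
    s_vv = sum((b - v_bar) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


correlations = {name: pearson(values, y) for name, values in indicators.items()}
for name, r in correlations.items():
    print(f"{name}: r = {r:.4f}")
```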

Stepwise regression analysis

The model is built in SPSS, and linear regression analysis is performed on the relevant data. Stepwise analysis yields the model summary (Table 2). Stepwise linear regression on the sample data produces a fitted model [9] whose adjusted R² is 1.000, which indicates a good fit. The result shows that the proportion of urban population in the permanent population, the proportion of public budget expenditure in GDP, the added value of the secondary industry, the added value of the primary industry, and port cargo throughput all have a certain impact on per capita GDP growth in the western region.
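For readers unfamiliar with the adjusted R² reported in the model summary, the sketch below implements the standard textbook formula $\bar R^2 = 1 - (1 - R^2)\frac{n-1}{n-p-1}$, where n is the number of observations and p the number of explanatory variables. The function name and the example inputs are hypothetical and are not values from the article.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for a model with p explanatory variables and n observations."""
    residual_dof = n - p - 1
    if residual_dof <= 0:
        # The formula is undefined without positive residual degrees of freedom.
        raise ValueError("adjusted R^2 requires n > p + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / residual_dof


# Hypothetical inputs: R^2 = 0.98 from a one-variable fit on six observations
example = adjusted_r2(0.98, n=6, p=1)
print(f"adjusted R^2 = {example:.3f}")  # adjusted R^2 = 0.975
```

The penalty factor $(n-1)/(n-p-1)$ grows as predictors are added, which is why adjusted R² is preferred over R² when comparing the models produced at different steps of a stepwise procedure.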

Table 2. Model summary

| Model | R | R² | Adjusted R² | Std. error of the estimate |
|---|---|---|---|---|
| 1 | 1.000a | 1.000 | 1.000 | . |

Table 3 gives the residual statistics, showing the predicted values and residuals based on the 3σ principle. The cumulative probability (P-P) plot of the sample data is shown in Figure 1. The points run from the lower left to the upper right along an approximately straight line, so the sample distribution can be considered approximately normal.

Table 3. Residual statistics

| | Minimum | Maximum | Mean | Standard deviation | N |
|---|---|---|---|---|---|
| Predicted value | 30222.0527 | 43921.0664 | 37307 | 4888.03521 | 6 |
| Residual | −23.40751 | 25.44731 | 0 | 16.66639 | 6 |
| Standardized predicted value | −1.449 | 1.353 | 0 | 1 | 6 |
| Standardized residual | −0.888 | 0.966 | 0 | 0.632 | 6 |

Figure 1. Normal P-P plot of regression standardized residuals
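The coordinates of a normal P-P plot can be computed without plotting software. The sketch below is an assumption-laden illustration: it uses residuals from a one-variable regression of per capita GDP on the urban population share (not the article's model, whose individual residuals are not listed), and the plotting position $(i - 0.5)/n$, which is one common convention and may differ from the one SPSS uses.

```python
import math
from statistics import NormalDist

# Assumption: illustrative one-variable fit on two rows of Table 1
x = [39.72, 40.45, 41.03, 42.01, 42.68, 43.52]   # urban population share (%)
y = [30211, 33865, 36702, 38369, 40773, 43922]   # per capita GDP (yuan)
n = len(y)

x_bar, y_bar = sum(x) / n, sum(y) / n
b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum((xi - x_bar) ** 2 for xi in x)
a = y_bar - b * x_bar

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Standardize by the standard error of the estimate (n - 2 degrees of freedom)
s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
z_sorted = sorted(e / s for e in residuals)

std_normal = NormalDist()  # mean 0, standard deviation 1
pp_points = [
    ((i + 0.5) / n, std_normal.cdf(z))  # (observed, expected) cumulative probability
    for i, z in enumerate(z_sorted)
]

for observed, expected in pp_points:
    print(f"observed {observed:.3f}  expected {expected:.3f}")
```

Points close to the diagonal (observed equal to expected) are what the text describes as a straight-line trend from the lower left to the upper right.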

Model establishment

From the above analysis, the regression coefficient table is obtained (Table 4). The table presents the regression coefficients of the model and the significance test data. The regression coefficients of the five independent variables in the model are −12.928, 3.749, 0.023, −501.736, and 4165.997, respectively. Substituting them into the linear regression model gives the regression equation:

$$Y = -125757.963 - 12.928 X_1 + 3.749 X_2 + 0.023 X_4 - 501.736 X_6 + 4165.997 X_7$$

Table 4. Regression coefficients

| Variable | Unstandardized coefficient B | Standard error | Standardized coefficient (Beta) | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | −125757.963 | 0 | | 0 | 0 |
| Primary industry production value | −12.928 | 0 | −0.298 | 0 | 0 |
| Secondary industry production value | 3.749 | 0 | 0.229 | 0 | 0 |
| Port cargo throughput | 0.023 | 0 | 0.023 | 0 | 0 |
| Public budget expenditure as a percentage of GDP | −501.736 | 0 | −0.198 | 0 | 0 |
| Proportion of urban population to permanent population | 4165.997 | | 1.217 | 0 | 0 |
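As a consistency check, the coefficients B listed in Table 4 can be applied to the Table 1 data and the results compared with the observed per capita GDP. The sketch below assumes the pairing of coefficients with $X_1$, $X_2$, $X_4$, $X_6$, and $X_7$ used in the regression equation; the variable names are introduced here for illustration.

```python
# Coefficients B from Table 4, keyed by the variable index used in the equation
INTERCEPT = -125757.963
COEFFICIENTS = {"x1": -12.928, "x2": 3.749, "x4": 0.023, "x6": -501.736, "x7": 4165.997}

# Table 1 rows needed by the equation, one record per year
data = {
    2016: {"y": 30211, "x1": 888.74, "x2": 1867.59, "x4": 21087, "x6": 10.98, "x7": 39.72},
    2017: {"y": 33865, "x1": 921.67, "x2": 2181.74, "x4": 22431, "x6": 11.32, "x7": 40.45},
    2018: {"y": 36702, "x1": 960.8, "x2": 2447.25, "x4": 24640, "x6": 11.56, "x7": 41.03},
    2019: {"y": 38369, "x1": 1017.49, "x2": 2490.43, "x4": 26860, "x6": 15.34, "x7": 42.01},
    2020: {"y": 40773, "x1": 1119.28, "x2": 2584.25, "x4": 30509, "x6": 14.36, "x7": 42.68},
    2021: {"y": 43922, "x1": 1172.86, "x2": 2674.7, "x4": 33434, "x6": 14.49, "x7": 43.52},
}


def predict(row):
    """Evaluate the fitted regression equation for one year's indicators."""
    return INTERCEPT + sum(coef * row[name] for name, coef in COEFFICIENTS.items())


differences = {}
for year, row in data.items():
    y_hat = predict(row)
    differences[year] = row["y"] - y_hat
    print(f"{year}: observed {row['y']}, predicted {y_hat:.1f}, difference {differences[year]:.1f}")
```

Because the published coefficients are rounded to three decimals, the differences computed here need not match the residuals in Table 3 exactly, but they should be small relative to per capita GDP values of 30,000 to 44,000 yuan.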

The established model shows that when all explanatory variables are 0, the predicted per capita GDP of the western region, that is, the constant term, is negative. Per capita GDP is negatively related to the added value of the primary industry and to the ratio of public budget expenditure to GDP, and it moves in the same direction as the added value of the secondary industry, port cargo throughput, and the ratio of urban population to permanent population [10]. With other variables held constant, a 1-unit increase in the output value of the secondary industry raises per capita GDP by 3.749 yuan on average, a 1-unit increase in port cargo throughput raises it by 0.023 yuan, and a 1-percentage-point increase in the proportion of urban population to permanent population raises it by 4,165.997 yuan.

The above results show that the regression model fits well. The added value of the secondary industry, port cargo throughput, and the ratio of urban population to permanent population each have a significant positive linear relationship with the per capita GDP of the western region. By contrast, as the production value of the primary industry and the share of public budget expenditure in GDP increase, per capita GDP tends to decline. The analysis of the western region's overall development suggests that government public budget expenditure has been allocated unreasonably, since it did not promote the growth of regional GDP.

Suggestions
Reducing the proportion of public budget expenditures

The government should adjust public budget expenditures regularly. Emphasis should be placed on important, abnormal, or critical deviations from routine norms that arise during budget execution. The purpose is to promote regional economic production, raise income levels, and improve residents' living conditions. Specifically, this means regulating fiscal revenue and expenditure and stabilizing the relationship between the government and the market so that the scale of fiscal revenue stays within an appropriate range [11]. At the same time, effective operation and management measures should be taken to ensure that budget goals are achieved. In this way, the economy can be optimized and fiscal expenditure reduced, effectively promoting the comprehensive, coordinated, and sustainable development of the western region's economy.

Increasing the urbanization transfer construction of rural population

In the new urbanization of the new era, the traditional mode of economic growth, which consumes resources and damages the environment, is no longer suitable. Environmental issues must therefore be taken into account while developing the economy. We need to maintain a continuous drive for innovation and keep improving science and technology in order to optimize the industrial structure and determine the optimal ratio of capital, labor, and resource inputs. We must expand investment in new-type rural urbanization, gradually promote the large-scale development of agricultural industrialization, and promote the development of a characteristic economy and a green rural economy. We need to configure the relevant infrastructure scientifically, adapt measures to local conditions, and plan the spatial layout reasonably to promote the steady development of the western economy.

Optimizing and adjusting the industrial structure

While enjoying the large economic benefits brought by newly introduced large-scale industrial projects, regions must also bear the environmental and resource costs that these projects bring. Therefore, in promoting the development of the western region in the new era, advocating the integration of technological development and industry is also indispensable.

Conclusion

The proportion of public budget expenditure to GDP and the proportion of urban population to permanent population have a great impact on the per capita GDP of the western region, and the relationship is linear. In policies for the economic development of the western region in the new period, the proportion of public budget expenditure should be adjusted, and investment in the urbanization of the rural population should be increased.

References

[1] Ge, Y., & Wu, H. Prediction of corn price fluctuation based on multiple linear regression analysis model under big data. Neural Computing and Applications, 2020; 32(22): 16843–16855. doi:10.1007/s00521-018-03970-4

[2] Knoblach, M., Roessler, M., & Zwerschke, P. The Elasticity of Substitution Between Capital and Labour in the US Economy: A Meta-Regression Analysis. Oxford Bulletin of Economics and Statistics, 2020; 82(1): 62–82. doi:10.1111/obes.12312

[3] Mittal, M., Goyal, L. M., Sethi, J. K., & Hemanth, D. J. Monitoring the impact of economic crisis on crime in India using machine learning. Computational Economics, 2019; 53(4): 1467–1485. doi:10.1007/s10614-018-9821-x

[4] Kryeziu, N., & Durguti, E. A. The impact of inflation on economic growth: The case of Eurozone. International Journal of Finance & Banking Studies (2147–4486), 2019; 8(1): 01–09. doi:10.20525/ijfbs.v8i1.297

[5] Magdalena, S., & Suhatman, R. The Effect of Government Expenditures, Domestic Invesment, Foreign Invesment to the Economic Growth of Primary Sector in Central Kalimantan. Budapest International Research and Critics Institute-Journal (BIRCI-Journal), 2020; 3(3): 1692–1703. doi:10.33258/birci.v3i3.1101

[6] Önder, I., Weismayer, C., & Gunter, U. Spatial price dependencies between the traditional accommodation sector and the sharing economy. Tourism Economics, 2019; 25(8): 1150–1166. doi:10.1177/1354816618805860

[7] Zhao, W., Shi, T., & Wang, L. Fault Diagnosis and Prognosis of Bearing Based on Hidden Markov Model with Multi-Features. Applied Mathematics and Nonlinear Sciences, 2020; 5(1): 71–84. doi:10.2478/amns.2020.1.00008

[8] Du, Q., Li, Y., & Pan, L. Wheelchair Size and Material Application in Human-machine System Model. Applied Mathematics and Nonlinear Sciences, 2021; 6(2): 7–18. doi:10.2478/amns.2021.1.00009

[9] Mele, M., & Magazzino, C. Pollution, economic growth, and COVID-19 deaths in India: a machine learning evidence. Environmental Science and Pollution Research, 2021; 28(3): 2669–2677. doi:10.1007/s11356-020-10689-0

[10] Toumi, S., & Toumi, H. Asymmetric causality among renewable energy consumption, CO2 emissions, and economic growth in KSA: evidence from a non-linear ARDL model. Environmental Science and Pollution Research, 2019; 26(16): 16145–16156. doi:10.1007/s11356-019-04955-z

[11] Haralayya, D., & Aithal, P. S. Performance affecting factors of indian banking sector: an empirical analysis. George Washington International Law Review, 2021; 7(1): 607–621.
