Open Access

Higher education innovation and reform model based on hierarchical probit



Introduction

As the old saying goes, ‘everything is inferior, only reading is high.’ Receiving an education has long been regarded as noble in traditional Chinese culture. In ancient times, only by passing the imperial examination could poor students leave the bottom of society and realize their ideals in life. Since the founding of New China, and especially after the college entrance examination reform in 1977, Chinese education has developed rapidly [1]. In terms of attainment, China’s per capita years of education reached 10.05 years in 2014, compared with 6.38 years in 1985. In terms of literacy, the illiteracy rate has dropped from 22.23% 20 years ago to 4%. With the development of education, Chinese social civilization has reached a new height and social values have changed greatly. Spiritual pursuits receive more and more attention, and ‘happiness’ remains a topic of high interest.

Education in China carries the hopes of hundreds of millions of families for a better life and shoulders the task of supporting the personal pursuit of happiness. On the one hand, education can improve personal self-cultivation and give people a sense of spiritual happiness [2]. On the other hand, education improves happiness indirectly through objective conditions such as rising income and social status. However, as social competition intensifies, the rate of return to education decreases, which reduces the indirect effect of education on happiness. According to the ‘National Happiness Report 2014’ released by the China Family Finance Research and Research Center, the happiness index is highest for primary school graduates, and PhDs’ happiness is not even as high as that of illiterate respondents. How to guide education effectively so that it correctly plays its role in social development and transformation is therefore very important.

Literature review

Since the publication in 1974 of Easterlin's article on the relationship between intertemporal changes in American income and changes in subjective well-being, subjective well-being has attracted the attention of a large number of economists. It has long been an important subject in sociology and psychology, and the impact of education on well-being has always been a concern [3]. The literature disagrees on how education level influences residents’ happiness. Some scholars believe that education improves individual subjective well-being, while other studies find that education has no effect on subjective well-being or even a negative one. Some scholars have found that people with secondary education, not those with higher education, have the highest level of happiness. This may be because highly educated people have higher expectations of the future, and subjective well-being decreases when the gap between reality and expectations is large. Other scholars have explored the impact of urban residents’ educational level on income and subjective well-being. When income is not controlled for, education has a positive effect on happiness; once income is controlled for, education has a negative effect. They therefore conclude that the effect of education on happiness is mainly realized by raising income. Still other scholars use survey data on residents of 50 states in the United States to study the impact of education, health, and income on residents’ well-being [4]. That study found that education still has a significant positive effect on subjective well-being after controlling for income and health.

Based on the 2014 Chinese Family Tracking Survey (CFPS2014) data, this article uses the semi-parametric estimation method of the ordered probit model to test the impact of education on individual residents’ subjective well-being and its urban-rural differences. The article also explores the mechanism of this influence.

Samples and data sources

To examine the impact of education on residents’ happiness, we use the 2014 China Family Tracking Survey (CFPS) data for the empirical research. The CFPS survey has four main questionnaire types: community questionnaire, family questionnaire, adult questionnaire, and child questionnaire. This article mainly studies happiness [5], and the factors that influence minors’ evaluation of their own well-being differ from those for adults. Therefore, this article limits the sample to adults, and 27,992 valid samples are obtained after screening.

First of all, the main research objects of the article are residents’ education level and happiness. We measure education based on the answer to the CFPS question ‘When did you leave school last?’, which yields the variable education level (Edu). People with junior high school education or below account for 79.46% of the sample, so those with high school education and above can be regarded as the highly educated population. This article therefore constructs a second, binary indicator of education level: it is recorded as 1 when the respondent's education is high school or above and 0 otherwise. The measure of residents’ subjective well-being comes from the question ‘how happy do you feel about yourself,’ which respondents score from 1 to 10. The higher the score, the stronger the sense of happiness. Figure 1a shows that the overall happiness level of the interviewed residents is relatively high. At relatively low happiness scores, the proportion of those with less than a high school education is higher than that of those with a high school education or above [6], which shows that people with low education are more likely to feel unhappy. At higher happiness scores, the proportion of people with a high school degree or above is significantly larger than that of people with less than a high school degree. However, among those who feel very happy (10 points), there are more people with less than a high school degree than people with a higher degree. Because the happiness score is highly subjective and such a fine stratification may be unreliable, we group the 10 points into three segments: 1 to 4 points indicate dissatisfied, 5 to 7 points indicate fair, and 8 to 10 points indicate very satisfied. Figure 1b shows that the majority of people with a high sense of happiness are still those with a high level of education.
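The two recodings described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the default `high_school_code=4` is an assumed coding of the education variable that should be checked against the survey codebook.

```python
def happiness_segment(score):
    """Collapse a 1-10 happiness score into the three segments used in the text."""
    if not 1 <= score <= 10:
        raise ValueError("happiness score must lie between 1 and 10")
    if score <= 4:
        return 1  # dissatisfied (1 to 4 points)
    if score <= 7:
        return 2  # fair (5 to 7 points)
    return 3  # very satisfied (8 to 10 points)


def high_education(edu_level, high_school_code=4):
    """Return 1 for high school education and above, 0 otherwise.

    high_school_code is a hypothetical coding of the education variable,
    not a value taken from the article.
    """
    return 1 if edu_level >= high_school_code else 0


# Toy scores, not survey data.
scores = [2, 5, 7, 8, 10]
segments = [happiness_segment(s) for s in scores]
```

Coding the segments as 1, 2, and 3 matches the range of the Happiness variable reported in the descriptive statistics.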

Fig. 1. Comparison of the impact of academic qualifications on happiness scores

In addition, to accurately judge the impact of education on residents’ happiness, a more in-depth model analysis is needed because of the many factors that affect residents’ happiness. Factors include education, urban and rural areas, gender, age, marriage, health, income, work status, etc. These factors can represent personal characteristics and have important effects on happiness. Descriptive statistics are shown in Table 1.

Table 1. Descriptive statistics of each variable

Variable          Mean        Std. dev.   Minimum     Maximum
Happiness         2.523       0.621       1           3
Age               47.832      15.78       16          99
Gender            0.489       0.5         0           1
Edu               2.5         1.304       1           8
Marriage          0.837       0.369       0           1
Health            3.015       1.247       1           5
Job               0.739       0.439       0           1
Status            2.939       0.999       1           5
Health_change     3.44        1.226       1           5
Inc               8799.7      19125.7     0           442,000

Empirical results and analysis

Model estimation

The ordered probit model uses observable ordered response data to study the behavior of an unobservable latent variable. It is a special case of the limited dependent variable model [7]. The degree of happiness studied in this article is not directly observed in the sample data, so it is treated as a latent variable, and its equation is written in linear form as follows:

$$inc_i^* = \alpha_0 + x_i'\alpha + u_i \tag{1}$$

where $i$ indexes the respondent, $inc_i^*$ represents the degree of happiness, $x_i$ is a vector of explanatory variables that may affect the degree of happiness, $\alpha$ is the corresponding vector of unknown coefficients, $\alpha_0$ is a constant term, and $u_i$ is a random disturbance term with mean 0 and variance $\delta^2$. Although $inc_i^*$ cannot be directly observed in this sample, the segment into which it falls can be observed. There are four connected segments, denoted 0, 1, 2, and 3, and we use the variable $y_i$ to denote the segment. The relationship between $y_i$ and $inc_i^*$ is:

$$y_i = \begin{cases} 0, & 0 < inc_i^* \le 1.5 \\ 1, & 1.5 < inc_i^* \le 2 \\ 2, & 2 < inc_i^* \le 2.5 \\ 3, & inc_i^* > 2.5 \end{cases} \tag{2}$$

The probability that $y_i$ takes each value $j$ is:

$$\begin{aligned}
P(y_i = 0 \mid x_i) &= P(u_i \le 1.5 - \alpha_0 - x_i'\alpha) = F(\lambda_1 - x_i'\beta) \\
P(y_i = 1 \mid x_i) &= P(1.5 - \alpha_0 - x_i'\alpha < u_i \le 2 - \alpha_0 - x_i'\alpha) = F(\lambda_2 - x_i'\beta) - F(\lambda_1 - x_i'\beta) \\
P(y_i = 2 \mid x_i) &= P(2 - \alpha_0 - x_i'\alpha < u_i \le 2.5 - \alpha_0 - x_i'\alpha) = F(\lambda_3 - x_i'\beta) - F(\lambda_2 - x_i'\beta) \\
P(y_i = 3 \mid x_i) &= P(u_i > 2.5 - \alpha_0 - x_i'\alpha) = 1 - F(\lambda_3 - x_i'\beta)
\end{aligned}$$

where $\lambda_1 = (1.5 - \alpha_0)/\delta$, $\lambda_2 = (2 - \alpha_0)/\delta$, $\lambda_3 = (2.5 - \alpha_0)/\delta$, $\beta = \alpha/\delta$, and $F(\cdot)$ is the distribution function of $\varepsilon \equiv u/\delta$. If we write $y_i^* = (inc_i^* - \alpha_0)/\delta$, then the ordered model (1) and (2) has the same probability distribution $P(y_i = j \mid x_i)$ as the following model:

$$y_i^* = x_i'\beta + \varepsilon_i, \quad y_i = \begin{cases} 0, & y_i^* \le \lambda_1 \\ 1, & \lambda_1 < y_i^* \le \lambda_2 \\ 2, & \lambda_2 < y_i^* \le \lambda_3 \\ 3, & y_i^* > \lambda_3 \end{cases} \tag{3}$$

Here $\varepsilon$ is a disturbance term with mean zero and variance 1, and its distribution function is $F(\cdot)$. Model (3) is a classic ordered response model. It is the standardized form of the happiness model (1) and (2), and its thresholds $\lambda_1, \lambda_2, \lambda_3$ are also unknown parameters to be estimated. The log-likelihood function is:

$$\ln L(\beta, \lambda_1, \lambda_2, \lambda_3) = \sum_{i=1}^{n} \sum_{j=0}^{3} 1\{y_i = j\} \ln\left[F(\lambda_{j+1} - x_i'\beta) - F(\lambda_j - x_i'\beta)\right] \tag{4}$$

where $\lambda_4 = \infty$, $\lambda_0 = -\infty$, $\lambda_1, \lambda_2, \lambda_3$ are as above, and $1\{\cdot\}$ is an indicator function that equals 1 when the condition in the braces is satisfied and 0 otherwise. From the above derivation, the estimated $\beta$ differs from the coefficient $\alpha$ in the original happiness model (1) and (2) by the positive multiple $1/\delta$, but the probability of happiness falling into each segment remains unchanged. Therefore, the estimates of the standardized model (3) can be used to analyze the influence of various factors on the degree of happiness. When $\varepsilon$ follows the standard normal distribution, $F$ in the above likelihood function is replaced by the cumulative distribution function of the standard normal distribution, which gives the parameter estimation of the ordered probit model [8]. When $\varepsilon$ does not follow the standard normal distribution, this estimation leads to inconsistent estimators of the parameters. The remedy is to use the semi-parametric method.
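The category probabilities of model (3) and the log-likelihood (4) can be illustrated with a minimal Python sketch for the case where F is the standard normal distribution function. The function names, the toy data, and the parameter values are all hypothetical and chosen only so that the example runs; this is not the estimation code used in the article.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_probs(xb, cuts):
    """P(y = j | x) for j = 0..len(cuts), given the index x'beta and ordered thresholds.

    The thresholds are padded with -infinity and +infinity, which plays the
    role of lambda_0 and lambda_4 in the log-likelihood.
    """
    bounds = [-math.inf] + list(cuts) + [math.inf]
    return [
        norm_cdf(bounds[j + 1] - xb) - norm_cdf(bounds[j] - xb)
        for j in range(len(bounds) - 1)
    ]


def log_likelihood(beta, cuts, X, y):
    """Sum over respondents of the log probability of the observed category."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * v for b, v in zip(beta, x_i))
        total += math.log(category_probs(xb, cuts)[y_i])
    return total


# Hypothetical toy data: one regressor, four respondents, categories 0..3.
X = [[1.0], [0.0], [2.0], [0.5]]
y = [2, 0, 3, 1]
beta = [0.5]
cuts = [-0.5, 0.3, 1.0]

probs = category_probs(0.5, cuts)
ll = log_likelihood(beta, cuts, X, y)
```

Maximizing `log_likelihood` over `beta` and `cuts` with a numerical optimizer would give the parametric ordered probit estimates; the semi-parametric variant replaces `norm_cdf` with the distribution function of an approximating density.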

Treating the distribution of $\varepsilon$ as unknown, the Hermite series

$$f_k(\varepsilon) = \frac{1}{c}\left(\sum_{s=0}^{k} \gamma_s \varepsilon^s\right)^2 \phi(\varepsilon)$$

is used to approximate the density function of $\varepsilon$, where $c$ is the normalizing factor that ensures the integral of $f_k(\varepsilon)$ equals 1, the coefficients $\gamma_s$ are also parameters to be estimated, and $\phi(\cdot)$ is the density function of the standard normal distribution. $F$ in the likelihood function is replaced by the cumulative distribution function of the approximating density $f_k(\cdot)$. The estimate of the parameter $\beta$ is obtained by maximizing the resulting quasi-likelihood function. It can be proved that this estimator is consistent and asymptotically normal under weaker conditions. When $k = 2$, the estimation is the same as the parameter estimation of the ordered probit model, so the likelihood ratio test can be used to select $k$ or to show the necessity of semi-parametric estimation.
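As a rough illustration of the Hermite approximation, the sketch below evaluates the density f_k for given coefficients. The normalizing factor c is computed from the moments of the standard normal distribution (E[ε^m] is 0 for odd m and (m − 1)!! for even m). The coefficient values are hypothetical, and the location and scale normalizations needed for identification in estimation are left aside.

```python
import math


def normal_moment(m):
    """E[eps**m] for a standard normal eps: 0 for odd m, (m - 1)!! for even m."""
    if m % 2 == 1:
        return 0.0
    result = 1.0
    for j in range(m - 1, 0, -2):
        result *= j
    return result


def normalizing_constant(gamma):
    """c = E[(sum_s gamma_s eps**s)**2], which makes the density integrate to 1."""
    return sum(
        gs * gt * normal_moment(s + t)
        for s, gs in enumerate(gamma)
        for t, gt in enumerate(gamma)
    )


def snp_density(eps, gamma):
    """Squared polynomial times the standard normal density, divided by c."""
    poly = sum(g * eps ** s for s, g in enumerate(gamma))
    phi = math.exp(-0.5 * eps * eps) / math.sqrt(2.0 * math.pi)
    return poly * poly * phi / normalizing_constant(gamma)


def trapezoid_integral(func, lower, upper, steps):
    """Plain trapezoid rule, used here only to check the normalization."""
    width = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * width)
    return total * width


gamma = [1.0, 0.2, -0.1, 0.05]  # hypothetical coefficients, k = 3
area = trapezoid_integral(lambda e: snp_density(e, gamma), -10.0, 10.0, 4000)
```

With `gamma = [1.0]` the polynomial is constant and the density reduces to the standard normal density, which is the parametric probit case.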

If the happiness of individuals who do not participate in education is not observed in the research sample, the education participation decision leads to a sample selection problem [9]. To eliminate potential sample selection bias, an ordered sample selection model should be used to estimate the impact of various factors on happiness. Suppose the education participation equation is:

$$D_i = 1\{\gamma_0 + z_i'\gamma - \upsilon_i > 0\} \tag{5}$$

The outcome equation is model (3); that is, the happiness level is observed to fall into a segment $j$ only when a person participates in education ($D_i = 1$). Here $z$ is a vector of factors that affect the selection, and it contains at least one variable that is not included in $x$. Because the factors contained in $z$ are not all contained in $x$, the model satisfies the identification condition [10]. The probability that the happiness of an education participant falls in each segment can be derived similarly as:

$$P(y_i = j, D_i = 1 \mid x_i, z_i) = G(\lambda_{j+1} - x_i'\beta, \gamma_0 + z_i'\gamma) - G(\lambda_j - x_i'\beta, \gamma_0 + z_i'\gamma) \tag{6}$$

where $G(\cdot,\cdot)$ is the joint distribution function of $\varepsilon$ and $\upsilon$. The parameters $\beta, \lambda_1, \lambda_2, \lambda_3$ and $\gamma_0, \gamma$ of the sample selection ordered response model (3) and (5) can then be estimated by the maximum likelihood method. Its log-likelihood function is:

$$\ln L(\beta, \gamma_0, \gamma, \lambda_1, \lambda_2, \lambda_3) = \sum_{i=1}^{n} (1 - D_i) \ln\left[1 - G_2(\gamma_0 + z_i'\gamma)\right] + \sum_{i=1}^{n} \sum_{j=0}^{3} D_i 1\{y_i = j\} \ln\left[G(\lambda_{j+1} - x_i'\beta, \gamma_0 + z_i'\gamma) - G(\lambda_j - x_i'\beta, \gamma_0 + z_i'\gamma)\right] \tag{7}$$

where $\lambda_4 = \infty$, $\lambda_0 = -\infty$, $\lambda_1, \lambda_2, \lambda_3$ are as above, and

$$G_2(\gamma_0 + z_i'\gamma) = 1 - P(\upsilon_i \ge \gamma_0 + z_i'\gamma) = G(\infty, \gamma_0 + z_i'\gamma)$$

When $\varepsilon$ and $\upsilon$ follow a joint normal distribution with zero mean, unit variance, and correlation coefficient $\rho$, the joint distribution function $G$ in the above likelihood function is replaced by the bivariate normal distribution function $\Phi(\varepsilon, \upsilon; \rho)$, and the estimation of model (3) and (5) is the estimation of the sample selection ordered probit model. The difference between the log-likelihood functions (4) and (7) is that the first sum in (7) reflects the contribution of the non-participant subsample ($D_i = 0$) to the likelihood function, while (4) ignores this term. If the sample does not have a selection problem, the first sum does not appear.
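As a check on how (7) relates to (4), consider the special case in which ε and υ are independent. This is an assumption made here only for illustration; it is not imposed in the article. The joint distribution function then factorizes into the marginal distribution function F of ε and the marginal distribution function G_2 of υ:

$$G(a, b) = F(a)\,G_2(b)$$

Substituting this into (7), and using the fact that exactly one indicator $1\{y_i = j\}$ equals 1 for each respondent with $D_i = 1$, gives

$$\ln L = \sum_{i=1}^{n} \Big\{ (1 - D_i) \ln\left[1 - G_2(\gamma_0 + z_i'\gamma)\right] + D_i \ln G_2(\gamma_0 + z_i'\gamma) \Big\} + \sum_{i=1}^{n} \sum_{j=0}^{3} D_i 1\{y_i = j\} \ln\left[F(\lambda_{j+1} - x_i'\beta) - F(\lambda_j - x_i'\beta)\right]$$

The first sum involves only the selection parameters $\gamma_0, \gamma$, and the second sum is the log-likelihood (4) evaluated on the selected subsample. Under independence, estimating (4) on the selected sample therefore yields the same $\beta$ and thresholds as the joint model, so the selection correction matters only when ε and υ are dependent.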

When ε and υ do not follow a joint normal distribution, maximum likelihood estimation based on Φ(ε, υ; ρ) yields inconsistent parameter estimators. To obtain consistent estimates of the parameters, the restriction on the joint distribution G(·,·) of ε and υ should be relaxed, and the semi-parametric method should be adopted.

Similar to the semi-parametric estimation of the ordered probit model (3), the semi-parametric estimation of the sample selection ordered probit model (3) and (5) treats the joint distribution of $\varepsilon$ and $\upsilon$ as unknown. We use the Hermite series

$$f^*(\varepsilon, \upsilon; \theta) = \frac{1}{\psi_R(\theta)} \tau_R(\varepsilon, \upsilon; \theta)^2 \phi(\varepsilon) \phi(\upsilon)$$

to approximate the joint density function of $\varepsilon$ and $\upsilon$, where $\tau_R(\varepsilon, \upsilon; \theta)$ is a polynomial of order $R = (R_1, R_2)$ in $\varepsilon$ and $\upsilon$, $\theta$ is the vector of its unknown coefficients, which is also to be estimated, and $\psi_R(\theta)$ is the normalizing factor that ensures the integral of $f^*$ equals 1. From this, the approximation $G^*(\varepsilon, \upsilon; \theta)$ of the joint distribution function $G$ of $\varepsilon$ and $\upsilon$ can be obtained. We replace $G$ in the likelihood function (7) with the approximate distribution function $G^*(\varepsilon, \upsilon; \theta)$. The semi-parametric estimator of the parameter vector $(\beta, \gamma_0, \gamma, \lambda_1, \lambda_2, \lambda_3, \theta)$ is obtained by solving the quasi-likelihood maximization problem. It can be proved that as long as $R_1, R_2$ increase with the sample size $n$, this estimator is consistent and asymptotically normal. The difference from the semi-parametric estimation of model (3) is that models whose orders $R_1, R_2$ do not increase simultaneously are not nested. For example, the two approximation models corresponding to $(R_1, R_2) = (2, 2)$ and are not nested, so the likelihood ratio test is not suitable for selecting among approximation models of different orders. However, we can choose $R_1, R_2$ using the AIC or BIC criteria [11]. This can also show the necessity of semi-parametric estimation.
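Order selection by information criteria can be sketched as follows. The formulas are the standard definitions (AIC = 2p − 2 ln L, BIC = p ln n − 2 ln L); the candidate orders, log-likelihood values, and parameter counts are made-up numbers for illustration only, and only the sample size 27,992 is taken from the article.

```python
import math


def aic(loglik, n_params):
    """Akaike information criterion: smaller is better."""
    return 2.0 * n_params - 2.0 * loglik


def bic(loglik, n_params, n_obs):
    """Bayesian information criterion: smaller is better."""
    return n_params * math.log(n_obs) - 2.0 * loglik


n_obs = 27992  # sample size reported in the article

# Hypothetical fits: (R1, R2) -> (maximized log-likelihood, number of parameters).
candidates = {
    (2, 2): (-22720.0, 20),
    (2, 3): (-22716.5, 21),
    (3, 2): (-22717.0, 21),
}

best_by_aic = min(candidates, key=lambda r: aic(*candidates[r]))
best_by_bic = min(candidates, key=lambda r: bic(*candidates[r], n_obs))
```

Because BIC penalizes each extra parameter by ln n rather than by 2, it tends to favor lower orders than AIC in a sample this large; on these made-up numbers the two criteria select different orders.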

Because happiness is an ordered dependent variable, we estimate an ordered probit model. To ensure consistent parameter estimates, we modify the ordered probit model by using the Hermite series to approximate the distribution of the disturbance term.

We use a semi-parametric estimation method based on the ordered probit model to test the impact of education on happiness. When k ≥ 4, the likelihood ratio test (LR test) indicates that the semi-parametric estimation result is better than the parametric estimation result, and the estimated parameters do not change significantly, so k = 4 is selected. The estimated results are shown in Table 2.

Table 2. Semiparametric ordered probit model estimation results

Variable            Full sample (1)     Full sample (2)     Town (3)            Rural (4)
Edu                 0.102*** (0.013)    0.119*** (0.016)    0.079*** (0.016)    0.145*** (0.021)
Urban               0.214*** (0.028)    0.234*** (0.030)
Age                 0.055*** (0.005)    −0.055*** (0.005)   −0.059*** (0.007)   −0.052*** (0.006)
Age2                0.001*** (0.000)    0.001*** (0.000)    0.001*** (0.000)    0.001*** (0.000)
Edu2_urban                              −0.105* (0.044)
Control Var         Control             Control             Control             Control
Log likelihood      −22717.567          −22714.489          −9861.211           −12831.042
LR                  32.666              32.193              17.315              34.220
p                   0.000               0.000               0.000               0.000
Standard deviation  1.465               1.465               1.476               1.521

Notes: *, **, and *** indicate significance at the 5%, 1%, and 0.1% levels, respectively. Standard deviations are in parentheses. p is the p-value of the LR test. The sample size is 27,992.
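The LR statistics reported in Table 2 can be turned into p-values with a short sketch. It assumes, as an illustration rather than something stated in the article, that the test of the ordered probit against the semi-parametric model with k = 4 has k − 2 = 2 degrees of freedom. The helper uses the closed form of the chi-square survival function, which is valid only for even degrees of freedom.

```python
import math


def chi2_sf_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)**i / i!
        total += term
    return math.exp(-half) * total


def lr_statistic(loglik_unrestricted, loglik_restricted):
    """Likelihood ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


# LR statistics from Table 2, columns (1) to (4).
lr_values = [32.666, 32.193, 17.315, 34.220]
assumed_df = 2  # assumption: k - 2 extra parameters with k = 4

p_values = [chi2_sf_even_df(lr, assumed_df) for lr in lr_values]
```

Under this assumption every p-value is well below 0.001, in line with the 0.000 entries in the table.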

Results (1) and (2) are estimated on the entire sample. Result (1) shows that a higher education level significantly promotes residents’ happiness, and that happiness differs significantly between urban and rural residents. Because of China's long-term urban-rural dual structure, urban and rural residents differ significantly in economic conditions and ideological concepts. So does the effect of education on happiness also differ significantly between urban and rural residents? We add the interaction term of education and town, as shown in result (2). The coefficient of the interaction term is negative and significant [12]. This shows that education is less effective in improving the happiness of urban residents than that of rural residents. This may be because education was popularized earlier among urban residents than among rural residents, and to a greater extent. According to the sample data, the proportion of people with senior high school education and above is 10.3% in rural areas and 32.05% in urban areas. On the principle that what is scarce is valued, the scarcity of high education in rural areas makes education raise happiness more for rural residents than for urban residents.

Further estimates are made on the urban and rural samples separately. Results (3) and (4) are estimated with the semi-parametric ordered probit model. The coefficient of the education variable in the urban sample (0.079) is about half of that in the rural sample (0.145), which is a large difference.

The path of higher education's influence on happiness

The results of the above model show that education significantly improves individuals’ subjective well-being. For those who wish to be educated, education is itself a source of happiness. Education improves people's cognition, helps cultivate a sound world outlook and values, and enhances the self-evaluation of happiness. In addition, existing research shows that education affects happiness indirectly through economic and other factors. The sample statistics show that people with high school education and above make up only 20.54% of the sample. We classify this group as those who have received higher education [13]. We mainly explore the mechanism through which higher education influences individual subjective well-being, and still adopt the semi-parametric estimation of the ordered response probit model, since the previous results show that this method is superior to the ordinary ordered probit. The impact paths are judged by introducing cross-terms, and the results are shown in Table 3.

Estimation result (5) uses the binary variable higher education (Edu2) instead of education (Edu), and the coefficient is still significantly positive, which also shows the robustness of the model. Estimation results (6)–(9) add the cross-terms of education with income (in logarithm), health, work status, and social status, respectively, and each cross-term is statistically significant. In result (6), the coefficient and significance of higher education change little, which shows that higher education does not promote residents’ individual happiness by increasing personal income. In results (7), (8), and (9), the higher education coefficient becomes negative; it is no longer significant in (7) and (8) and is significant only at the 5% level in (9). This shows that higher education improves the individual's subjective well-being by improving the individual's health, work status, and social status.

Table 3. Results of introducing cross-terms to determine the impact path

Variable            (5)                 (6)                 (7)                 (8)                 (9)
Edu2                0.162*** (0.031)    0.117** (0.036)     −0.005 (0.036)      −0.016 (0.036)      −0.078* (0.039)
Urban               0.249*** (0.029)    0.245*** (0.029)    0.219*** (0.028)    0.223*** (0.028)    0.224*** (0.028)
Age                 −0.055*** (0.005)   −0.055*** (0.005)   −0.055*** (0.005)   −0.053*** (0.004)   −0.055*** (0.005)
Age2                0.001*** (0.000)    0.001*** (0.000)    0.001*** (0.000)    0.001*** (0.000)    0.001*** (0.000)
Control Var         Control             Control             Control             Control             Control
edu_linc                                0.004* (0.002)
Edu health                                                  0.026*** (0.004)
Edu job                                                                         0.107*** (0.015)
Edu status                                                                                          0.039*** (0.005)
Log likelihood      −22755.746          −22753.389          −22731.603          −22724.076          −22715.474
LR                  35.960              33.352              38.652              350545              28.105
P                   0.000               0.000               0.000               0.000               0.000
Standard deviation  1.453               1.448               1.447               1.451               1.442

Notes: *, **, and *** indicate significance at the 5%, 1%, and 0.1% levels, respectively. Standard deviations are in parentheses. P is the p-value of the LR test. The sample size is 27,992.
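To see how a cross-term changes the reading of the Edu2 coefficient, the sketch below computes the implied effect of higher education on the latent happiness index at each health level, using the column (7) estimates. It assumes that the ‘Edu health’ row is the product of the binary higher education indicator and the health score, and that health is coded 1 to 5 as in the descriptive statistics; neither assumption is confirmed by the article. The values are effects on the latent index, not on the probability of any happiness category.

```python
def implied_edu2_effect(main_coef, cross_coef, moderator):
    """Latent-index effect of Edu2 when it also enters through a cross-term."""
    return main_coef + cross_coef * moderator


# Column (7) of Table 3: Edu2 main effect and Edu x health cross-term.
main_coef = -0.005
cross_coef = 0.026

# Health levels 1..5 (range taken from the descriptive statistics).
effects_by_health = {
    h: implied_edu2_effect(main_coef, cross_coef, h) for h in range(1, 6)
}
```

Under these assumptions the implied effect is positive at every health level and grows with health, which is one way to read the statement that higher education works through the health channel.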

Conclusion

This article uses the semi-parametric estimation method of the ordered response probit model to test empirically the impact of education on individual subjective well-being. The estimation results show that a higher education level significantly improves the individual's subjective well-being, and that there are significant urban-rural differences: the improvement in happiness is significantly larger for rural residents than for urban residents. Further analysis shows that improving health, getting a job, and improving social status are the three channels through which higher education increases residents’ well-being. Higher education does not improve residents’ happiness by increasing income.

eISSN: 2444-8656
Language: English
Publication timeframe: 2 times per year
Journal Subjects: Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics