Open access

Application and Effectiveness Evaluation of Big Data Technology in International Students’ Chinese Language Learning as a Foreign Language

  
22 Sep 2025

Introduction

Humanity is moving into the era of big data. As computer technology penetrates rapidly and comprehensively into all areas of social life, the amount of storable data and information grows quickly; once it accumulates to the point where it can trigger change, the era of big data arrives [1-2]. Big data has broad prospects in education. Educational big data refers to the massive data generated throughout educational management and educational activities, which can be collected and processed to support the development of education and which, through big data analysis technology, allows the collected data to be evaluated and fed back [3-6]. Schools can then accurately grasp information in all aspects, discover problems and deficiencies in a timely manner, and adjust the teaching management system and methods. Teachers can also identify students' problems in time, improve curriculum design or personalized teaching plans in a targeted manner, and improve their teaching methods and teaching ability while helping students develop comprehensively and raise their academic performance [7-9]. As China's international status and influence grow, Chinese language and culture are attracting more attention, and more and more people want to learn about the Chinese language, Chinese characters and Chinese culture, so that "Chinese language fever" has become a common phenomenon worldwide [10-12]. Today, Chinese language education for foreigners is booming; teaching Chinese is not only a teaching activity but also a way to spread Chinese culture. Chinese language education for foreigners should also seize the opportunities brought by the era of big data, consider how the advantages of big data can support it, innovate the mode of curriculum teaching, renew the concept of professional development, and update the way culture is disseminated [13-14].

To promote remote collaborative communication in teaching Chinese as a foreign language, reference [15] reviewed the relevant literature and practice of the last two decades and discussed the impact of remote collaborative practice on teaching Chinese as a foreign language along four dimensions: organizational skills, pedagogical skills, attitudes and beliefs. Reference [16] examined the effectiveness and usability of scenario-based interactive practice in teaching Chinese expression: students studying Chinese in the U.S. interacted by video with characters in Shanghai and practiced with game-like features, which promoted their Chinese expression. Reference [17] developed an intelligent information system for teaching Chinese as a foreign language that adopts a teaching optimization algorithm based on a feedback mechanism, arranges students' independent learning through local search and feedback, improves the ability to find Chinese teaching resources globally, and meets the needs of Chinese learners. Reference [18] evaluated the readability of Chinese books for foreigners, comprehensively considered the factors of the Chinese language itself that affect the difficulty of reading materials, and extracted the features of these books with a database management system in order to identify and recommend simple teaching and reading materials for Chinese learning. Reference [19] investigated the role of learning strategies and motivation in Chinese learning outcomes in multi-modal learning environments and used Spearman's correlation coefficient to analyze the surveyed students statistically, showing that strategies and motivation have significant effects on multi-modal Chinese learning. Reference [20] explored how visual aids and the way learning materials are presented affect learners studying similar Chinese characters, and found that encoding the key image parts of Chinese characters with refined formulas before learning can improve the accuracy of Chinese writing.

In this paper, we first study the role of data mining technology in Chinese education and teaching, and construct a big data intelligent learning model for Chinese as a foreign language in terms of learning objectives, learning process and learning counseling. A regression model is then established to analyze students' overall satisfaction with Chinese language teaching, and students' Chinese language performance before and after the introduction of big data analysis technology is compared. Finally, the differences between the experimental class and the control class in the psychological distance factors are examined by ANOVA, and the influence of the psychological distance factors on their acquisition of Chinese as a second language is investigated by regression analysis.

The Application of Big Data Technology in Chinese Language Learning for Foreigners
Big Data Mining for Chinese Language International Education

International Chinese language education needs to enrich the Chinese learning space and optimize the Chinese learning system for learners around the world. A platform that provides learners with rich Chinese language resources also aggregates all kinds of learning behavior data, which creates the conditions for data mining in international Chinese language education and allows big data to supply more timely and accurate information for improving international Chinese teaching. Figure 1 shows the data mining process. The learning system of the Chinese learning platform should also be personalized and upgraded with big data technology: it should design learning content and learning paths according to learners' age and basic Chinese level, develop Chinese course modules of different kinds, and emphasize guidance on learning paths in grammatical knowledge, cultural knowledge and language skills. Learning status, learning ability and learning progress are analyzed and accurately assessed from data feedback, and learning data analysis reports are provided for Chinese learners and teachers respectively.

Figure 1.

General process of data mining

The comprehensive collection of education data is the foundation for building educational big data. Like education data in general, data on Chinese language education is generated by the various Chinese teaching activities, mainly online and offline Chinese language teaching. Once the sources are clear, the next stage is to collect the data on Chinese teaching and learning. The data collected from Chinese language teaching can be classified into behavioral data and resource data. Behavioral data covers, first, learners' behavior: process data such as motivation, classroom interactions, Internet searches, and progress on and completion of online assignments, as well as quantifiable structured data such as classroom participation, test scores and proficiency levels. Second, it covers the teaching behavior of Chinese language teachers, such as teaching language, teaching content and teaching interaction. Resource data includes the Chinese learning websites, Chinese learning platforms and multimedia devices used for teaching, together with the resources they store and generate, such as courseware, videos, pictures, texts, games, exercises and test questions.

In the practice of Chinese language education and teaching, teachers and learners continuously generate different types of data through their teaching and learning behaviors and their use of teaching resources. Data mining technology should therefore be used to collect, extract, process and analyze the large amount of data generated in this process, with methods chosen to suit each type of data, in order to mine valuable information and patterns that deepen the understanding of learners' learning processes, improve teaching strategies and enhance the quality of learning.

Learning Analysis of Chinese International Education

The object of learning analytics is not only the massive learning behavior data generated by students in online learning, but also the teaching data generated by teachers in teaching and teaching management. Learning analytics uses data collection and data analysis technologies to extract data of practical value from the massive data related to "teaching and learning" through collection, processing and analysis, helping teachers understand learners and their learning behaviors, carry out learning assessment, identify learning problems in time, and predict future learning trends.

Figure 2 shows the model of international Chinese language education for foreigners based on big data. The learning styles of Chinese learners are categorized using the classification methods of data mining. Suitable teaching forms, interactive methods and learning environments are selected for each type of learner, so as to carry out personalized Chinese teaching, improve the recommendation function of the Chinese learning platform and optimize the design of its modules. At the same time, during Chinese teaching and independent learning, learners' behaviors are recorded: movements, emotions, the language used in answering questions and in dialogues, note taking and the writing of Chinese characters, as well as the number of logins to the platform, the frequency of browsing each type of learning module, the length of study time, and the frequency of joining community discussions and asking questions on the platform. From these data we can understand which learning forms and learning contents interest learners, and update the learner profile dynamically in real time.

Figure 2.

Big-data-based model of international Chinese language education for foreigners

In the dynamic learning process, according to the learners’ mastery of the learning content and the real-time data fed back during the learning process, targeted adjustments are made to the learning progress and learning sessions. Teachers and the system can use technical tools to track learners’ interactive performance, emotional changes, practicing speed and correctness in the classroom, providing timely and effective help for learners. Learners can also obtain visual data on their own learning progress and learning status through the platform, make timely adjustments or seek help from teachers, and formulate reasonable learning strategies based on the information.
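The dynamic learner profile described above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the platform's implementation: the class `LearnerProfile`, its fields and its method names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class LearnerProfile:
    """Hypothetical learner profile updated from platform behavior events."""
    learner_id: str
    logins: int = 0
    minutes_studied: float = 0.0
    module_views: Counter = field(default_factory=Counter)

    def record_login(self) -> None:
        self.logins += 1

    def record_session(self, module: str, minutes: float) -> None:
        # Each study session updates both browsing frequency and study time.
        self.module_views[module] += 1
        self.minutes_studied += minutes

    def preferred_module(self):
        # The most frequently browsed module, or None if nothing was viewed.
        if not self.module_views:
            return None
        return self.module_views.most_common(1)[0][0]


profile = LearnerProfile("student-001")
profile.record_login()
profile.record_session("characters", 25)
profile.record_session("grammar", 10)
profile.record_session("characters", 15)
print(profile.preferred_module(), profile.minutes_studied)
```

A real platform would feed such a profile from its event logs; here the three sessions simply show how browsing frequency and study time accumulate.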

Learning effect regression analysis model
Establishment of multiple linear regression equations

In this paper, a regression analysis model is used to study the effect of big-data-based learning of Chinese as a foreign language on international students. The main principles and steps of the model are described below. Multiple linear regression estimates its coefficients by minimizing the residual sum of squares Q. Because it involves a relatively large number of variables, however, it also has to deal with more complex problems.

Assume that the random variable $y$ is linearly related to $p$ independent variables $x_1, x_2, \cdots, x_p$, that the sample size is $n$, and that the $i$-th observation is $(x_{i1}, x_{i2}, \cdots, x_{ip};\ y_i)$, $i = 1, 2, \cdots, n$. The $n$ observations can then be written in the following form:

$$\begin{cases} y_1=\beta_0+\beta_1x_{11}+\beta_2x_{12}+\cdots+\beta_px_{1p}+\varepsilon_1 \\ y_2=\beta_0+\beta_1x_{21}+\beta_2x_{22}+\cdots+\beta_px_{2p}+\varepsilon_2 \\ \quad\vdots \\ y_n=\beta_0+\beta_1x_{n1}+\beta_2x_{n2}+\cdots+\beta_px_{np}+\varepsilon_n \end{cases} \tag{1}$$

where $\beta_0, \beta_1, \cdots, \beta_p$ are unknown parameters, $x_1, x_2, \cdots, x_p$ are $p$ general variables that can be controlled and measured accurately, and $\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_n$ are random errors. As in univariate linear regression, we assume that the $\varepsilon_i$ are mutually uncorrelated random variables obeying the same normal distribution $N(0, \sigma^2)$ [21-22].

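If we use matrices to represent the system of equations (1), we have:

$$Y=X\beta+\varepsilon \tag{2}$$

where:

$$Y=\begin{pmatrix} y_1\\ y_2\\ \vdots\\ y_n\end{pmatrix},\quad X=\begin{pmatrix}1 & x_{11} & x_{12} & \cdots & x_{1p}\\ 1 & x_{21} & x_{22} & \cdots & x_{2p}\\ \vdots & \vdots & \vdots & & \vdots\\ 1 & x_{n1} & x_{n2} & \cdots & x_{np}\end{pmatrix},\quad \beta=\begin{pmatrix}\beta_0\\ \beta_1\\ \vdots\\ \beta_p\end{pmatrix},\quad \varepsilon=\begin{pmatrix}\varepsilon_1\\ \varepsilon_2\\ \vdots\\ \varepsilon_n\end{pmatrix} \tag{3}$$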

The key step of multiple linear regression analysis is to obtain the estimate $b$ of $\beta$ and so construct the multiple linear regression equation:

$$\hat{y}=b_0+b_1x_1+b_2x_2+\cdots+b_px_p \tag{4}$$

which in turn describes the multiple linear model:

$$y=\beta_0+\beta_1x_1+\beta_2x_2+\cdots+\beta_px_p \tag{5}$$

Significance tests are then performed on the regression equation and on the individual regression coefficients, the fitted regression equation is used for prediction and control, and the system of linear equations that arises in estimating β is solved with elimination transforms and Gaussian elimination [23-24].

Constructing the multiple linear regression equation is essentially a process of estimating the multiple linear model (5) in order to obtain the estimated equation (4). As in univariate linear regression, the basic idea is to determine $b_0, b_1, \cdots, b_p$ by the principle of least squares, so that the residual sum of squares $Q$ between the regression values $\hat{y}_i$ and the observations $y_i$ is minimized. The residual sum of squares

$$Q=\sum_{i=1}^{n}(y_i-\hat{y}_i)^2=\sum_{i=1}^{n}\left[y_i-(b_0+b_1x_{i1}+b_2x_{i2}+\cdots+b_px_{ip})\right]^2 \tag{6}$$

is a non-negative quadratic function of $b_0, b_1, \cdots, b_p$, so it must have a minimum value.

By the principle of extreme values, when $Q$ attains its minimum, $b_0, b_1, \cdots, b_p$ must satisfy:

$$\frac{\partial Q}{\partial b_j}=0 \quad (j=0,1,2,\cdots,p) \tag{7}$$

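From equation (7), the coefficients satisfy:

$$\begin{cases} \sum_{i=1}^{n}\left[y_i-(b_0+b_1x_{i1}+b_2x_{i2}+\cdots+b_px_{ip})\right]=0 \\ \sum_{i=1}^{n}\left[y_i-(b_0+b_1x_{i1}+b_2x_{i2}+\cdots+b_px_{ip})\right]x_{i1}=0 \\ \quad\vdots \\ \sum_{i=1}^{n}\left[y_i-(b_0+b_1x_{i1}+b_2x_{i2}+\cdots+b_px_{ip})\right]x_{ip}=0 \end{cases} \tag{8}$$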

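Eq. (8) is the system of normal equations. It can be transformed into the following form:

$$\begin{cases} nb_0+\left(\sum_{i=1}^{n}x_{i1}\right)b_1+\left(\sum_{i=1}^{n}x_{i2}\right)b_2+\cdots+\left(\sum_{i=1}^{n}x_{ip}\right)b_p=\sum_{i=1}^{n}y_i \\ \left(\sum_{i=1}^{n}x_{i1}\right)b_0+\left(\sum_{i=1}^{n}x_{i1}^2\right)b_1+\left(\sum_{i=1}^{n}x_{i1}x_{i2}\right)b_2+\cdots+\left(\sum_{i=1}^{n}x_{i1}x_{ip}\right)b_p=\sum_{i=1}^{n}x_{i1}y_i \\ \quad\vdots \\ \left(\sum_{i=1}^{n}x_{ip}\right)b_0+\left(\sum_{i=1}^{n}x_{ip}x_{i1}\right)b_1+\left(\sum_{i=1}^{n}x_{ip}x_{i2}\right)b_2+\cdots+\left(\sum_{i=1}^{n}x_{ip}^2\right)b_p=\sum_{i=1}^{n}x_{ip}y_i \end{cases} \tag{9}$$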

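If $A$ denotes the coefficient matrix of the above system of equations, then $A$ is a symmetric matrix, i.e.:

$$A=\begin{pmatrix} n & \sum x_{i1} & \sum x_{i2} & \cdots & \sum x_{ip} \\ \sum x_{i1} & \sum x_{i1}^2 & \sum x_{i1}x_{i2} & \cdots & \sum x_{i1}x_{ip} \\ \vdots & \vdots & \vdots & & \vdots \\ \sum x_{ip} & \sum x_{ip}x_{i1} & \sum x_{ip}x_{i2} & \cdots & \sum x_{ip}^2 \end{pmatrix}=\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{n1} \\ x_{12} & x_{22} & \cdots & x_{n2} \\ \vdots & \vdots & & \vdots \\ x_{1p} & x_{2p} & \cdots & x_{np} \end{pmatrix}\begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}=X'X \tag{10}$$

where all sums run over $i=1,\cdots,n$, $X$ is the structure (design) matrix and $X'$ is the transpose of $X$.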

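The constant terms on the right-hand side of Eq. (9) can also be expressed as a matrix $D$, i.e.:

$$D=\begin{pmatrix} \sum_{i=1}^{n}y_i \\ \sum_{i=1}^{n}x_{i1}y_i \\ \sum_{i=1}^{n}x_{i2}y_i \\ \vdots \\ \sum_{i=1}^{n}x_{ip}y_i \end{pmatrix}=\begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_{11} & x_{21} & \cdots & x_{n1} \\ x_{12} & x_{22} & \cdots & x_{n2} \\ \vdots & \vdots & & \vdots \\ x_{1p} & x_{2p} & \cdots & x_{np} \end{pmatrix}\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}=X'Y \tag{11}$$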

So equation (9) can be written as:

$$Ab=D \tag{12}$$

or:

$$(X'X)b=X'Y \tag{13}$$

If $A$ is of full rank (i.e., its determinant $|A| \neq 0$), then $A$ has an inverse matrix $A^{-1}$, and the least squares estimate of $\beta$ follows from equations (12) and (13) as:

$$b=A^{-1}D=(X'X)^{-1}X'Y \tag{14}$$

These are the regression coefficients of the multiple linear regression equation.
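The normal equations $(X'X)b = X'Y$ can be illustrated with a short, dependency-free sketch. It is an assumption-laden example rather than the computation used in this study: the function names `solve_linear_system` and `fit_normal_equations` are hypothetical, the data are made up, and Gaussian elimination with partial pivoting is used to solve the system instead of forming the inverse.

```python
def solve_linear_system(a, d):
    """Solve a b = d by Gaussian elimination with partial pivoting."""
    n = len(d)
    # Augmented matrix [a | d], copied so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, d)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; the coefficients are not unique")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    b = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * b[c] for c in range(r + 1, n))
        b[r] = (m[r][n] - tail) / m[r][r]
    return b


def fit_normal_equations(xs, ys):
    """Return [b0, b1, ..., bp] solving (X'X) b = X'Y."""
    design = [[1.0] + list(row) for row in xs]  # prepend the intercept column
    k = len(design[0])
    a = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    d = [sum(r[i] * y for r, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(a, d)


# Made-up data generated exactly by y = 1 + 2*x1 + 3*x2.
xs = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
b = fit_normal_equations(xs, ys)
print([round(v, 6) for v in b])
```

Because the example data lie exactly on a plane, the recovered coefficients should match the generating values up to rounding, which makes the sketch easy to check by hand.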

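For ease of computation, $b$ is usually obtained by solving the system of linear equations (9) directly rather than by first computing $(X'X)^{-1}$. The first equation of (9) can be reduced to:

$$b_0=\bar{y}-b_1\bar{x}_1-b_2\bar{x}_2-\cdots-b_p\bar{x}_p \tag{15}$$

where:

$$\begin{cases} \bar{x}_j=\dfrac{1}{n}\sum_{i=1}^{n}x_{ij}, \quad j=1,2,\cdots,p \\ \bar{y}=\dfrac{1}{n}\sum_{i=1}^{n}y_i \end{cases} \tag{16}$$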

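Substituting Eq. (15) into the other equations of Eq. (9) yields:

$$\begin{cases} L_{11}b_1+L_{12}b_2+\cdots+L_{1p}b_p=L_{1y} \\ L_{21}b_1+L_{22}b_2+\cdots+L_{2p}b_p=L_{2y} \\ \quad\vdots \\ L_{p1}b_1+L_{p2}b_2+\cdots+L_{pp}b_p=L_{py} \end{cases} \tag{17}$$

where:

$$\begin{cases} L_{jk}=\sum_{i=1}^{n}(x_{ij}-\bar{x}_j)(x_{ik}-\bar{x}_k)=\sum_{i=1}^{n}x_{ij}x_{ik}-\dfrac{1}{n}\left(\sum_{i=1}^{n}x_{ij}\right)\left(\sum_{i=1}^{n}x_{ik}\right) \\ L_{jy}=\sum_{i=1}^{n}(x_{ij}-\bar{x}_j)(y_i-\bar{y})=\sum_{i=1}^{n}x_{ij}y_i-\dfrac{1}{n}\left(\sum_{i=1}^{n}x_{ij}\right)\left(\sum_{i=1}^{n}y_i\right) \end{cases} \tag{18}$$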

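Written in matrix form, the system of equations (17), with the coefficients defined in (18), becomes:

$$Lb=F \tag{19}$$

where:

$$L=\begin{pmatrix} L_{11} & L_{12} & \cdots & L_{1p} \\ L_{21} & L_{22} & \cdots & L_{2p} \\ \vdots & \vdots & & \vdots \\ L_{p1} & L_{p2} & \cdots & L_{pp} \end{pmatrix},\quad b=\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_p \end{pmatrix},\quad F=\begin{pmatrix} L_{1y} \\ L_{2y} \\ \vdots \\ L_{py} \end{pmatrix} \tag{20}$$

so that:

$$b=L^{-1}F \tag{21}$$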
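The centered form $Lb = F$ can be checked on a small case. The sketch below is illustrative only and assumes exactly two predictors, so the $2\times 2$ system is solved with an explicit determinant; the function name `centered_fit_two_predictors` and the data are hypothetical.

```python
def centered_fit_two_predictors(x1, x2, y):
    """Solve L b = F for p = 2, then recover b0 from the means."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n

    def cross(u, mu, v, mv):
        # Centered cross-product sum, i.e. L_jk or L_jy.
        return sum((a - mu) * (b - mv) for a, b in zip(u, v))

    l11 = cross(x1, m1, x1, m1)
    l12 = cross(x1, m1, x2, m2)
    l22 = cross(x2, m2, x2, m2)
    l1y = cross(x1, m1, y, my)
    l2y = cross(x2, m2, y, my)

    det = l11 * l22 - l12 * l12
    if abs(det) < 1e-12:
        raise ValueError("L is singular; the predictors are collinear")
    # Cramer's rule for the 2x2 system L b = F.
    b1 = (l1y * l22 - l12 * l2y) / det
    b2 = (l11 * l2y - l12 * l1y) / det
    # Intercept from b0 = y_bar - b1 * x1_bar - b2 * x2_bar.
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Made-up data generated exactly by y = 1 + 2*x1 + 3*x2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
b0, b1, b2 = centered_fit_two_predictors(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Centering removes the intercept from the linear system, so only a $p \times p$ system has to be solved and $b_0$ is recovered afterwards from the means.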

Significance Test

Significance test of the regression equation

The hypothesis to be tested is:

$$H_0:\ \beta_1=\beta_2=\cdots=\beta_p=0$$

If $H_0$ holds, $y$ does not vary with any of $x_1, x_2, \cdots, x_p$, and a linear model linking $y$ to these independent variables is not appropriate. If $H_0$ is rejected, at least one of $\beta_1, \beta_2, \cdots, \beta_p$ is nonzero, and $y$ varies linearly with one or more of $x_1, x_2, \cdots, x_p$ [25]. This test therefore examines, from a global perspective, whether $y$ and $x_1, x_2, \cdots, x_p$ are linearly related.

As in univariate linear regression, a test statistic for $H_0$ is constructed by decomposing the total sum of squared deviations $L_{yy}$ into a regression part and a residual part:

$$S_{\text{total}}=L_{yy}=\sum_{i=1}^{n}(y_i-\bar{y})^2=\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2+\sum_{i=1}^{n}(y_i-\hat{y}_i)^2=S_{\text{reg}}+S_{\text{res}}$$

Regression sum of squares:

$$S_{\text{reg}}=\sum_{i=1}^{n}(\hat{y}_i-\bar{y})^2=\sum_{j=1}^{p}b_jL_{jy}$$

Residual sum of squares:

$$S_{\text{res}}=\sum_{i=1}^{n}(y_i-\hat{y}_i)^2=S_{\text{total}}-S_{\text{reg}}=L_{yy}-S_{\text{reg}}$$

From the assumptions above it follows that:

$$S_{\text{res}}/\sigma^2 \sim \chi^2(n-p-1)$$

When $H_0$ holds:

$$S_{\text{reg}}/\sigma^2 \sim \chi^2(p)$$

and $S_{\text{reg}}$ and $S_{\text{res}}$ are independent of each other. Hence, when $H_0$ holds:

$$F=\frac{S_{\text{reg}}/p}{S_{\text{res}}/(n-p-1)} \sim F(p,\ n-p-1)$$

For a given significance level $\alpha$, the computed $F$ value is compared with the critical value:

$$F \geq F_{\alpha}(p,\ n-p-1)$$

If this inequality holds, $H_0$ is rejected at level $\alpha$: there is a significant linear relationship between $y$ and $x_1, x_2, \cdots, x_p$, and the regression equation is significant. Otherwise, the regression equation is regarded as non-significant.
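A minimal sketch of this overall F test follows, using made-up data with a single predictor ($p = 1$). The function name `overall_f_test` is hypothetical, and the critical value is assumed to be looked up in an F table and passed in by the caller rather than computed.

```python
def overall_f_test(y, y_hat, p, f_critical):
    """F test of H0: beta_1 = ... = beta_p = 0 for a least-squares fit."""
    n = len(y)
    y_bar = sum(y) / n
    s_total = sum((yi - y_bar) ** 2 for yi in y)
    s_reg = sum((fi - y_bar) ** 2 for fi in y_hat)
    s_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    f_value = (s_reg / p) / (s_res / (n - p - 1))
    return {
        "s_total": s_total,
        "s_reg": s_reg,
        "s_res": s_res,
        "F": f_value,
        "significant": f_value >= f_critical,
    }


# Made-up data; the fitted values come from a univariate least-squares fit.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# 10.13 is the tabulated 5% critical value of F(1, 3).
result = overall_f_test(y, y_hat, p=1, f_critical=10.13)
print(round(result["F"], 1), result["significant"])
```

The decomposition of the total sum of squares into regression and residual parts only holds for a least-squares fit with an intercept, which is why the fitted values are computed that way here.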

Significance test of regression coefficients

In a multiple regression problem, it is not enough to establish that the regression equation is significant. If the equation is significant, the hypothesis $\beta_1 = \beta_2 = \cdots = \beta_p = 0$ is rejected, but this does not mean that all $\beta_j$ are nonzero; that is, not every independent variable $x_1, x_2, \cdots, x_p$ has a significant effect on the dependent variable $y$. If $\beta_j$ is zero, a change in $x_j$ has no linear effect on $y$, and $x_j$ is said to be non-significant. To ensure the quality of the prediction and control of $y$, it is necessary to test the coefficients, eliminate the non-significant variables, and construct a simpler and more accurate equation.

Testing the significance of a variable $x_j$ is equivalent to testing the hypothesis:

$$H_{0j}:\ \beta_j=0$$

The following section discusses the way the test is performed.

Suppose the $p$-variable regression equation of $y$ on $x_1, x_2, \cdots, x_p$ is:

$$\hat{y}=b_0+b_1x_1+b_2x_2+\cdots+b_px_p$$

As in the previous analysis, the total sum of squared deviations $S_{\text{total}}$ splits into the regression sum of squares $S_{\text{reg}}$ and the residual sum of squares $S_{\text{res}}$:

$$S_{\text{total}}=S_{\text{reg}}+S_{\text{res}}$$

If $x_j$ is eliminated, the $(p-1)$-variable regression equation of $y$ on $x_1, x_2, \cdots, x_{j-1}, x_{j+1}, \cdots, x_p$ can also be computed; with starred symbols denoting the re-estimated coefficients, it is:

$$\hat{y}=b_0^*+b_1^*x_1+\cdots+b_{j-1}^*x_{j-1}+b_{j+1}^*x_{j+1}+\cdots+b_p^*x_p$$

The corresponding decomposition of the total sum of squared deviations is:

$$S_{\text{total}}^{(j)}=S_{\text{reg}}^{(j)}+S_{\text{res}}^{(j)}$$

Because one variable has been removed, the residual sum of squares increases:

$$S_{\text{res}}^{(j)}>S_{\text{res}}$$

The difference is the partial regression sum of squares of variable $x_j$, denoted $Q_j$:

$$Q_j=S_{\text{res}}^{(j)}-S_{\text{res}}$$

It can be shown that, when $H_{0j}$ holds:

$$F_j=\frac{Q_j}{S_{\text{res}}/(n-p-1)} \sim F(1,\ n-p-1)$$

For a given significance level $\alpha$, $H_{0j}$ is rejected when $F_j > F_{\alpha}(1,\ n-p-1)$, and it is concluded that variable $x_j$ has a significant effect on $y$.

In practice, $Q_j$ is calculated as:

$$Q_j=\frac{b_j^2}{c_{jj}}$$

where $c_{jj}$ is the $j$-th diagonal element of the matrix $L^{-1}$.

When the test finds a non-significant variable, it is eliminated, and the least squares estimates are calculated again to construct the new regression equation. Refitting from scratch requires a lot of computation. In fact, the old and new coefficients are related, and the new regression coefficients can be obtained easily from:

$$b_i^*=b_i-\frac{c_{ij}}{c_{jj}}b_j \quad (i \neq j)$$

where $b_i^*$ is the new regression coefficient of $x_i$ among the $p-1$ variables remaining after variable $x_j$ is eliminated, $b_i$ and $b_j$ are the original regression coefficients of $x_i$ and $x_j$, and $c_{ij}$ is the $(i, j)$ element of $L^{-1}$.

Because the regression coefficients are correlated with one another, when several variables are non-significant they must not all be eliminated at once. Only the non-significant variable with the smallest F value is eliminated; the regression equation is then refitted, and the remaining variables are tested again one by one.
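The partial F test and the coefficient update can be sketched for two predictors, where $L^{-1}$ has a closed form. Everything here is illustrative: the data are made up, the helper `cross` is hypothetical, and $x_2$ is dropped purely to demonstrate the update formula, not because it was tested against a critical value.

```python
def cross(u, v):
    """Centered cross-product sum, i.e. L_uv in the notation above."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v))


# Made-up data with two predictors.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [5.3, 6.5, 11.4, 12.1, 17.6, 18.2]
n, p = len(y), 2

l11, l12, l22 = cross(x1, x1), cross(x1, x2), cross(x2, x2)
l1y, l2y, lyy = cross(x1, y), cross(x2, y), cross(y, y)

# C = L^{-1}, written out explicitly for the 2x2 case.
det = l11 * l22 - l12 ** 2
c11, c12, c22 = l22 / det, -l12 / det, l11 / det

# Full-model coefficients b = L^{-1} F and residual sum of squares.
b1 = c11 * l1y + c12 * l2y
b2 = c12 * l1y + c22 * l2y
s_res = lyy - (b1 * l1y + b2 * l2y)

# Partial regression sums of squares Q_j = b_j^2 / c_jj and partial F values.
q = {"x1": b1 ** 2 / c11, "x2": b2 ** 2 / c22}
f_partial = {name: qj / (s_res / (n - p - 1)) for name, qj in q.items()}

# Coefficient of x1 after eliminating x2: b1* = b1 - (c12 / c22) * b2.
b1_star = b1 - (c12 / c22) * b2

print({name: round(f, 2) for name, f in f_partial.items()})
print(round(b1_star, 4))
```

In this two-predictor case the updated coefficient `b1_star` coincides with the slope of the univariate regression of `y` on `x1`, which is what the update formula is meant to deliver without refitting.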

Analysis of the effect of big data technology in learning Chinese as a foreign language
Analysis of Satisfaction with Chinese Learning
Statistical analysis of the overall satisfaction situation

This paper takes the Chinese language learning of 60 international students at College H as the research data. The students used the ordinary Chinese learning mode in the first semester, and the big data Chinese learning mode was introduced in the second semester. Students' satisfaction with big data Chinese learning is analyzed first, with the data collected mainly through questionnaires. Scores of 3 (basically satisfied) and above are treated as leaning toward satisfaction, and the average satisfaction rate is calculated as the share of respondents whose average satisfaction score is 3 or above. The resulting rate is 85.2%, which indicates that more than 85% of the international students surveyed are at least basically satisfied with their online Chinese learning. Table 1 describes overall satisfaction.
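The satisfaction rate defined above is a simple proportion. The sketch below shows the calculation on made-up scores, not the survey data; the function name `satisfaction_rate` is hypothetical.

```python
def satisfaction_rate(mean_scores, threshold=3.0):
    """Share of respondents whose average score is at or above the threshold."""
    if not mean_scores:
        raise ValueError("no respondents")
    satisfied = sum(1 for s in mean_scores if s >= threshold)
    return satisfied / len(mean_scores)


# Made-up average scores for ten respondents on a 1-5 scale.
scores = [4.2, 3.0, 2.8, 3.6, 4.8, 1.9, 3.3, 3.9, 4.1, 3.5]
print(f"{satisfaction_rate(scores):.1%}")
```

Eight of the ten made-up respondents reach the threshold, so the example rate is 80.0%.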

Table 1. Overall description of satisfaction

| Analysis item | Cases | Minimum | Maximum | Mean | Standard deviation |
|---|---|---|---|---|---|
| Learner expectation | 60 | 1 | 5 | 3.65 | 0.65 |
| Perceived quality | 60 | 1.5 | 5 | 3.98 | 0.55 |
| Perceived value | 60 | 1 | 5 | 3.56 | 0.68 |
| Learner satisfaction | 60 | 5 | 5 | 3.86 | 0.23 |
| Willingness to continue learning | 60 | 1.2 | 5 | 3.62 | 1.05 |
| Overall satisfaction | 60 | 1.6 | 5 | 3.73 | 0.69 |

Table 1 shows that the mean of international students' overall satisfaction with Chinese learning is 3.73, with a standard deviation of 0.69, a highest score of 5 and a lowest score of 1.6. The means of the five latent variables of satisfaction, in descending order, are perceived quality (3.98) > learner satisfaction (3.86) > learner expectation (3.65) > willingness to continue learning (3.62) > perceived value (3.56). All five means are greater than 3, indicating a high level of satisfaction with the learning outcome.

Tests for differences in satisfaction

The differences in students' satisfaction were then analyzed. The mean satisfaction of female students was greater than that of male students on every item analyzed; the difference between the male and female means was smallest for learner satisfaction and largest for willingness to continue learning. Comparing the means across variables, the highest mean for both male and female learners was perceived quality, and in both cases it was higher than the expectation held before taking part in online learning, which suggests that both male and female students were fairly satisfied with the process of Chinese language learning.

Second, we used the t-test (in full, the independent-samples t-test) to analyze the differences in the sample data; Table 2 shows the analysis of gender differences in satisfaction. According to the results, learners of different genders differed significantly only in willingness to continue learning (t=2.116, p=0.032<0.05), with female students (3.813) scoring higher than male students (3.455). Overall satisfaction and the other four items showed no significant difference (p>0.05). Overall, therefore, gender has no significant effect on satisfaction, but male and female students do differ significantly in their willingness to continue learning Chinese.
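For reference, the independent-samples t statistic can be computed as in the sketch below. It assumes equal variances in the two groups (the pooled, Student form) and uses made-up scores rather than the study's data; the function name `independent_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(group_a, group_b):
    """Pooled-variance (Student) t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


# Made-up willingness-to-continue scores, not the study's data.
female = [4.0, 3.5, 4.5, 3.0, 4.0]
male = [3.0, 3.5, 2.5, 3.0, 3.0]
t_value, df = independent_t(female, male)
print(round(t_value, 3), df)
```

The resulting statistic is then compared with the critical value of the t distribution for the given degrees of freedom, or converted to a p-value, to decide whether the difference between the two group means is significant.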

Table 2. Gender differences in satisfaction

| Analysis item | Gender | Cases | Mean | Standard deviation | t | p |
|---|---|---|---|---|---|---|
| Overall satisfaction | Male | 60 | 3.680 | 0.820 | 0.845 | 0.312 |
| | Female | 60 | 3.781 | 0.679 | | |
| | Total | 60 | 3.633 | 0.701 | | |
| Learner expectation | Male | 60 | 3.576 | 0.670 | 0.518 | 0.512 |
| | Female | 60 | 3.647 | 0.660 | | |
| Perceived quality | Male | 60 | 3.912 | 0.793 | 0.442 | 0.64 |
| | Female | 60 | 3.982 | 0.669 | | |
| Perceived value | Male | 60 | 3.367 | 0.671 | 1.025 | 0.305 |
| | Female | 60 | 3.536 | 0.752 | | |
| Learner satisfaction | Male | 60 | 3.792 | 0.732 | 0.867 | 0.398 |
| | Female | 60 | 3.811 | 0.707 | | |
| Willingness to continue learning | Male | 60 | 3.455 | 0.694 | 2.116 | 0.032* |
| | Female | 60 | 3.813 | 0.823 | | |

Analysis of achievements

To investigate the impact of this learning model on students' performance, this paper sets up an experimental group and a control group. In the first semester both groups learned in the traditional way. In the second semester, after big data technology was introduced, students in the experimental class used the new learning mode, while students in the control class continued with traditional learning methods. We then analyze the changes in the students' Chinese scores across three examinations: the entrance examination, the final examination of the first semester and the final examination of the second semester. Figure 3 shows the scores of the experimental students in the entrance examination and in the final examination of the first semester. Figure 4 shows the scores of the experimental students in the final examination of the second semester.

Figure 3.

Results of the first tests

Figure 4.

Comparison of the second test results

The figures show that, in the entrance examination, the students' overall performance is close to a normal distribution apart from some low scores (the 20-40 band). In the final examination of the first semester, the overall distribution of scores differs little from that of the entrance examination, with most scores falling in the 60-80 band. In the final examination of the second semester, the number of students with high scores increased markedly, most of them in the 80-100 band, while the number of students with low scores decreased markedly. Comparing the students' scores across the three examinations shows that the overall academic performance of this class improved significantly during big data Chinese language learning.

Analysis of the effect of Chinese as a foreign language acquisition
Analysis of differences in psychological distance

Next, the psychological distance of the students in the two classes after two semesters of study is compared. A one-way ANOVA on the psychological distance factors is carried out for the experimental and control classes. Table 3 shows the basic information about the subjects' psychological distance and its factors.

Table 3. Psychological distance and its factors

| Factor | Group | Number | Mean | Standard deviation | Min | Max |
|---|---|---|---|---|---|---|
| Language shock | Control group | 60 | 3.19 | 0.55 | 0.97 | 4.45 |
| | Experimental group | 60 | 3.86 | 0.51 | 1.95 | 5 |
| Cultural shock | Control group | 60 | 3.58 | 0.52 | 0.95 | 4 |
| | Experimental group | 60 | 3.96 | 0.51 | 2.11 | 5 |
| Instrumental learning motivation | Control group | 60 | 3.99 | 0.56 | 1.02 | 4.8 |
| | Experimental group | 60 | 4.55 | 0.49 | 1.85 | 5 |
| Fusion learning motivation | Control group | 60 | 3.85 | 0.54 | 1.07 | 4.55 |
| | Experimental group | 60 | 4.22 | 0.49 | 2.16 | 5 |
| Language boundary permeability | Control group | 60 | 2.89 | 0.53 | 0.92 | 4.75 |
| | Experimental group | 60 | 3.86 | 0.49 | 1.93 | 5 |
| Psychological distance score | Control group | 60 | 3.62 | 0.51 | 0.98 | 54.5 |
| | Experimental group | 60 | 4.01 | 0.49 | 2.01 | 5 |

First, in terms of the overall psychological distance score, the average score of the experimental class (4.01) was higher than that of the control class (3.62), indicating that the actual "psychological distance" of the students in the experimental class was smaller than that of the international students in the control class. Second, the experimental class scored higher than the control class on every indicator. In descending order of the difference, the indicators are "language boundary permeability" (0.97), "language shock" (0.67), "instrumental learning motivation" (0.56), "cultural shock" (0.38), and "fusion (integrative) learning motivation" (0.37).
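The ranking of the between-group differences can be reproduced directly from the group means reported in Table 3. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
# Group means copied from Table 3: (control, experimental).
means = {
    "Language shock": (3.19, 3.86),
    "Cultural shock": (3.58, 3.96),
    "Instrumental learning motivation": (3.99, 4.55),
    "Fusion learning motivation": (3.85, 4.22),
    "Language boundary permeability": (2.89, 3.86),
}

# Difference = experimental mean - control mean, rounded to two decimals.
gaps = {name: round(exp - ctrl, 2) for name, (ctrl, exp) in means.items()}
ranking = sorted(gaps.items(), key=lambda item: item[1], reverse=True)
for name, gap in ranking:
    print(f"{name}: {gap:.2f}")
```

The smallest difference, 0.37, belongs to the fusion learning motivation factor.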

Table 4 shows the analysis of variance of the participants' psychological distance and each of its factors. The significance level of the "overall psychological distance score" was 0.018, which is less than 0.05 and therefore significant. We thus conclude that there is a statistically significant difference in the "overall psychological distance score" between the students in the control class and those in the experimental class, and that the "psychological distance" of the students in the experimental class is smaller than that of the international students in the control class.

Table 4. Analysis of variance of the participants' psychological distance and its factors

| Dimension | Source | Sum of squares | df | Mean square | F | Sig. |
|---|---|---|---|---|---|---|
| Language shock | Between groups | 0.45 | 1 | 0.67 | 0.63 | 0.453 |
| | Within groups | 54.56 | 76 | 0.78 | | |
| | Total | 55.89 | 77 | | | |
| Cultural shock | Between groups | 0.45 | 1 | 0.43 | 1.68 | 0.206 |
| | Within groups | 20.34 | 76 | 0.27 | | |
| | Total | 21.67 | 77 | | | |
| Instrumental learning motivation | Between groups | 0.37 | 1 | 0.38 | 0.78 | 0.432 |
| | Within groups | 38.99 | 76 | 0.5 | | |
| | Total | 39.86 | 77 | | | |
| Fusion learning motivation | Between groups | 6.68 | 1 | 6.88 | 10.566 | 0.003 |
| | Within groups | 51.69 | 76 | 0.67 | | |
| | Total | 59.76 | 77 | | | |
| Language boundary permeability | Between groups | 22.33 | 1 | 22.16 | 21.388 | 0.000 |
| | Within groups | 81.65 | 76 | 1.06 | | |
| | Total | 103.89 | 77 | | | |
| Psychological distance score | Between groups | 0.87 | 1 | 0.89 | 5.797 | 0.018 |
| | Within groups | 12.05 | 76 | 0.18 | | |
| | Total | 12.93 | 77 | | | |

The significance levels of "language shock", "cultural shock" and "instrumental learning motivation" were 0.453, 0.206 and 0.432 respectively, all greater than 0.05 and therefore not significant. There was thus no significant difference between the control class and the experimental class in the factors "language shock", "cultural shock" and "instrumental learning motivation".

The significance levels of "fusion (integrative) learning motivation" and "language boundary permeability" were 0.003 and 0.000 respectively, both less than 0.05 and therefore significant. There were thus significant differences between the control class and the experimental class in "integrative learning motivation" and "language boundary permeability", and the students in the experimental class scored higher than those in the control class on both factors.

In conclusion, the students in the control class and the experimental class differed significantly in three respects, "overall psychological distance", "integrative learning motivation" and "language boundary permeability", and the students in the experimental class scored higher than those in the control class in each of them.
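The F statistic of a one-way ANOVA, as used for each factor above, can be sketched as follows. The scores are made up and the function name `one_way_anova` is hypothetical; the sketch returns only the F value and the degrees of freedom, which would then be compared with an F table or converted to a significance level.

```python
def one_way_anova(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    n_total, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n_total
    # Between-group variation: group means around the grand mean.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group variation: observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Made-up scores, not the study's data.
control = [3.0, 3.5, 2.5, 3.0, 3.0]
experimental = [4.0, 3.5, 4.5, 3.0, 4.0]
f_value, df_b, df_w = one_way_anova(control, experimental)
print(round(f_value, 3), df_b, df_w)
```

With only two groups, the one-way ANOVA F value equals the square of the pooled independent-samples t statistic, so the two tests lead to the same conclusion.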

Correlation Analysis of Psychological Distance Factors and Chinese Language Acquisition Effects

In this paper, the psychological distance of students in the control class and the experimental class was taken as the independent variable and their Chinese proficiency as the dependent variable in a regression analysis; Table 5 shows the results. The standardized regression coefficient of the “psychological distance score” is positive, indicating a positive correlation: the higher a learner’s psychological distance score, the stronger the learner’s ability to use Chinese and the better the acquisition effect. The significance of this coefficient was 0.018, below 0.05, so there is a statistically significant positive correlation between the psychological distance score and the Chinese language ability of the two groups.

Regression analysis of psychological distance factors and Chinese learning

| Factor | Standardized regression coefficient | Significance |
|---|---|---|
| Language shock | 0.086 | 0.445 |
| Cultural shock | 0.167 | 0.156 |
| Instrumental learning motivation | 0.093 | 0.412 |
| Integrative learning motivation | -0.365 | 0.003 |
| Language boundary permeability | 0.387 | 0.002 |

The regression coefficients of language shock, cultural shock, instrumental learning motivation and language boundary permeability are positive (0.086, 0.167, 0.093 and 0.387, respectively), indicating positive correlations with Chinese acquisition. Read with the scoring convention used here, in which a higher score indicates a smaller psychological distance, this means that the lower a learner’s language shock and cultural shock, the better the acquisition effect; the more positive the instrumental learning motivation, the better the acquisition effect; and the more open the learner’s attitude towards other languages, the better the acquisition effect. The regression coefficient of integrative learning motivation is negative (-0.365), indicating a negative correlation: stronger integrative learning motivation does not imply a better acquisition effect. Only the coefficients of integrative learning motivation and language boundary permeability are significant, at 0.003 and 0.002 respectively, both below 0.05. There is thus a statistically significant positive correlation between language boundary permeability and the Chinese language ability of the two groups of students. The Chinese learning effect of the experimental group is better than that of the control group, and the big data Chinese learning technology proposed in this paper is helpful for learning Chinese as a foreign language.
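As an illustration of what a standardized regression coefficient measures, the sketch below standardizes both variables to z-scores and fits a least-squares line. It covers only the single-predictor case, where the coefficient equals the Pearson correlation; the analysis reported above uses several factors and would call for a multiple regression. The data and the names `z_scores` and `standardized_coefficient` are hypothetical and serve only as an example.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(v - m) / s for v in values]

def standardized_coefficient(x, y):
    """Least-squares slope of z-scored y on z-scored x (single predictor)."""
    zx, zy = z_scores(x), z_scores(y)
    # After standardizing, the intercept is 0 and the slope is
    # sum(zx * zy) / sum(zx ** 2), which equals the Pearson correlation.
    return sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

# Hypothetical data: a factor score and a Chinese proficiency score per student.
permeability = [2.0, 3.0, 3.5, 4.0, 4.5]
proficiency = [61, 70, 72, 80, 88]
beta = standardized_coefficient(permeability, proficiency)
print(round(beta, 3))
```

A positive value means that higher factor scores go with higher proficiency and a negative value the reverse; because both variables are on the same standardized scale, the magnitudes of different factors can be compared directly.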

Conclusion

This paper constructs a big data intelligent learning model for Chinese as a foreign language and uses a regression analysis model to investigate the impact of this technology on the Chinese learning effect of international students. The main conclusions are as follows:

More than 85% of international students rated this Chinese language learning as basically satisfactory or above, indicating a high level of satisfaction with the learning effect.

Students’ achievement in Chinese as a foreign language changed significantly over the course of the study. Before the study, most students’ scores fell in the 60-80 range; after one semester of innovative Chinese learning, most fell in the 80-100 range. This indicates that the big data Chinese learning mode significantly promotes the learning effect of international students.

After the innovative learning, the average “psychological distance” score of students in the experimental class (4.01) was higher than that of students in the control class (3.62), indicating that the actual psychological distance of the experimental class is smaller than that of the control class. The psychological distance factor is significantly correlated with students’ Chinese language acquisition effect, and the results show that the Chinese language learning effect of the experimental group is better than that of the control group; that is, the big data Chinese language learning model helps international students learn Chinese as a foreign language.
