A Study on the Optimisation of Personalised English Learning Paths Based on Big Data Analysis and Its Impact on Teaching Efficiency

In the context of informatization and the digital era, English education in colleges and universities faces both challenges and opportunities. The traditional teaching mode struggles to meet students' diverse and individualized learning needs, so innovation and reform are urgently needed [1-2]. Big data technology is an important part of modern information technology, and its application in education offers new solutions for college English teaching [3-4]. It is therefore necessary to explore teaching modes for college English education based on big data technology, with a view to improving teaching efficiency and quality and better meeting the needs of students and society.

Applying big data technology in education can drive innovation in teaching modes, in particular by enabling personalized teaching and improving teaching efficiency [5]. Most colleges and universities recognize that big data, as an important resource for personalized teaching, can indicate the direction of talent demand and support specific data prediction and analysis. It helps them formulate personalized teaching strategies, develop personalized teaching resources, match students' diverse and individual learning needs, and thereby realize personalized English teaching [6-9]. Big data analysis helps teachers obtain valuable educational information; by mining and interpreting the data in greater depth, teachers gain the guidance and assistance needed to carry out teaching practice activities smoothly [10-12]. With continuing technological innovation, the integration of big data across multiple fields is becoming more pronounced, and big data, as a fusion of various information technologies, will become an important carrier for tiered personalized English teaching [13-14].

Literature [15] examines the effect of a big-data-based personalized English teaching platform on a campus network in practical English teaching activities and finds that the platform makes students' learning more targeted and proactive and improves their comprehensive English ability. Literature [16] proposes a personalized English teaching system that integrates big data and artificial intelligence: big data technology collects data on students' language ability, learning styles, and desired goals, and artificial intelligence algorithms then build a fast-response intelligent learning environment that creates a personalized learning path for each student. Literature [17] designs a Hadoop-based statistical analysis platform for educational data, which describes the business needs of administrators and teachers with use case diagrams and use case tables and records, accumulates, counts, and analyzes students' learning behaviors in order to support efficient teaching activities. Literature [18] uses big data technology to build an English teaching ecosystem characterized by information sharing, high-quality teaching, and personalized learning, helping students update their concepts of English learning, stimulating their interest and initiative, and thus improving their learning outcomes and application skills. Literature [19] constructs an online English teaching platform based on machine learning algorithms and cloud computing technology and uses network simulation technology to meet the needs of multi-user concurrent operation, supporting personalized English teaching in a cloud education environment. Literature [20] integrates computer technology to design an English database system with grammar checking and diagnosis functions, which helps students improve their grammar testing and application ability while stimulating their interest in English learning, and has practical value for personalized teaching.

Keywords: collaborative filtering; neural network; path optimization; English teaching.

This paper constructs a personalized model for optimizing English learning paths based on neurocognitive collaborative filtering. A personalized learning path is generated for each learner from neurocognitive collaborative filtering, cognitive level diagnosis, and the learner's personalized goal parameters. The neurocognitive collaborative filtering model is coupled to the cognitive level diagnostic model: its output is fed into the diagnostic model as one of the input features, together with the learner's historical behavioral data. In this way, learning path recommendation based on learner profiles is realized. After the model is constructed, its performance is tested on datasets, and it is applied to the English classroom of a school to explore its impact on teaching efficiency.

2 Personalized English Learning Path Optimization

2.1 Personalized education based on big data

The changes brought about by big data in education are mainly reflected in three aspects: (1) a shift from the group to the individual, which makes personalized education possible; (2) "de-empirical" teaching strategies, which start from students' real and objective learning data to find the key factors affecting learning quality; and (3) a comprehensive presentation of individual learners' real learning information, which moves learning data from suspicion to certainty and gives researchers an accurate picture of learners' actual learning situation. The development of big data provides objective support for comprehensively recording, tracking, grasping, and visualizing the characteristics, needs, learning abilities, and learning behaviors of different learners. In particular, online teaching practice during the epidemic helped teachers and students accumulate rich and valuable online learning experience and gradually made them realize that big data in online education, with its unique characteristics, will be the cornerstone of the future development of online education.

Changes in teaching and learning styles have become a major trend. The rapid development of information technology and educational big data has turned the blueprint of personalized teaching and learning into reality, using cutting-edge technological tools to track the whole process of student learning. Tracking, describing, and analyzing learners' process data serves not only grading and summative assessment but also helps teachers analyze learners' whole learning process and follow their learning status efficiently, conveniently, and in real time. Cutting-edge technology and educational big data thus play an important role in supporting personalized education. At present, many researchers are committed to developing online learning systems with personalized learning mechanisms, which provide learners with appropriate learning content as a scaffold to help them learn efficiently online. Among these topics, personalized learning paths are one of the most important in personalized learning. However, much existing research on personalized learning path planning systems ignores whether the learner's ability matches the recommended paths, which leads to poor planning results. Therefore, planning personalized learning paths on the basis of a diagnosis of learners' ability, so that path guidance reflects learners' knowledge mastery state, is one of the educational technology problems that needs to be solved in the field of personalized learning.

In this paper, we will combine neural network technology and collaborative filtering algorithm to construct a personalized learning path generation model based on neurocognitive collaborative filtering to realize the optimization of English learning path recommendation based on big data.

2.2 Hybrid Collaborative Filtering Recommendation Algorithm

Hybrid collaborative filtering recommendation is a common recommendation algorithm nowadays. First, the original rating matrix is optimized; then, on the basis of the optimized matrix, user-based collaborative filtering and item-based collaborative filtering are fused.

1) User-based collaborative filtering (UCF)

UCF assumes that two users who like several of the same items have a high degree of similarity. A user's rating of an item directly reflects how much the user likes it, so UCF uses rating data to calculate user similarity and assumes that the higher the similarity between two users, the more likely they are to like the same item. The algorithm is based on users' past behavior, is highly interpretable, and readily discovers new user interests. The more users there are, the richer the recommendation results and the better the overall performance of the algorithm; it can also handle semi-structured and unstructured data and is applicable to multiple domains. However, the algorithm relies entirely on users' rating data and cannot make accurate recommendations for new users who have not yet rated anything. Because users' interests may change over time, the user-item rating matrix must be updated constantly to keep recommendations accurate; moreover, the computational cost becomes large when the number of users is much larger than the number of items.

2) Item-based collaborative filtering (ICF)

Unlike the user-based nearest neighbor recommendation approach, ICF considers two items as similar if multiple different users like them at the same time and thus can recommend another item to a target user who currently likes only one of the items. The advantage of this algorithm, which calculates the similarity between items, is that it can cope well with the recommendation situation where the number of items is much larger than the number of users. However, due to the high similarity between popular items and other items, the algorithm is not conducive to the mining of long-tail items, and the recommendation list often contains a large number of popular items, which makes the recommendation inaccurate; at the same time, it is not possible to compute the similarity between the newly added items and other items, which means that there is a cold-start problem for the items.

3) Mixed Recommendations

As the above shows, UCF and ICF each have advantages and disadvantages, and no single algorithm suits all scenarios. In practice, a single algorithm cannot meet the complex requirements of users or items and is limited in handling problems such as data sparsity and cold start. To compensate for the shortcomings of a single recommendation algorithm, practical personalized learning path recommendation generally combines several recommendation algorithms so as to improve the overall recommendation performance of the system.

2.2.1 Matrix optimization

The sparsity measure is shown in Equation (1):

$$Sparsity = 1 - \frac{r}{m \times n} \tag{1}$$
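As a minimal sketch of Equation (1), the snippet below computes the sparsity of a small rating matrix. It assumes that r is the number of observed ratings and that m and n are the numbers of rows (learners) and columns (items); the text does not define these symbols, so this reading, the function name `sparsity`, and the sample matrix are illustrative assumptions.

```python
from typing import List, Optional


def sparsity(ratings: List[List[Optional[float]]]) -> float:
    """Return 1 - r / (m * n), where r counts the observed (non-None) ratings."""
    m = len(ratings)
    n = len(ratings[0]) if m else 0
    if m == 0 or n == 0:
        raise ValueError("rating matrix must be non-empty")
    r = sum(1 for row in ratings for value in row if value is not None)
    return 1.0 - r / (m * n)


# Hypothetical 3-learner x 4-item matrix; None marks a missing rating.
example = [
    [5.0, None, 3.0, None],
    [None, 4.0, None, None],
    [2.0, None, None, 1.0],
]
print(sparsity(example))  # 5 of 12 cells are filled, so sparsity is 7/12
```

A value close to 1 indicates that most learner-item pairs are unrated, which is the situation the matrix optimization step is meant to mitigate.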

Equation (2) is used to calculate the user's predicted score:

$$P(u,i) = \frac{\sum_{j \in L} sim_{c}(i,j) \times R(u,j)}{|L|} \tag{2}$$

2.2.2 User-based collaborative filtering algorithms

The core of the user-based collaborative filtering algorithm (UCF) is to calculate user similarity and, according to the similarity level, find the set of similar neighboring users [21]. The specific implementation steps are as follows:

1) Construct a user similarity matrix. The Pearson correlation coefficient is used to calculate the similarity between users, which takes full account of each user's scoring habits. The calculation formula is:

$$sim_{user}(u,v) = \frac{\sum_{i \in B_{uv}} (R_{u,i} - \bar{R}_{u}) \cdot (R_{v,i} - \bar{R}_{v})}{\sqrt{\sum_{i \in B_{uv}} (R_{u,i} - \bar{R}_{u})^{2}} \sqrt{\sum_{i \in B_{uv}} (R_{v,i} - \bar{R}_{v})^{2}}} \tag{3}$$

where $B_{uv}$ is the set of items rated by both users $u$ and $v$.

2) Establish the set of similar near-neighbors of the target user. The similarities between users are ranked in descending order, and the users with the highest similarity to the target user are taken as the near-neighbor set of the target user.

3) Predict the score. The score is predicted using Equation (4):

$$P_{user}(u,i) = \bar{R}_{u} + \frac{\sum_{v \in K_{1}} sim_{user}(u,v) \times |R_{v,i} - \bar{R}_{v}|}{\sum_{v \in K_{1}} sim_{user}(u,v)} \tag{4}$$
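The three steps above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: ratings are stored in a hypothetical nested dictionary, each user's mean is taken over all of that user's ratings, only neighbours with positive similarity are kept, and the prediction uses the signed deviation from the neighbour's mean that is common in the collaborative filtering literature, whereas Equation (4) as printed takes the absolute value of that deviation.

```python
from math import sqrt
from typing import Dict, Iterable

Ratings = Dict[str, Dict[str, float]]  # user -> {item: rating}


def mean(values: Iterable[float]) -> float:
    values = list(values)
    return sum(values) / len(values)


def pearson(ratings: Ratings, u: str, v: str) -> float:
    """Pearson similarity over the items rated by both u and v (Eq. 3)."""
    common = set(ratings[u]) & set(ratings[v])
    if not common:
        return 0.0
    mu_u, mu_v = mean(ratings[u].values()), mean(ratings[v].values())
    num = sum((ratings[u][i] - mu_u) * (ratings[v][i] - mu_v) for i in common)
    den_u = sqrt(sum((ratings[u][i] - mu_u) ** 2 for i in common))
    den_v = sqrt(sum((ratings[v][i] - mu_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0
    return num / (den_u * den_v)


def predict_user_based(ratings: Ratings, u: str, item: str, k: int = 2) -> float:
    """Mean-centred prediction from the k most similar users who rated the item."""
    mu_u = mean(ratings[u].values())
    sims = [(pearson(ratings, u, v), v)
            for v in ratings if v != u and item in ratings[v]]
    # Keep only positively correlated neighbours, most similar first.
    neighbours = sorted((s for s in sims if s[0] > 0), reverse=True)[:k]
    den = sum(s for s, _ in neighbours)
    if den == 0:
        return mu_u  # no usable neighbour: fall back to the user's own mean
    num = sum(s * (ratings[v][item] - mean(ratings[v].values()))
              for s, v in neighbours)
    return mu_u + num / den


# Hypothetical ratings of learning resources x, y, z, w by learners a, b, c.
demo: Ratings = {
    "a": {"x": 5, "y": 3, "z": 4},
    "b": {"x": 4, "y": 2, "z": 3, "w": 5},
    "c": {"x": 1, "y": 5, "w": 2},
}
print(predict_user_based(demo, "a", "w"))
```

In this toy data only learner b is positively correlated with learner a, so the prediction for item w is a's mean rating plus b's deviation on w. Note that mean-centred predictions can fall outside the rating scale and may need clipping in practice.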

2.2.3 Item-based collaborative filtering algorithms

The steps to implement the item-based collaborative filtering algorithm (ICF) are as follows:

1) Construct the item similarity matrix. In this paper, the optimized scoring matrix is used to calculate the cosine similarity between items:

$$sim_{book}(i,j) = \frac{\sum_{u=1}^{m} R_{u,i} \times R_{u,j}}{\sqrt{\sum_{u=1}^{m} (R_{u,i})^{2}} \times \sqrt{\sum_{u=1}^{m} (R_{u,j})^{2}}} \tag{5}$$

2) Establish the set of nearest neighbors of the target item. The similarities between items are arranged in descending order, and the items with the highest similarity to the target item are selected as its nearest-neighbor set.

3) Predict the score. The score is predicted according to Equation (6):

$$P_{book}(u,i) = \frac{\sum_{j \in K_{2}} sim_{book}(i,j) \times R_{u,j}}{\sum_{j \in K_{2}} sim_{book}(i,j)} \tag{6}$$
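A minimal sketch of Equations (5) and (6) follows. It assumes a dense rating matrix stored as a list of rows (one per user) in which 0 marks an unrated item, and it restricts the neighbour set K_2 to items that the target user has actually rated; both choices, as well as the function names and sample data, are illustrative assumptions rather than details given in the text.

```python
from math import sqrt
from typing import List

Matrix = List[List[float]]  # rows are users, columns are items; 0 = unrated


def cosine_item_similarity(R: Matrix, i: int, j: int) -> float:
    """Cosine similarity between item columns i and j (Eq. 5)."""
    num = sum(row[i] * row[j] for row in R)
    den = sqrt(sum(row[i] ** 2 for row in R)) * sqrt(sum(row[j] ** 2 for row in R))
    return num / den if den else 0.0


def predict_item_based(R: Matrix, u: int, i: int, k: int = 2) -> float:
    """Similarity-weighted average of user u's ratings on the k nearest items (Eq. 6)."""
    candidates = [(cosine_item_similarity(R, i, j), j)
                  for j in range(len(R[u])) if j != i and R[u][j] != 0]
    neighbours = sorted(candidates, reverse=True)[:k]
    den = sum(s for s, _ in neighbours)
    if den == 0:
        return 0.0  # no rated neighbour item is similar to the target
    return sum(s * R[u][j] for s, j in neighbours) / den


# Hypothetical 3-user x 3-item matrix; user 0 has not rated item 2.
R_demo: Matrix = [
    [5, 4, 0],
    [4, 5, 3],
    [1, 2, 5],
]
print(predict_item_based(R_demo, 0, 2))
```

Because the prediction is a weighted average with non-negative weights, it always lies between the smallest and largest ratings the user gave to the neighbour items.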

2.2.4 Hybrid collaborative filtering recommendation algorithms

No recommendation algorithm can be applied to all datasets. UCF, which makes recommendations from user similarity, is more accurate on datasets where the number of users is larger than the number of items; conversely, ICF is more suitable for datasets where the number of items is larger than the number of users. Learning path optimization therefore commonly adopts hybrid collaborative filtering recommendation (UCF-ICF), which fuses the predicted scores of the user-based and item-based collaborative filtering algorithms through a weighting coefficient β and then makes the final recommendation from the fused scores. The fused predicted score is computed as:

$$P(u,i) = \beta P_{user}(u,i) + (1 - \beta) P_{book}(u,i) \tag{7}$$
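The weighted fusion in Equation (7) can be sketched as follows. The restriction of β to the interval [0, 1], the helper names `fuse` and `recommend`, and the sample scores are assumptions made for illustration; the text does not state how β is chosen.

```python
from typing import Dict, List


def fuse(p_user: float, p_item: float, beta: float) -> float:
    """Fused score beta * P_user + (1 - beta) * P_book (Eq. 7)."""
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta is assumed to lie in [0, 1]")
    return beta * p_user + (1.0 - beta) * p_item


def recommend(p_user: Dict[str, float], p_item: Dict[str, float],
              beta: float, top_n: int) -> List[str]:
    """Rank the items scored by both predictors by their fused score."""
    fused = {i: fuse(p_user[i], p_item[i], beta) for i in p_user if i in p_item}
    return sorted(fused, key=lambda i: fused[i], reverse=True)[:top_n]


# Hypothetical predicted scores for three learning resources.
user_scores = {"a": 4.0, "b": 2.0, "c": 3.0}
item_scores = {"a": 1.0, "b": 5.0, "c": 3.0}
print(recommend(user_scores, item_scores, beta=0.5, top_n=2))
```

Setting β to 1 reduces the hybrid to pure UCF and setting it to 0 reduces it to pure ICF, so β could be tuned on validation data according to the user-to-item ratio of the dataset.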

2.3 Learning paths based on neurocognitive collaborative filtering

2.3.1 Learner Profile Modeling

Using existing learner data and cognitive characteristics helps infer the learning preferences and needs of new learners, which enables the generation of personalized learning paths and improves the applicability and universality of the system. The schematic diagram of learner portrait modeling is shown in Figure 1.

1) Quantitative representation based on Felder-Silverman learning styles

Learning style refers to the preferences, habits and tendencies shown by individuals in the learning process, as well as their attitudes and behavioral approaches to learning activities. Learning styles cover the characteristics of how individuals approach learning tasks, how they acquire and process information, and how they choose and utilize learning environments.

The Felder-Silverman model describes learning style along four dimensions, each with two opposite poles: active versus reflective (contemplative), sensing (perceptive) versus intuitive, visual versus verbal, and sequential versus global (integrative). For each learner, the sequence of learning behaviors can be represented as x₁, x₂, ⋯, x_n, where x_i represents the i-th learning behavior pattern. Learning styles can be represented as S₁, S₂, ⋯, S_n, where S_i represents the i-th learning style. The quantified values of the patterns are divided into three levels: low (L), medium (M), and high (H). According to the above definition, the three levels satisfy Equation (8):

$$\max S_{i}\left(f(x)_{i} \mid x_{i}(L), x_{i}(M), x_{i}(H)\right) = \frac{\frac{1}{\sqrt{2\pi\sigma}} e^{-\frac{(a - \mu)^{2}}{2\sigma^{2}}}}{\sum_{i=1}^{n} \int_{L}^{H} f(x)_{i}\,dx} \tag{8}$$

2) Learner feature clustering

The analysis of learner data using clustering algorithms is an important technique in intelligent education. This method collects data in multiple dimensions, such as learners’ basic attributes, learning style characteristics, and learning outcomes, and processes these data using clustering algorithms to achieve effective classification of learners and mining of hidden information.

First, from the basic information provided by learners, this paper assigns a unique ID to each learner as an index and constructs the learner set Stu = {ID₁, ID₂, ⋯, ID_n}, where n is the total number of learners. The ID of each learner can be regarded as a vector containing a three-dimensional profile, denoted as $ID_{i} = \left[\alpha^{\top}, \beta^{\top}, \gamma^{\top}\right]$, where $\alpha^{\top}$, $\beta^{\top}$, and $\gamma^{\top}$ represent the learner's basic attributes, learning process, and learning outcome, respectively. This representation satisfies the conditions shown in Equation (9):

$$\begin{cases} \alpha = [ID, Age, Sex, Edu]^{\top} \\ \beta = [\lambda_{1}, \lambda_{2}]^{\top} \\ \gamma = [Rec, Res, Pro, Tar]^{\top} \end{cases} \tag{9}$$

Here the column vectors α and γ hold the feature values in the dimensions of basic attributes and learning outcomes, respectively, and β is composed of the eigenvalues λ₁, λ₂ of the two-dimensional matrix formed by the learning features β₁ and the learning overhead β₂, which satisfy Equation (10):

$$\begin{cases} \beta_{1} = [Are, Sty]^{\top} \\ \beta_{2} = [Exp, Tim]^{\top} \end{cases} \tag{10}$$

Then, the K-means algorithm is applied to the feature values corresponding to each ID in the learner collection: the spatial distances are computed, the maximum similarity is found, and each of the basic vectors above is taken in turn as the center vector to obtain the center clusters. For example, taking age as the clustering center vector in the learner collection Stu, the resulting cluster can be expressed as Equation (11):

$$N_{Age} = \left\{ ID_{1}\left(\alpha_{1}^{\top}, \beta_{1}^{\top}, \gamma_{1}^{\top}\right), ID_{2}\left(\alpha_{2}^{\top}, \beta_{2}^{\top}, \gamma_{2}^{\top}\right), \cdots, ID_{n} \right\} \tag{11}$$

In the K-means algorithm, the distance between two feature vectors is usually calculated using the Euclidean distance, one of the most commonly used distance measures for continuous feature values. The Euclidean distance d between two feature vectors x and y can be expressed as Equation (12):

$$d(x,y) = \sqrt{\sum_{i=1}^{n} (x_{i} - y_{i})^{2}} \tag{12}$$
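To make the clustering step concrete, the sketch below runs a plain K-means (Lloyd) iteration with the Euclidean distance of Equation (12). The two-feature learner vectors, the hand-picked initial centroids, and the fixed iteration count are hypothetical choices for illustration; the text does not specify how the learner features are scaled or how the centers are initialized.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def euclidean(x: Vector, y: Vector) -> float:
    """Euclidean distance between two feature vectors (Eq. 12)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def kmeans(points: List[Vector], centroids: List[Vector],
           iterations: int = 10) -> Tuple[List[int], List[List[float]]]:
    """Alternate nearest-centroid assignment and centroid recomputation."""
    centers = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each learner joins the closest cluster centre.
        labels = [min(range(len(centers)), key=lambda c: euclidean(p, centers[c]))
                  for p in points]
        # Update step: each centre moves to the mean of its members.
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


# Hypothetical learners described by (age, normalized quiz score).
learners = [[18, 0.2], [19, 0.3], [35, 0.8], [36, 0.9]]
labels, centers = kmeans(learners, centroids=[[18, 0.2], [36, 0.9]])
print(labels, centers)
```

In this toy example age dominates the distance because it is on a much larger scale than the score, which is why features are usually normalized before clustering.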

2.3.2 Neural Collaborative Filtering Layer

The core idea of collaborative filtering is to discover the correlation between learners and knowledge points from learner behavior data, such as studying, reviewing, and taking quizzes on knowledge points, in order to make personalized recommendations. The key is that user behavior data alone are used to discover the correlation between users and items, without explicitly characterizing users or items beforehand.

1) Similarity calculation

The first layer represents three different learners, the second layer is five different unlearned knowledge points, the third layer is the knowledge point fusion coding layer, and the fourth layer is the learner relationship embedding layer [22]. In the knowledge point fusion coding layer, the fusion coding of a knowledge point is calculated as a weighted sum over all learner embedding vectors and the knowledge point embedding vector; the weighting accounts for the different importance of different learners for the same knowledge point. Specifically, Equation (13) is used to calculate the fusion coding of knowledge points:

$$F_{j} = \sum_{i=1}^{m} \alpha_{ij} \cdot \sigma\left(U_{i} \cdot P_{j}^{\top}\right) \tag{13}$$

Here F_j denotes the fusion coding of knowledge point j, α_ij is the weight of learner i on knowledge point j, U_i is the embedding vector of learner i, P_j is the embedding vector of knowledge point j, and $\sigma(x) = \frac{1}{1 + e^{-x}}$ is the sigmoid function. The formula fuses the information of a knowledge point through the learners' embeddings and the corresponding weights to obtain a comprehensive fusion coding of the knowledge point.
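A small numerical sketch of Equation (13) is given below. The embedding values and the weights α_ij are made up for illustration, and the weights are assumed to be supplied from elsewhere (for example, learned by the network); the text does not say how they are obtained.

```python
from math import exp
from typing import List, Sequence


def sigmoid(x: float) -> float:
    """Numerically stable logistic function 1 / (1 + e^-x)."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    z = exp(x)
    return z / (1.0 + z)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def fusion_coding(U: List[Sequence[float]], P_j: Sequence[float],
                  alpha_j: Sequence[float]) -> float:
    """F_j = sum_i alpha_ij * sigmoid(U_i . P_j) (Eq. 13)."""
    return sum(a * sigmoid(dot(u, P_j)) for a, u in zip(alpha_j, U))


# Hypothetical embeddings of three learners and one knowledge point.
U_demo = [[0.2, 0.1], [0.5, -0.3], [-0.4, 0.9]]
P_demo = [1.0, 0.5]
alpha_demo = [0.5, 0.3, 0.2]  # assumed per-learner weights for this knowledge point
print(fusion_coding(U_demo, P_demo, alpha_demo))
```

Because each sigmoid term lies strictly between 0 and 1, F_j is bounded by the sum of the weights, so weights that sum to one keep the fusion coding in the unit interval.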

After the knowledge points are fused to encode the learner vectors in the fourth layer, a similarity matrix is used to represent the similarity between learners, and the encoding of the target learner is adjusted by a weighted sum of the encodings of similar learners, where S_ij denotes the similarity between learner i and learner j. This similarity matrix is then used to compute the new encoding of the learner vectors as in Equation (14):

$$U'_{i} = \sum_{j=1}^{m} S_{ij} \cdot U_{j} + \sum_{k=1}^{n} \gamma_{ik} \cdot F_{k} \tag{14}$$

2) Generation of Predictive Rating Matrix

In the field of education and teaching, collaborative filtering-based methods can be applied to predict learners’ ratings of unevaluated knowledge points in order to generate a personalized list of knowledge point recommendations.

Suppose the sequence of learner feature vectors is denoted as U₁, U₂, ⋯, U_N and the sequence of knowledge point feature vectors is denoted as P₁, P₂, ⋯, P_N. Then the rating of knowledge point j by learner i can be predicted from the similarity between the learner's feature vector U_i and the knowledge point's feature vector P_j, as represented by Equation (15):

$$\hat{r}_{ij} = \left( \sum_{k=1}^{N} U_{ik} \cdot P_{jk}^{\top} + b \right) \cdot E \tag{15}$$

Where b is the bias term and E is the unit matrix that recommends only the most suitable knowledge points to the learner each time.
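The prediction in Equation (15) can be illustrated with a short sketch. The score is computed as the inner product of the learner and knowledge point feature vectors plus the bias b; the role of E is interpreted here as keeping only the single highest-scoring unlearned knowledge point, which is an assumption about the text rather than something it states precisely. The knowledge point names and vectors are hypothetical.

```python
from typing import Dict, Iterable, Sequence, Tuple


def predict_rating(u: Sequence[float], p: Sequence[float], b: float) -> float:
    """Inner product of learner and knowledge point features plus the bias term."""
    return sum(x * y for x, y in zip(u, p)) + b


def most_suitable(u: Sequence[float], points: Dict[str, Sequence[float]],
                  b: float = 0.0,
                  learned: Iterable[str] = ()) -> Tuple[str, Dict[str, float]]:
    """Score every unlearned knowledge point and return the top one with all scores."""
    skip = set(learned)
    scores = {name: predict_rating(u, vec, b)
              for name, vec in points.items() if name not in skip}
    if not scores:
        raise ValueError("no unlearned knowledge point left to recommend")
    best = max(scores, key=lambda name: scores[name])
    return best, scores


# Hypothetical learner vector and knowledge point vectors.
learner = [1.0, 2.0]
knowledge_points = {
    "tense": [0.5, 0.5],
    "articles": [1.0, 1.0],
    "passive": [2.0, 0.0],
}
best, scores = most_suitable(learner, knowledge_points, b=0.1)
print(best, scores)
```

Passing the names of already mastered knowledge points through `learned` lets the same function be called repeatedly to build a path one step at a time.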

2.3.3 Cognitive level diagnostic layer

The cognitive level diagnostic layer focuses on analyzing and decomposing the eigenvalues of learners and knowledge points to assess learners’ mastery of specific aspects of different knowledge points [23]. This part is divided into two main parts: static parameter estimation and dynamic data analysis.

The static parameter estimation component allows a fine-grained assessment of the student's level of competence in each skill and permits interactions between the requisites to be considered. The dynamic data analysis part, in turn, compensates for the lack of attention to learning dynamics by using the feature interaction matrix as the parameter input and the learner's personalized learning path as the output of model training.

1) Static parameter estimation

In the static parameter estimation part, U_ik denotes the response of learner i to test k, and P_kj denotes the examination of knowledge point j by test k, so the mastery of knowledge point j by learner i can be expressed as a score matrix y_ij, as in Equation (16):

$$y_{ij} = \begin{bmatrix} S_{11} & S_{12} & \cdots & S_{1j} \\ S_{21} & S_{22} & \cdots & S_{2j} \\ \vdots & \vdots & \ddots & \vdots \\ S_{i1} & S_{i2} & \cdots & S_{ij} \end{bmatrix} \tag{16}$$

Assuming that topic j belongs to a given knowledge point, each element S_ij of the score matrix y_ij can be defined as a binary variable indicating whether learner i has mastered knowledge point j, as in Equation (17):

$$S_{ij} = \begin{cases} 1, & \text{if } \sum_{k} U_{ik} \times P_{kj} > b_{ij} \\ 0, & \text{otherwise} \end{cases} \tag{17}$$

The static parameter estimation feature parameter b_ij is introduced here as the a priori information for the dynamic data analysis and is defined in Equation (18):

$$b_{ij} = b_{i} + b_{j} = \frac{1}{J} \times \sum_{j=1}^{J} S_{ij} + \frac{1}{I} \times \sum_{i=1}^{I} S_{ij} \tag{18}$$
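Equations (17) and (18) can be sketched together as below. Because the threshold b_ij is defined from the mastery indicators S_ij while S_ij is itself defined through b_ij, this sketch assumes that an initial mastery matrix (for example, from a previous diagnostic round) is available to compute the thresholds; that ordering, the function names, and the sample matrices are illustrative assumptions.

```python
from typing import List

Matrix = List[List[float]]


def prior_thresholds(S: Matrix) -> Matrix:
    """b_ij = b_i + b_j: learner i's mastery rate plus knowledge point j's mastery rate (Eq. 18)."""
    I, J = len(S), len(S[0])
    b_i = [sum(row) / J for row in S]
    b_j = [sum(S[i][j] for i in range(I)) / I for j in range(J)]
    return [[b_i[i] + b_j[j] for j in range(J)] for i in range(I)]


def mastery(U: Matrix, P: Matrix, b: Matrix) -> List[List[int]]:
    """S_ij = 1 when sum_k U_ik * P_kj exceeds the threshold b_ij, else 0 (Eq. 17)."""
    I, K, J = len(U), len(P), len(P[0])
    return [[1 if sum(U[i][k] * P[k][j] for k in range(K)) > b[i][j] else 0
             for j in range(J)]
            for i in range(I)]


# Hypothetical data: 2 learners, 2 tests, 2 knowledge points.
S_prev = [[1, 0], [1, 1]]    # assumed mastery matrix from an earlier round
U_resp = [[1, 1], [0, 1]]    # learner-by-test responses (1 = correct)
P_cover = [[1, 0], [1, 1]]   # test-by-knowledge-point coverage
b_demo = prior_thresholds(S_prev)
print(b_demo)
print(mastery(U_resp, P_cover, b_demo))
```

With these numbers only learner 0 clears the threshold on knowledge point 0, which shows how a higher prior mastery rate raises the bar for being counted as having mastered a point.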

Therefore, it is only necessary to filter out the first q knowledge points for which the proportion mastered by that learner is less than the proportion mastered by all learners, i.e., the set of explicitly weak knowledge points of that learner, which can be expressed as Equation (19):

$$X_{ij}(i) = \left\{ j : S_{ij} = 1,\ rank\left( \frac{b_{i}}{\sum_{i=1}^{I} S_{ij}} \right) < \frac{1}{q} \cdot \sum_{i=1}^{q} S_{ij} \right\} \tag{19}$$

2) Dynamic parameter analysis

In the process of dynamic learning data analysis, each component of the learner-knowledge point feature vector interaction matrix output by the neural collaborative filtering layer is weighted and averaged in turn to obtain the interaction matrix, which can be expressed as Equation (20):

$$R_{ij} = \sigma\left( w_{1} \cdot \frac{1}{n} \sum_{i=1}^{n} (\beta_{1,i} \cdot d_{ij}) + w_{2} \cdot \frac{1}{n} \sum_{i=1}^{n} (\beta_{2,i} \cdot c_{ij}) \right) \tag{20}$$

Next, an SVD decomposition is performed on the interaction matrix R_ij. Each learner is associated with a real-valued vector of latent features of the knowledge points, which models the bidirectional interaction between the learner and the latent factors of the knowledge points; because the dimensions of the latent space are independent of one another, they can be combined linearly with equal weights. Thus, the learner low-dimensional feature projection matrix R_i and the learner behavioral data latent feature matrix R_j can be expressed as Equation (21):

$$R_{ij} = \sum_{k=1}^{K} R_{ik} \cdot R_{jk} + \epsilon \tag{21}$$

Thus, the set of implicitly weak knowledge points Z_ij can be represented as a quintuple satisfying Equation (22):

$$Z_{ij}(i) = \left\{ j : \max_{\theta} \sum_{(i,j) \in \mathbb{R}} \log p\left( z_{i,j} \mid b_{i,j}, R_{i}, R_{j}, \sigma_{R}, \theta_{k} \right) \right\} \tag{22}$$

2.3.4 Learning path generation layer

Based on the above analysis, the basic characteristic parameters of the learner have been obtained through the neural collaborative filtering layer and the cognitive level diagnostic layer, so the fuzzy cognitive diagnostic result of the learner under the model can be expressed as Equation (23):

$$\hat{y}_{ui} = \rho \cdot X_{ij} + (1 - \rho) \cdot Z_{ij} \tag{23}$$

Since X_ij is a static covariate that remains almost unchanged while the learning session interactions are recorded, only the dynamic parameter estimation part, i.e., Z_ij, needs to be updated and self-repaired. Its final objective function can be expressed as Equation (24):

$$\mathcal{L} = \frac{1}{2} \sum_{(i,j) \in R_{ij}} \left( R_{ij} - \beta_{1}^{\top} \beta_{2} \right)^{2} + \frac{\lambda_{1}}{2} \left\| \beta_{1} \right\|^{2} + \frac{\lambda_{2}}{2} \left\| \beta_{2} \right\|^{2} \tag{24}$$

Here λ₁ and λ₂ are the regularization parameters. Using stochastic gradient descent, the learner feature matrix β₁ and the knowledge point feature matrix β₂ can be updated. Specifically, for each (i, j), the update formulas are given in Equations (25) and (26):

$$\beta_{1i} \leftarrow \beta_{1i} + \alpha \left( e_{ij} \beta_{2j} - \lambda_{1} \beta_{1i} \right) \tag{25}$$

$$\beta_{2j} \leftarrow \beta_{2j} + \alpha \left( e_{ij} \beta_{1i} - \lambda_{2} \beta_{2j} \right) \tag{26}$$

where $e_{ij} = R_{ij} - \beta_{1i}^{\top} \beta_{2j}$ is the prediction error implied by Equation (24) and α is the step size.
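The updates in Equations (25) and (26) can be sketched as one stochastic gradient descent pass over the observed entries of the interaction matrix. The observed values, the two-dimensional latent space, the initial feature values, and the hyperparameters (step size 0.05, regularization 0.01, 200 passes) are all hypothetical; the error term is taken as R_ij minus the inner product of the two feature vectors, which follows from differentiating Equation (24).

```python
from typing import List, Sequence, Tuple

Observation = Tuple[int, int, float]  # (learner index, knowledge point index, R_ij)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def loss(observed: List[Observation], B1: List[List[float]], B2: List[List[float]],
         lam1: float = 0.01, lam2: float = 0.01) -> float:
    """Regularized squared error of Eq. (24) over the observed entries."""
    squared = sum((r - dot(B1[i], B2[j])) ** 2 for i, j, r in observed)
    reg1 = sum(x * x for row in B1 for x in row)
    reg2 = sum(x * x for row in B2 for x in row)
    return 0.5 * squared + 0.5 * lam1 * reg1 + 0.5 * lam2 * reg2


def sgd_epoch(observed: List[Observation], B1: List[List[float]], B2: List[List[float]],
              alpha: float = 0.05, lam1: float = 0.01, lam2: float = 0.01) -> None:
    """Apply the updates of Eqs. (25) and (26) once per observed entry, in place."""
    for i, j, r in observed:
        e = r - dot(B1[i], B2[j])  # prediction error e_ij
        old_b1 = B1[i][:]          # keep the pre-update learner vector for Eq. (26)
        B1[i] = [p + alpha * (e * q - lam1 * p) for p, q in zip(B1[i], B2[j])]
        B2[j] = [q + alpha * (e * p - lam2 * q) for p, q in zip(old_b1, B2[j])]


# Hypothetical observed interactions for 3 learners and 3 knowledge points.
observed = [(0, 0, 0.9), (0, 1, 0.2), (1, 0, 0.8), (1, 2, 0.4), (2, 1, 0.3), (2, 2, 0.7)]
beta1 = [[0.1, 0.2], [0.2, 0.1], [0.1, 0.1]]  # learner features
beta2 = [[0.2, 0.1], [0.1, 0.2], [0.1, 0.1]]  # knowledge point features
before = loss(observed, beta1, beta2)
for _ in range(200):
    sgd_epoch(observed, beta1, beta2)
after = loss(observed, beta1, beta2)
print(before, after)
```

Comparing the loss before and after training is a simple sanity check that the sign conventions of the two update rules match the objective.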

A gradient-based update is also used to maximize the log-likelihood function and thereby update the model parameters. For each model parameter θ_k, its gradient can be calculated by the backpropagation algorithm as in Equation (27): (27) $$\frac{{\partial Z}}{{\partial {\theta _k}}}\: = \:\frac{{\partial Z}}{{\partial \hat p}}\:\frac{{\partial \hat p}}{{\partial {\theta _k}}}$$

Where Z is the likelihood function in Eq. (22), $\hat p$ is the probability predicted by the model, and θ_k is the k-th model parameter. The backpropagation algorithm can recursively compute $\frac{\partial Z}{\partial \hat p}$ and $\frac{\partial \hat p}{\partial \theta_k}$ and multiply them by the chain rule to obtain $\frac{\partial Z}{\partial \theta_k}$.

The model parameters can then be updated based on the gradient using gradient descent as in Eq. (28): (28) $${\theta _k} = {\theta _k} - \eta \frac{{\partial L}}{{\partial {\theta _k}}}$$

Where η is the learning rate and controls the size of the update step. By continuously iterating the above update process, the optimal model parameters can be obtained.
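The chain rule in Eq. (27) and the update in Eq. (28) can be sketched on a toy model. The sketch makes two assumptions that the text does not state: the predicted probability is a sigmoid of a linear score, and the loss L in Eq. (28) is the negative log-likelihood, L = −Z, so that descending on L maximizes Z. All names and data are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def log_likelihood(theta, X, z):
    """Bernoulli log-likelihood Z of binary outcomes z given features X."""
    p = sigmoid(X @ theta)
    return np.sum(z * np.log(p) + (1 - z) * np.log(1 - p))

def grad_log_likelihood(theta, X, z):
    """dZ/dtheta via the chain rule of Eq. (27)."""
    p = sigmoid(X @ theta)
    dZ_dp = z / p - (1 - z) / (1 - p)        # dZ / dp_hat, one value per sample
    dp_dtheta = (p * (1 - p))[:, None] * X   # dp_hat / dtheta_k, per sample
    return dZ_dp @ dp_dtheta                 # sum over samples

def fit(X, z, eta=0.1, steps=200):
    """Gradient descent of Eq. (28) on the assumed loss L = -Z."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        dL_dtheta = -grad_log_likelihood(theta, X, z)
        theta = theta - eta * dL_dtheta
    return theta

if __name__ == "__main__":
    # Hypothetical data: a bias column plus one feature, six responses.
    X = np.array([[1, 0.5], [1, -1.0], [1, 1.5],
                  [1, -0.5], [1, 2.0], [1, -2.0]])
    z = np.array([1, 0, 1, 1, 0, 0], dtype=float)
    theta = fit(X, z)
    print("theta:", theta, "log-likelihood:", log_likelihood(theta, X, z))
```

In a deep model the two factors are produced layer by layer by backpropagation, but the product structure is the same as in this two-factor case.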

3

Results and Discussion

3.1

Comparative testing of model performance

3.1.1

Experimental setup

1)

Data sets

To verify the generality and generalization ability of the model, the public ML-100K and ML-1M datasets, with data scales of 100K and 1M respectively, are used for the model performance comparison experiments.

2)

Evaluation Indicators

The evaluation metrics are hit rate (HR) and normalized discounted cumulative gain (NDCG). HR describes the likelihood that the test item appears in the recommendation list generated for the user, and thus reflects the user's interest in the recommendation results. NDCG is generally used to measure the average quality of the model's recommendation lists and assesses the completeness and accuracy of the recommendation results. It weights each hit by the ranked position of the test item in the recommendation list, so that items ranked nearer the top receive higher scores.
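The two metrics can be sketched as follows, assuming a leave-one-out protocol in which each user has exactly one held-out test item; the paper does not spell out its protocol, so this is an assumption, and the function names are hypothetical. With a single relevant item the ideal DCG is 1, so NDCG reduces to the discount at the hit position.

```python
import math

def hit_ratio_at_k(ranked_items, test_item, k):
    """1.0 if the held-out item appears in the top-k list, else 0.0."""
    return 1.0 if test_item in ranked_items[:k] else 0.0

def ndcg_at_k(ranked_items, test_item, k):
    """Position-discounted gain of the single held-out item.

    A hit at 0-based position pos scores 1 / log2(pos + 2), so a hit at
    the top of the list scores 1.0 and lower positions score less.
    """
    for pos, item in enumerate(ranked_items[:k]):
        if item == test_item:
            return 1.0 / math.log2(pos + 2)
    return 0.0

def evaluate(rankings, test_items, k=10):
    """Average HR@k and NDCG@k over all users."""
    n = len(test_items)
    hr = sum(hit_ratio_at_k(r, t, k) for r, t in zip(rankings, test_items)) / n
    ndcg = sum(ndcg_at_k(r, t, k) for r, t in zip(rankings, test_items)) / n
    return hr, ndcg

if __name__ == "__main__":
    # Hypothetical ranked lists for two users and their held-out items.
    rankings = [[5, 3, 9, 1], [2, 7, 4, 8]]
    test_items = [3, 6]
    print(evaluate(rankings, test_items, k=3))
```

HR only records whether the item was found, whereas NDCG also rewards placing it early, which is why the two metrics can rank models differently.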

3)

Baseline Modelling

To fully evaluate the recommendation performance of the model, the model in this paper is compared with the models listed below. The following is a brief description of the comparison models:

(1)

GMF: GMF (generalized matrix factorization) is based on matrix factorization. The original feature vectors of users and items are first mapped into a lower-dimensional space, and the dot product between these lower-dimensional feature vectors is then computed to estimate the degree of users' potential preference for items. In addition, a regularization term is introduced during model training to prevent overfitting.

(2)

MLP: MLP (multilayer perceptron) is a machine learning model that uses multiple neural network layers instead of a simple inner product, so that the model can learn the nonlinear and complex interactions between users and items more adequately from the data.

(3)

OPB-MLP: OPB-MLP is a recommendation model based on a multilayer perceptron (MLP). By combining the MLP with the outer product, a two-dimensional interaction map is generated, with a sigmoid as the activation function, and the last layer outputs the predicted score.

(4)

DCRM-A: DCRM-A is a deep collaborative recommendation model. The model integrates collaborative filtering and neural networks, while introducing implicit information and attention mechanisms to learn the implicit features of users and items more comprehensively and utilize these features for recommendation.
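The difference between the first two baselines can be sketched in a few lines. This is an illustrative sketch only: the embeddings and weights are random and untrained, the single hidden layer, the ReLU activation, and the sigmoid output are assumptions, and the function names are hypothetical rather than taken from any of the cited models.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def gmf_score(p_u, q_i):
    """Matrix-factorization style score: dot product of latent vectors."""
    return float(sigmoid(p_u @ q_i))

def mlp_score(p_u, q_i, W1, b1, w2, b2):
    """MLP style score: concatenate the vectors and pass them through
    a hidden layer, replacing the fixed inner product with a learned
    nonlinear interaction function."""
    h = np.maximum(0.0, W1 @ np.concatenate([p_u, q_i]) + b1)  # ReLU layer
    return float(sigmoid(w2 @ h + b2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, hidden = 4, 8                      # hypothetical sizes
    p_u = rng.standard_normal(d)          # user (learner) embedding
    q_i = rng.standard_normal(d)          # item embedding
    W1 = rng.standard_normal((hidden, 2 * d))
    b1 = np.zeros(hidden)
    w2 = rng.standard_normal(hidden)
    print("GMF:", gmf_score(p_u, q_i))
    print("MLP:", mlp_score(p_u, q_i, W1, b1, w2, 0.0))
```

The dot product treats every latent dimension independently and with equal weight, while the hidden layer can mix dimensions, which is the extra flexibility the MLP-based baselines rely on.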

3.1.2

Analysis of experimental results

The results of the performance comparison on the public datasets are shown in Fig. 2, where (a) and (b) are the results on the ML-100K and ML-1M datasets, respectively. On both datasets, the model in this paper outperforms the comparison models in both HR@10 and NDCG@10.

The model in this paper achieves the best NDCG@K performance, with scores of 0.352 and 0.465 on the two datasets, respectively. On the ML-100K dataset, the OPB-MLP model has the next best results. That OPB-MLP outperforms DCRM-A there reflects the importance of the outer product for improving recommendation performance, while the gap between OPB-MLP and the model in this paper may stem from the limited ability of the MLP to capture the nonlinearities of the interaction function. This suggests that the improved convolutional architecture strengthens the expressive power of the prediction network. On the ML-1M dataset, however, DCRM-A outperforms OPB-MLP, probably because a larger amount of data makes it easier for DCRM-A to extract information, and DR-CNN works best because of the improvements it makes on the basis of DCRM-A. In addition, DCRM-A outperforms GMF and MLP on both datasets because it combines the advantages of both and introduces an attention mechanism and feedback information.

Changes in the experimental parameters have an important impact on model performance, so the main parameters are analyzed next. The focus is on the effect of the value of K and the number of feature maps on the experimental results. First, the results of the Top-K experiments with different values of K are analyzed.

Top-K usually refers to selecting the K items with the highest predicted ratings from all candidate items as the user's recommendation result. To fully validate the performance of the collaborative filtering model after fusion with neural networks, HR@K and NDCG@K are compared under different K values on the larger dataset, ML-1M. Figure 3 shows the comparison results, where (a) and (b) are the HR@K and NDCG@K metrics, respectively.
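Top-K selection itself can be sketched as below. The sketch assumes that items are indexed 0..n−1, that the model has already produced one predicted score per item, and that items the user has already interacted with should be excluded; the function name `top_k_items` and the `seen` argument are hypothetical.

```python
import numpy as np

def top_k_items(scores, k, seen=()):
    """Return the indices of the k highest-scoring items, best first.

    Items listed in `seen` are pushed to the end of the ranking so they
    are only returned if fewer than k unseen items exist.
    """
    scores = np.asarray(scores, dtype=float).copy()
    if len(seen):
        scores[list(seen)] = -np.inf
    order = np.argsort(-scores, kind="stable")  # descending by score
    return order[:k].tolist()

if __name__ == "__main__":
    predicted = [0.1, 0.9, 0.5, 0.7]   # hypothetical predicted scores
    print(top_k_items(predicted, k=2))
    print(top_k_items(predicted, k=2, seen={1}))
```

Increasing K can only add items to the list, which is why HR@K never decreases as the recommendation list grows.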

It can be observed that the length of the recommendation list affects the accuracy of the model's predictions: as the list length increases, the recommendation performance of both models improves. The figure also shows that the model in this paper is clearly better than the mainstream hybrid collaborative filtering recommendation algorithm (UCF-ICF). At a list length of 10, the HR@K and NDCG@K of the model in this paper are 0.656 and 0.402, respectively, which are higher than those of the traditional hybrid collaborative filtering algorithm (0.625 and 0.371). This further demonstrates that the proposed collaborative filtering algorithm, which uses neural networks to enhance performance, helps to better model user and item characteristics and to improve the quality of system recommendations.

Next, the effect of the number of feature maps on model performance is verified. Experiments with different numbers of feature maps are conducted on the larger dataset, ML-1M. Fig. 4 shows how the number of feature maps affects the HR@K and NDCG@K of the model over the training epochs; (a) and (b) are the HR@K and NDCG@K metrics, respectively.

All the curves grow steadily, and although the convergence curves differ slightly as the number of training epochs increases, similar performance is reached in the end. This indicates that increasing the number of neural network parameters under the model framework of this paper does not lead to overfitting.

3.2

Application of Models in English Teaching Practice

The model in this paper was applied to the teaching of English courses in a school, and a controlled experiment was set up to test the model’s effect on teaching efficiency and student learning outcomes.

3.2.1

Teaching experiment setup

The comparative experiment applies the precision teaching intervention, which is based on learner profiling, to the actual teaching process in order to test the effect of the learning path optimization model proposed in this paper on students' English learning.

Two classes taught by the same teacher were selected as the experimental class and the control class, each with 50 students. The experimental class used the model to optimize students' personalized learning paths in the course learning arrangement. The control class was taught using the traditional teaching model. Apart from this, the teaching conditions of the two classes were the same.

In line with the curriculum standard and the requirements of the teaching and research group, which call for a basic understanding of each unit's grammar with an emphasis on words, phrases, and parts of speech, the experiment focused on the teaching of English reading. A teaching quality test was conducted before the beginning of the semester and after the end of the semester as the pre-test and post-test of the experiment.

3.2.2

Learning Paths to Optimize Teaching Effectiveness

The pre- and post-intervention scores of the experimental and control classes were analyzed descriptively, and the results are shown in Table 1. The mean pre-test score of the experimental class was 56.36 points and the mean post-test score was 82.19 points, an improvement of 25.83 points. The pre-test and post-test scores of the control class were 56.94 and 58.43, respectively, a difference of only 1.49 points. The pre-test scores of the two classes were therefore close, but the experimental class made considerably more progress than the control class. It can be preliminarily concluded that the learning path optimization model designed in this paper has a strong positive impact on students' English learning.

Table 1.

Pre-test and post-test results of the experimental class and the control class

	Experimental class		Control class
	N	Mean score	N	Mean score
Pre-test	50	56.36	50	56.94
Post-test	50	82.19	50	58.43
Change	0	25.83	0	1.49
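The arithmetic behind Table 1 can be reproduced with a short sketch. The mean scores are taken from the table; the difference between the two gains is a purely descriptive figure computed here for illustration and is not a significance test, which the paper does not report. The variable and function names are hypothetical.

```python
# Mean scores as reported in Table 1.
scores = {
    "experimental": {"pre": 56.36, "post": 82.19},
    "control": {"pre": 56.94, "post": 58.43},
}

def gain(group):
    """Post-test mean minus pre-test mean, rounded to two decimals."""
    return round(scores[group]["post"] - scores[group]["pre"], 2)

def gain_difference():
    """How much more the experimental class improved than the control class."""
    return round(gain("experimental") - gain("control"), 2)

if __name__ == "__main__":
    for group in scores:
        print(group, "gain:", gain(group))
    print("difference in gains:", gain_difference())
```

Comparing gains rather than post-test scores alone removes the small pre-test gap between the two classes from the comparison.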

Figure 5 shows the distribution of grades in the experimental and control classes before and after the experiment. The final grades of the experimental class improved markedly compared with the start of the semester, and students in all score segments made clear progress. No students scored above 80 points in the pre-test, while as many as 11 students scored above 90 points at the end of the semester. In the control class, there is no clear movement of students between score segments.

In summary, the learning path optimization model designed in this paper accurately judges learners' cognitive characteristics and dynamic learning behavior states, and on this basis designs scientific and reasonable learning paths that help improve teaching efficiency and learning outcomes.

4

Conclusion

This paper constructs a learning path planning model based on multi-source isomorphism under adaptive strategy constraints and applies it to the English classroom for teaching practice. The experimental class had an average pre-test score of 56.36 and an average post-test score of 82.19, an improvement of 25.83 points. The control class had an average pre-test score of 56.94 and an average post-test score of 58.43, a difference of only 1.49 points. The pre-test scores of the two classes were close, but the experimental class made considerably more academic progress than the control class. The experiment indicates that the method in this paper can take learners' interests, preferences, and cognitive levels into account more comprehensively, enabling the creation of customized learning paths. It can also generate further feedback based on learners' learning outcomes, which is used to guide and update the generated learning paths. The model is effective and feasible, and it contributes to improving the efficiency of English teaching.

Nan Li

Published online: 05 Feb 2025

Received: 08 Sep 2024

Accepted: 04 Jan 2025

DOI: https://doi.org/10.2478/amns-2025-0064

Keywords: Collaborative filtering, Neural network, Path optimization, English language teaching

© 2025 Nan Li, published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
