Construction of dynamic update and adaptive prediction model for user profile based on time series analysis

Since the concept of the user profile (also called the user portrait) was proposed, it has gradually developed from a simple, labeled user data profile into a diversified, systematic user abstraction model, and it has become the data basis for the operation of major network service platforms and e-commerce enterprises [1-3]. User profiles are now widely applied: organizations ranging from mobile operators to large network service providers build user profile models for their user base and use them to refine their operating strategy and to improve user experience and loyalty, thereby maximizing their own interests. User behavior changes over time, and the periodicity of user profile data is especially pronounced [4-7].

Time series analysis plays a crucial role in numerous practical applications. In economics and finance, it helps to forecast macroeconomic indicators, stock prices, and exchange rates. In the environmental sciences, it helps to model and predict climate change, pollution levels, and weather patterns. It also supports demand forecasting, marketing strategy development, social media sentiment analysis, disease outbreak prediction, and resource planning [8-10]. This versatility makes time series analysis an indispensable tool in many fields, where it is used with various methods to extract information and make predictions. Dynamic updating of user profiles based on time series analysis is an active research area in both industry and academia, with high commercial and academic value, and it promotes the development of the Internet and of industrial productivity [11-12]. User profile time series data are collected by sensors at regular intervals over a fixed cycle; statistical techniques and algorithms are then applied to discover how the data change over time so that informed decisions can be made [13-14].

As time series analysis has been applied more deeply across fields in recent years, it has been found that different environments, different sampling frequencies, and various natural and man-made factors give each batch of data different characteristics, so that predictions may deviate substantially from actual values [15-17]. The same problem arises in the dynamic updating of user profiles based on time series analysis. To address it, an adaptive prediction model based on time series analysis can be constructed that adapts to the data from different perspectives and extracts the characteristics of user profile data. This improves the modeling, analysis, and prediction of user profile data and moves user profile research toward smarter and more personalized approaches [18-20].

Based on the analysis and establishment of a labeling system for user profiles, this study incorporates a temporal attention mechanism for the prediction of dynamic user labels and improves the model using feature selection weights. The method uses the multi-granularity scanning of deep forests to dynamically update the user profile, and adopts an adaptive combination prediction weight determination method to improve prediction accuracy. Model accuracy and training time are evaluated under different parameter settings, and the cascade forest module is selected based on the evaluation results. The user profile constructed with this method is then applied to a personalized learning path recommendation system for students, and its practical effect is judged from the students' learning behavior in the system and their experience of using it.

2 Method

2.1 Deep forest-based prediction model construction

2.1.1 User labeling system

User profiling is the labeling of user information, and the basic element of labeling is the "tag", or label. A label is a short, descriptive word or phrase. Each label represents one class of characteristics, and a user profile is a collection of multiple labels. Building a multi-dimensional, comprehensive user profile therefore first requires a labeling system. Establishing the labeling system requires combining different data with business needs: some labels are obtained directly from user data, and others are derived through algorithms or rule mining. For example, data that users actively fill in and upload on websites or apps, such as gender, age, and device model used, can be taken directly and tend to be more accurate. Improving the labeling system is therefore of great significance for building user profiles.

2.1.2 User profile modeling

The purpose of constructing a user profile is to reconstruct information about the user, so labels, the basic elements of a user profile, are drawn from all of the user's relevant data. If the user base is large and the data are voluminous, users can be sampled first. The collected user data fall into two categories: static information and dynamic information. Static information is relatively stable, such as gender, age, and region. Dynamic information is the user's changing behavioral information, such as browsing records and contact channels. Users are labeled according to the results of data mining, and different kinds of data call for different statistical and mining methods. Constructing a user profile first requires collecting basic data such as network behavior data and user transaction data, then modeling user behavior with text mining, machine learning, and predictive algorithms, and finally constructing a profile of the user in terms of basic attributes, interests, preferences, and so on [21].

2.1.3 Deep forest prediction models

Labels in the user profile system must be mined by analyzing user behavior with machine learning and predictive classification algorithms. This paper uses a deep forest model to predict the labels in the user profile system.

Deep forest (DF) is a deep learning method built from ensembles of decision trees. Compared with a deep neural network (DNN), a deep forest is easy to train, has small computational overhead and few hyperparameters, does not require complex tuning, adapts to datasets of various sizes, and generalizes better [22]. DF is now widely used in many fields, which demonstrates its robustness in classification and prediction. DF consists of two main parts: multi-granularity scanning and the cascade forest.

1) Multi-granularity scanning

Multi-granularity scanning analyzes the input features in order to mine the sequential relationships between them. The process is shown in Fig. 1. Multiple sliding windows scan the input feature vector, and information is extracted from the feature segments that each window produces. Specifically, data containing p-dimensional features are input, and features are extracted using a sliding window of length k with step size n, so that s k-dimensional feature segments are obtained:

(1) $s = (p - k) / n + 1$
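The segment count in Eq. (1) can be checked with a minimal sketch of the sliding-window step. The function name `scan_segments` and the toy feature vector are illustrative assumptions, not part of the original method description; the sketch only slices the input and does not train any forest.

```python
def scan_segments(features, k, n):
    """Slide a window of length k with step n over a feature vector.

    Returns the list of k-dimensional segments. When (p - k) is divisible
    by n, the number of segments is (p - k) // n + 1, as in Eq. (1).
    """
    p = len(features)
    if k > p or n < 1:
        raise ValueError("need k <= p and n >= 1")
    return [features[start:start + k] for start in range(0, p - k + 1, n)]


features = list(range(10))                  # hypothetical input with p = 10
segments = scan_segments(features, k=4, n=2)
print(len(segments))                        # (10 - 4) // 2 + 1 = 4
print(segments[0], segments[-1])            # [0, 1, 2, 3] [6, 7, 8, 9]
```

Each returned segment would then be passed to the forests described next.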

The feature segments are then input into the Random Forest (RF) and Completely-Random Tree Forest (CRTF) models, each of which computes and outputs class probability vectors. The class probability vectors output by all the forests are concatenated to generate the transformed feature vector, which is used as the input to the cascade forest.

2) Cascade forest

The cascade forest consists of multiple cascade layers. Each cascade layer contains two RFs and two CRTFs, and each RF and CRTF contains m trees. The structure of the cascade forest is shown in Fig. 2. Within each cascade layer, every node of an RF tree randomly selects candidate features and splits on the candidate with the best Gini value. In a CRTF, every node randomly selects a feature and splits on it, generating child nodes until each leaf node contains only samples of the same class. Each tree estimates the class distribution from the proportions of the different classes among the samples in its leaf nodes, and the outputs of all the trees in a forest are averaged to obtain the class distribution vector of that forest. Since there are four forests, a dataset with c classes yields a 4c-dimensional feature vector. The results of the previous and current layers are concatenated as the input to the next layer. The final class vector is obtained by averaging all the class vectors in the last layer, and the class with the highest probability is taken as the prediction for the sample. To avoid overfitting, 10-fold cross-validation is used during training, and training stops automatically when performance no longer improves significantly.
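The vector handling between cascade layers can be illustrated with a small sketch. It assumes, as in the original deep forest design, that the class vectors of the four forests are concatenated with the input features before being passed on; the function names and the example probabilities are hypothetical, and no forest is actually trained here.

```python
def next_layer_input(original_features, forest_class_vectors):
    """Concatenate the class vectors of all forests with the input features."""
    augmented = []
    for vec in forest_class_vectors:
        augmented.extend(vec)
    return augmented + list(original_features)


def predict_from_last_layer(forest_class_vectors):
    """Average the class vectors of the last layer; return (mean, argmax)."""
    n_forests = len(forest_class_vectors)
    n_classes = len(forest_class_vectors[0])
    mean = [sum(v[c] for v in forest_class_vectors) / n_forests
            for c in range(n_classes)]
    best = max(range(n_classes), key=lambda c: mean[c])
    return mean, best


# Hypothetical outputs of four forests for a three-class problem.
vectors = [[0.2, 0.5, 0.3], [0.1, 0.6, 0.3], [0.3, 0.4, 0.3], [0.2, 0.5, 0.3]]
layer_input = next_layer_input([1.0, 2.0], vectors)
print(len(layer_input))            # 4 forests * 3 classes + 2 features = 14
mean, best = predict_from_last_layer(vectors)
print(best)                        # index of the most probable class: 1
```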

2.2 Dynamic update of user profiles based on time series analysis

To predict users' dynamic interest labels, this subsection adds a temporal attention mechanism to the input stage of the deep forest model so that the model focuses on key feature information, which further optimizes the model. The improved multi-granularity scanning structure, which serves as a feature processing module incorporating the temporal attention mechanism, is shown in Fig. 3.

For each interest cluster in a user packet, the attention weight is the ratio of the user dwell time on the example items of that cluster to the sum of the user dwell times of all examples in the packet. For each user behavior log Session_i = {x₁,x₂,⋯,x_n} with corresponding click times t_i = {t₁,t₂,⋯,t_n}, the dwell time for each item is calculated as:

(2) $\tilde{t}_{n - 1} = t_{n} - t_{n - 1}$

(3) $T_{i} = \{\tilde{t}_{1}, \dots, \tilde{t}_{n - 1}, \max (\tilde{t})\}$

Because the last item is especially important in sequential recommendation algorithms and has no following click from which its dwell time could be measured, the dwell time of x_n is set to the maximum dwell time $\max (\tilde{t})$ in T_i. Using the dwell times from Eqs. (2) and (3), the weight vector of the temporal attention mechanism is computed as:

(4) $V_{i} = \sum_{j = 1}^{N} W_{j} \cdot V_{j} = \sum_{j = 1}^{L_{j}} f (T_{j}) \cdot x_{j}$

(5) $f (T_{j}) = \frac{T_{j}}{\sum_{i = 1}^{L_{j}} T_{i}}$
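The dwell-time weighting of Eqs. (2) to (5) can be sketched in a few lines. The click timestamps and the two-dimensional item vectors below are made-up illustrations, and the function names are assumptions rather than identifiers from the paper.

```python
def dwell_times(click_times):
    """Eqs. (2)-(3): gaps between consecutive clicks; the last item is
    assigned the maximum observed gap."""
    if len(click_times) < 2:
        raise ValueError("need at least two clicks")
    gaps = [b - a for a, b in zip(click_times, click_times[1:])]
    return gaps + [max(gaps)]


def attention_weights(dwell):
    """Eq. (5): each dwell time divided by the session total."""
    total = sum(dwell)
    return [t / total for t in dwell]


def weighted_session_vector(item_vectors, weights):
    """Eq. (4): attention-weighted sum of the item vectors."""
    dim = len(item_vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, item_vectors))
            for d in range(dim)]


clicks = [0, 10, 40, 50]                 # hypothetical click times in seconds
T = dwell_times(clicks)                  # [10, 30, 10, 30]
w = attention_weights(T)                 # [0.125, 0.375, 0.125, 0.375]
items = [[1, 0], [0, 1], [1, 1], [0, 0]]
print(weighted_session_vector(items, w)) # [0.25, 0.5]
```

The weights sum to 1 by construction, so items with longer dwell times contribute proportionally more to the session representation.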

2.3 Model improvement based on adaptive weights

A combination forecasting model can achieve higher accuracy than a single demand forecasting model. This section investigates how to combine the single demand forecasting models in a suitable parallel combination and how to optimize the weight allocation using the mean square error as the objective function.

This paper adopts an adaptive method for determining the combination prediction weights, in which the weight coefficients of the combined model change with the prediction time node. The method uses the predicted value of the previous step as input when predicting the next step, and updates the weights continuously [23].

In the combined model, an important factor affecting forecasting performance is the set of combination weights ω_m, m = 1,…,M, where M is the number of individual models in the combined model; here M = 2. This paper adopts a new scheme for determining the weight values that captures the time-varying dynamics of the underlying time series.

Whether the weight allocation is reasonable is evaluated by the mean square error (MSE), which reaches its minimum when the predicted value equals the true value. When the sub-model weights sum to 1 and the error is at its minimum, the combined prediction error is minimized, which gives the weight of the ith prediction model at each time t, as shown in Eq. (6):

(6) $\left\{ \begin{array}{l} \min \mathrm{MSE} = \frac{1}{n} e_{t}^{2} = \frac{1}{n} \sum_{i = 1}^{d \times w} {(\lambda_{i t} e_{i t})}^{2} \\ \mathrm{s.t.} \ \sum_{i = 1}^{n} \lambda_{i t} = 1, \ \lambda_{i t} \geq 0 \end{array} \right.$

where λ_it is the weight of the ith prediction model at time t, e_it is the corresponding prediction error, computed from the actual demand value y_t and the demand forecast value ${\hat{y}}_{t}$ at time t, d is the window width of the proximity history sample, representing the selected number of similar weeks, and w is the number of forecast points per unit time.

When the error of the kth demand prediction sub-model is always the smallest at time t within window width d, the solution of Eq. (6) can be expressed as Eq. (7):

(7) $\left\{ \begin{array}{l} \lambda_{k t} = 1 \\ \lambda_{i t} = 0, \quad t = 1, 2, \dots, n; \ i \neq k \end{array} \right.$

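When, at time t, the k₁th model has the smallest prediction error in some weeks within window width d and the k₂th model has the smallest error in the other weeks, the solution of Eq. (6) can be expressed as Eq. (8). Each model's weight is proportional to the absolute error of the other model, so the more accurate model receives the larger weight and the two weights sum to 1:

(8) $\left\{ \begin{array}{l} \lambda_{k_{1} t} = \frac{| e_{k_{2} t} |}{| e_{k_{1} t} | + | e_{k_{2} t} |} \\ \lambda_{k_{2} t} = \frac{| e_{k_{1} t} |}{| e_{k_{1} t} | + | e_{k_{2} t} |} \end{array} \right.$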
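A minimal sketch of the two-model case follows, reading Eq. (8) as inverse-error weighting: each model's weight is the other model's share of the total absolute error. The function name and the even split for two exact forecasts are assumptions made for illustration.

```python
def two_model_weights(e1, e2):
    """Return (lambda_1, lambda_2) from the errors of two models.

    The model with the smaller absolute error gets the larger weight,
    and the two weights sum to 1.
    """
    a1, a2 = abs(e1), abs(e2)
    if a1 + a2 == 0:
        return 0.5, 0.5          # both forecasts exact: split evenly (assumption)
    return a2 / (a1 + a2), a1 / (a1 + a2)


print(two_model_weights(1.0, 3.0))   # (0.75, 0.25)
print(two_model_weights(-2.0, 2.0))  # (0.5, 0.5)
```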

Based on Eqs. (7) and (8), the weight of each model at the subsequent prediction times is calculated by Eq. (9):

(9) $\begin{array}{l} \lambda_{i (p + 1)} = \frac{1}{p} \sum_{t = 1}^{p} \lambda_{i t} \\ \lambda_{i (p + 2)} = \frac{1}{p} \sum_{t = 2}^{p + 1} \lambda_{i t}, \dots, \lambda_{i (p + j)} = \frac{1}{p} \sum_{t = j}^{p + j - 1} \lambda_{i t} \end{array}$

where p is the amount of base data at the chosen time t, that is, the number of preceding weights that are averaged.
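Eq. (9) is a rolling mean: each new weight is the average of the p most recent weights, including weights that were themselves produced by the recursion. A short sketch under that reading, with a hypothetical function name and weight history:

```python
def roll_weights(history, p, steps):
    """Extend a weight history by `steps` values following Eq. (9).

    Each new value is the mean of the p most recent values, so later
    values also average over earlier extrapolated ones.
    """
    if len(history) < p:
        raise ValueError("history must contain at least p weights")
    values = list(history)
    for _ in range(steps):
        values.append(sum(values[-p:]) / p)
    return values[len(history):]


past = [0.5, 0.75, 0.25]                   # hypothetical weights for t = 1..3
print(roll_weights(past, p=3, steps=2))    # [0.5, 0.5]
```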

The choice of the sample window width d is another key factor affecting the combination weights. Assume that the demand forecast data of the N similar weeks preceding the week to be forecast are used to determine the window width d of the proximity history, and that the demand forecast data contain n forecast points in total.

Let ε_kt be the mean absolute error (MAE) of the kth single prediction method in week t, ε_t the average absolute prediction error of all the prediction methods in week t, and δ_j the average absolute prediction error of all the prediction methods in the jth window period. First, calculate the MAE during the first window period. Next, slide the window by one historical week and calculate the MAE during the next window period. Repeat until the last window is reached. A suitable window width keeps the MAE within a stable range and ensures that the output of the combined prediction model fits the real data reasonably well.

The specific steps for selecting the sample window width are as follows.

1) Calculate the average MAE of the M demand forecasting sub-models for week t:

(10) $\varepsilon_{t} = \sum_{i = 1}^{M} \varepsilon_{i t} / M, \quad t = 1, 2, \dots, n$

2) Assuming that the window width is d and the number of forecast points per week is q, the MAE during the jth window period is calculated by Eq. (11):

(11) $\delta_{j} = \sum_{t = (j - 1) q + 1}^{(d + j - 1) q} \varepsilon_{t} / d, \quad j = 1, 2, \dots, N - d + 1$
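The two steps above can be sketched as follows. The sketch assumes that the window sum in Eq. (11) is normalized by the window width d, and the error values are invented for illustration; function names are hypothetical.

```python
def mean_error_per_point(errors_by_model):
    """Eq. (10): average the M models' absolute errors at each forecast point."""
    m = len(errors_by_model)
    n = len(errors_by_model[0])
    return [sum(model[t] for model in errors_by_model) / m for t in range(n)]


def window_errors(eps, d, q):
    """Eq. (11), read as the sum over a d-week window divided by d.

    eps holds q forecast points per week; one value is returned for each
    of the N - d + 1 window positions.
    """
    n_weeks = len(eps) // q
    out = []
    for j in range(n_weeks - d + 1):
        window = eps[j * q:(j + d) * q]
        out.append(sum(window) / d)
    return out


# Two hypothetical models, three weeks, q = 2 forecast points per week.
errors = [[1, 3, 2, 2, 4, 4],
          [1, 1, 2, 2, 0, 0]]
eps = mean_error_per_point(errors)     # [1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
print(window_errors(eps, d=2, q=2))    # [3.5, 4.0]
```

Comparing the returned values across candidate widths d gives a simple way to see which width keeps the window error most stable.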

Since the weights need to be updated at each prediction time, an optimization method with fast computation and high accuracy is needed to obtain accurate weights. In this paper, the Bayesian optimization (BO) algorithm is used to optimize the weights of individual models.

The flowchart of the Bayesian optimization algorithm is shown in Figure 4.

In this paper, the acquisition function of the Bayesian optimization algorithm measures the gain of selecting a sample point. Once the sample points in the training set have been used to fit a probabilistic model of the objective function, the acquisition function determines the strategy for selecting the next sample point. The acquisition function a(x) reflects the gain for model optimization of a new sample point at x, and solving argmax(a(x)) with an optimization algorithm gives the sampling position with the highest gain. This paper uses the Expected Improvement (EI) function as the acquisition function, constructed as shown in Eq. (12):

(12) $EI (x) = \left\{ \begin{array}{ll} (\mu (x) - f (x^{+})) \Phi (z) + \sigma (x) \phi (z) & \sigma (x) > 0 \\ 0 & \sigma (x) = 0 \end{array} \right. , \quad z = \frac{\mu (x) - f (x^{+})}{\sigma (x)}$

where μ(x) and σ(x) are the predictive mean and standard deviation of the probabilistic model at x, Φ(z) and φ(z) are the cumulative distribution function and the probability density function of the standard normal distribution, and f(x⁺) is the best (maximum) objective value observed so far.
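As an illustration, the Expected Improvement value can be computed with the standard library alone. The sketch uses the standard form of EI for maximization, in which the second term uses the standard normal density; the function names and the example numbers are assumptions, and the surrogate model that would supply the mean and standard deviation is not shown.

```python
import math


def normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """EI at a point with predictive mean mu and standard deviation sigma,
    given the best objective value observed so far."""
    if sigma <= 0.0:
        return 0.0
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)


# With mu equal to the current best, EI reduces to sigma / sqrt(2 * pi).
print(round(expected_improvement(0.5, 1.0, 0.5), 4))   # 0.3989
print(expected_improvement(0.9, 0.0, 0.5))             # 0.0
```

A candidate with a higher predicted mean or a larger uncertainty receives a larger EI, which is what drives the trade-off between exploitation and exploration.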

The computation process of the adaptive weighting strategy is as follows:

1) Determine the window width d. Denote the actual demand sequence of one similar week before the week to be forecast as Y_j = {y_j1,y_j2,…,y_jw}, and the actual demand sequences of the d similar weeks before the week to be forecast as Y = {y₁,y₂,…,y_d}. The predicted demand sequence obtained by the ith single prediction model is Y_i = {y_i1,y_i2,…,y_id}.

2) For Y and Y_i, i = 1,2, find the combination weights (ω₁,ω₂) for the window period using the Bayesian optimization algorithm, and use them as the combination weights for the demand forecast of the week to be forecast.

3) Calculate the demand forecast value using the above forecast data and the corresponding weights, as shown in Equation (13): (13) $x = \sum_{k = 1}^{M} ω_{k} x_{k}$

The combination weights are updated from the preceding data before each demand forecast is calculated, so that the combined forecast draws on the strengths of each single model and uses more reasonable weights.
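The three steps above can be sketched for the two-model case. To keep the example self-contained, a plain grid search over ω₁ (with ω₂ = 1 − ω₁) stands in for the Bayesian optimization used in the paper; the demand values, forecasts, and function names are all hypothetical.

```python
def mse(actual, predicted):
    """Mean square error between two equally long sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def combine(pred1, pred2, w1):
    """Eq. (13) for M = 2, with weights w1 and 1 - w1."""
    return [w1 * a + (1.0 - w1) * b for a, b in zip(pred1, pred2)]


def best_weight(actual, pred1, pred2, grid_size=101):
    """Grid search for the w1 that minimizes the window MSE.

    This is a simple stand-in for Bayesian optimization, not the
    optimizer described in the text.
    """
    candidates = [i / (grid_size - 1) for i in range(grid_size)]
    return min(candidates, key=lambda w: mse(actual, combine(pred1, pred2, w)))


# Window data: model 1 over-forecasts by 1, model 2 under-forecasts by 3.
actual = [10.0, 12.0, 14.0]
model1 = [11.0, 13.0, 15.0]
model2 = [7.0, 9.0, 11.0]
w1 = best_weight(actual, model1, model2)
print(w1)                                  # 0.75
print(combine([17.0], [13.0], w1))         # forecast for the next point: [16.0]
```

Re-running `best_weight` on the most recent window before each forecast reproduces the rolling update of the weights described above.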

2.4 Evaluation indicators

The experiments in this paper use four standard evaluation metrics: accuracy, precision, F1 score, and AUC.

Accuracy is the ratio of the number of correctly predicted samples to the total number of predicted samples:

(14) $\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$

where TP, TN, FP, and FN are the numbers of true positives, true negatives, false positives, and false negatives.

Precision is the ratio of the number of correctly predicted positive samples to the number of all samples predicted to be positive:

(15) $\mathrm{Precision} = \frac{TP}{TP + FP}$

The F1 score, the harmonic mean of precision P_t and recall R_e, is used as a measure of overall model performance, where recall is the ratio of correctly predicted positive samples to all actual positive samples, $R_{e} = TP / (TP + FN)$:

(16) $F_{1} = \frac{2 \times P_{t} \times R_{e}}{P_{t} + R_{e}}$

The AUC value provides a very intuitive assessment of model performance, with values closer to 1 indicating better model classification.
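For reference, the four metrics can be computed from confusion-matrix counts and prediction scores as follows. The sketch uses the standard definition of F1 as the harmonic mean of precision and recall, and a pairwise (rank-based) definition of AUC; the counts, labels, and scores are invented for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def auc(labels, scores):
    """Probability that a random positive is scored above a random negative
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


m = classification_metrics(tp=80, tn=10, fp=5, fn=5)
print(m["accuracy"])                                 # 0.9
print(round(m["f1"], 4))                             # 0.9412
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))       # 0.75
```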

3 Results and discussion

3.1 Experimental analysis

Forest construction is the core of the deep forest model, and decision tree construction is the core of each forest, so the number and depth of the decision trees in a forest directly affect the training efficiency and classification performance of the model. Deep forests can cascade multiple models, and diversity is especially critical to model design. This paper therefore tries cascading several models chosen from logistic regression (LR), random forest (RF), extremely randomized trees (ET), and gradient boosted trees (XGB), and determines the model types and hyperparameters experimentally.

Model accuracy under different settings of the n_estimate parameter is shown in Table 1. Overall, the accuracy of each forest model first increases with n_estimate and then levels off. The RF and XGB models have comparable prediction accuracy, with mean accuracies of 0.8909 and 0.8910, respectively.

Table 1. Model accuracy for different n_estimate settings

n_estimate	RF	ET	XGB
5	0.8898	0.8852	0.8906
15	0.8905	0.8863	0.8908
25	0.8908	0.8867	0.8912
90	0.8914	0.8872	0.8917
120	0.8911	0.8867	0.8909
150	0.8911	0.8869	0.8907
180	0.8914	0.8877	0.8911
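The mean accuracies quoted for RF and XGB can be reproduced from the columns of Table 1 with a short check. The dictionary layout and function name are illustrative; the numbers are copied from the table.

```python
# Accuracy values from Table 1, one list per model, ordered by n_estimate.
table1 = {
    "RF":  [0.8898, 0.8905, 0.8908, 0.8914, 0.8911, 0.8911, 0.8914],
    "ET":  [0.8852, 0.8863, 0.8867, 0.8872, 0.8867, 0.8869, 0.8877],
    "XGB": [0.8906, 0.8908, 0.8912, 0.8917, 0.8909, 0.8907, 0.8911],
}


def column_means(table):
    """Arithmetic mean of each model's accuracy column."""
    return {name: sum(values) / len(values) for name, values in table.items()}


means = column_means(table1)
for name, value in means.items():
    print(f"{name}: {value:.4f}")
# RF: 0.8909
# ET: 0.8867
# XGB: 0.8910
```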

Increasing the n_estimate parameter also adds time overhead; model training times are shown in Table 2. The larger the n_estimate parameter, the longer the training time. When the setting increases from 5 to 180, the training times of the RF, ET, and XGB models increase by 90.56 s, 434.91 s, and 936.59 s, respectively.

Table 2. Model training time (s) for different n_estimate settings

n_estimate	RF	ET	XGB
5	34.51	34.51	62.07
15	44.06	77.88	386.61
25	58.94	113.87	465.48
90	77.39	156.79	544.03
120	87.5	210.65	698
150	103.96	317.6	852.12
180	125.07	469.42	998.66

In addition, the models are compared as the maxdepth parameter varies. The accuracy results are shown in Table 3. Increasing maxdepth does not improve accuracy overall. When the setting increases from 5 to 180, the accuracy of the RF and XGB models decreases by 0.0021 and 0.0036, respectively. The accuracy of the ET model rises from 0.8635 at a setting of 5 to a peak of 0.8837 at 15 and then declines steadily to 0.8801 at 180, a net change of 0.0166 relative to the first setting.

Table 3. Model accuracy for different maxdepth settings

maxdepth	RF	ET	XGB
5	0.8897	0.8635	0.8907
15	0.8898	0.8837	0.8869
25	0.889	0.8812	0.8866
90	0.8882	0.8811	0.8875
120	0.8882	0.8808	0.8873
150	0.888	0.8804	0.8872
180	0.8876	0.8801	0.8871

Training times are shown in Table 4. For the RF and ET models, increasing the maxdepth parameter adds no consistent time overhead, whereas the runtime of the XGB model grows markedly, from 68.63 s to 492.63 s as the parameter increases from 10 to 100.

Table 4. Model training time (s) for different maxdepth settings

maxdepth	RF	ET	XGB
10	21.89	32.33	68.63
25	24.4	44.71	297.96
40	23.43	30.35	485.38
55	24.01	27.02	537.97
70	23.99	27.02	557.08
85	23.02	29.29	554.83
100	21.89	38.67	492.63

Based on a comprehensive analysis of model performance, the hyperparameters of the three models were set as shown in Table 5. Both n_estimate and maxdepth were set to 5 for XGB, 25 for ET, and 15 for RF.

Table 5. Model parameters in the cascade forest

Model	n_estimate	maxdepth
RF	15	15
ET	25	25
XGB	5	5

Each of the three models has its own advantages in classification performance. The RF model has low variance and bias, and thus has the highest accuracy and the fastest training in the experiments. The ET model further reduces variance relative to RF at the cost of increased bias, resulting in slightly lower classification accuracy. XGBoost, a typical representative of gradient boosting ensemble learning algorithms, has very high accuracy, but its time overhead is relatively large. The diversity of the cascaded models directly affects classification performance. The experimental results for several cascade forests, each cascading more than one model, are shown in Table 6. The RF+ET+XGB model reaches an accuracy of 89.81%. Taking into account both accuracy and running time, the three models RF, ET, and XGB are chosen to form the cascade forest module of the deep forest.

Table 6. Classification and prediction results for various cascade forests

Cascade forest	F1/%	Accuracy/%	Training time/s
RF+ET+XGB+LR	93.57	88.83	242
RF+ET+XGB	93.32	89.81	93
RF+ET	93.00	88.52	21
RF+XGB	93.09	88.86	72
ET+XGB	92.01	88.65	79

To highlight the advantages of the deep forest algorithm, this paper compares it on the same dataset with the traditional machine learning algorithms logistic regression (LR), support vector machine (SVM), and decision tree (DT), with a deep convolutional neural network (DCNN), and with ensemble algorithms such as Random Forest (RF) and XGBoost, with some of the hyperparameters of each algorithm adjusted for the comparison. The performance of each model on the different metrics is shown in Table 7. The deep forest model achieves higher accuracy than the traditional machine learning algorithms in predicting behavior. Compared with the deep convolutional neural network, its advantage in prediction accuracy is not obvious, but its training time is about 1/20 of that of the DCNN. After the temporal attention mechanism (TAM) is incorporated into the deep forest model and the adaptive weight combination strategy (AWS) is applied, prediction accuracy reaches 92.3%, a clear improvement over the 89.9% of the plain deep forest model.

Table 7. Performance of each model on different metrics

Model	F1	Accuracy/%	Precision/%	AUC	Training time/s
LR	0.894	83.62	91.94	0.731	7.18
SVM	0.925	87.4	91.07	0.803	187
DT	0.915	88.52	88.38	0.898	2.56
XGBoost	0.941	88.9	91.12	0.911	12.3
LightGBM	0.903	89.07	91.83	0.888	7.13
DCNN	0.902	89.1	92.11	0.887	1211
DF	0.909	89.9	89.08	0.833	61.12
DF+TAM+AWS	0.934	92.3	93.11	0.891	67.99

3.2 Application analysis

3.2.1 Experimental setup

The user profile constructed with the method of this paper is applied to a learning path recommendation system. The study takes 90 learners as research subjects and randomly assigns them to two groups, an experimental group and a control group. Learners in the experimental group used a recommender system in which user profiles constructed with the method of this paper were applied to recommend learning paths. Learners in the control group used a recommender system in which the user profiles were constructed with a single model.

Both groups of learners study Chinese topics and cultural topics in the recommender system, which contains 66 topic items for 23 Chinese topics, corresponding to 120 tasks, and 77 learning tasks for 32 cultural topics.

3.2.2 Learner learning behavior analysis in the learning path recommender system

In this experiment, six learning behavior indicators were defined for better testing:

1) Total learning time (t_total).

2) Total number of clicks on learning objects (n_total).

3) Test scores (g_tests): the average of each test score.

4) Ratio of the number of visits to recommended learning paths to the number of visits to unrecommended learning paths (n_rec lp/n_unrec lp).

5) The ratio of the time spent visiting recommended learning objects to the time spent visiting unrecommended learning objects (t_rec lo/t_unrec lo).

6) The ratio of the number of visits to recommended learning objects to the number of visits to unrecommended learning objects (n_rec lo/n_unrec lo).

The online learning behavior data generated by the experimental and control groups were analyzed in SPSS using the independent samples t-test to determine whether the differences are significant (significance level of 0.05). The results are shown in Table 8. Test scores do not differ significantly between the two groups (P = 0.236), and the ratio of visits to recommended versus unrecommended learning paths differs only marginally (P = 0.049), most likely because learners in the control group visited a larger number of learning objects in the allotted time, including those recommended by the recommender system. The total learning time of the experimental group is lower than that of the control group, with P = 0.013 < 0.05, indicating a significant difference between the two groups in learning time. Similarly, the total number of clicks on learning objects differs significantly between the two groups (P = 0.005 < 0.05). This suggests that applying the user profile constructed with the method of this paper in the recommender system can optimize the learning process and improve learning efficiency.

Table 8. Experimental results

Behavior pattern	Experimental group	Control group	P
t_total	37965	38044	0.013
n_total	2365	2306	0.005
g_tests	7.18	6.84	0.236
n_rec lp/n_unrec lp	2.74	2.59	0.049
t_rec lo/t_unrec lo	2.75	0.59	0.002
n_rec lo/n_unrec lo	3.66	0.94	0.003
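The statistic behind an independent samples t-test can be sketched with the standard library. The per-learner data of the study are not available here, so the two samples below are hypothetical; the sketch computes the pooled-variance t statistic and its degrees of freedom, and leaves the P-value lookup to a statistics package such as SPSS.

```python
import math
import statistics


def independent_t(sample_a, sample_b):
    """Pooled-variance (Student) t statistic for two independent samples.

    Returns (t, degrees_of_freedom).
    """
    na, nb = len(sample_a), len(sample_b)
    va = statistics.variance(sample_a)     # sample variance (n - 1 denominator)
    vb = statistics.variance(sample_b)
    pooled = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / se
    return t, na + nb - 2


# Hypothetical per-learner values for one indicator in each group.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 3, 4, 5, 6]
t, df = independent_t(group_a, group_b)
print(round(t, 4), df)                     # -1.0 8
```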

3.2.3 Learner satisfaction survey analysis in the learning path recommender system

To understand how learners used and subjectively evaluated the user profile constructed with the method of this paper after it was applied to the personalized learning path recommendation system, a questionnaire survey was conducted among the learners in the experimental group.

The questionnaire uses a five-point scale, with scores from 5 to 1 indicating strongly agree, agree, neutral, disagree, and strongly disagree; the score represents the student's degree of satisfaction with the recommender system. Forty-five questionnaires were distributed and all 45 were returned, a response rate of 100%. The questionnaire items are as follows:

Q1: The learning resources corresponding to the paths recommended in the system are exactly what I want.

Q2: The sequence of learning activities recommended by the system is exactly in line with my learning habits.

Q3: I like the personalized learning path provided by the recommendation system.

Q4: It is easier for me to learn according to the recommended path.

Q5: I can find learning resources faster according to the recommended path.

Q6: My study time has decreased after using the system.

Q7: After using the system, the number of times I study has increased.

Q8: After using the system, my study goals are clearer.

The statistical results of the questionnaire are shown in Figure 5. The percentage of those who agreed with the above eight questions was 86.1%, 92.5%, 88.2%, 82%, 90.6%, 89.2%, 84.4% and 84.8% respectively. It is evident that the majority of learners have a positive attitude towards the user profile created by the method of this paper when it is implemented in the personalized learning path recommendation system.

4 Conclusion

This paper uses the deep forest model to predict the labels in the user profile system. It incorporates a temporal attention mechanism into the model for dynamic updating and designs an adaptive weight combination strategy to optimize the model. Constructing the user profile with this method significantly improves prediction accuracy, which reaches 92.3%. When applied to a personalized learning path recommendation system for students, the user profile helps optimize the learning process and improve learning efficiency. Between 82% and 92.5% of students agree with each statement about the recommendation effect. These results indicate that the user profile constructed with the method of this paper has a good predictive effect.

Funding:

This research was supported by the Teaching Reform Research Project of Ordinary Undergraduate Universities in Hunan Province: Design and Practice of Collaborative Learning Activities for the “Java Programming” Course Based on Student Portraits (Project Number: 202401001871).

Language: English

Publication frequency: 1 issue per year
Journal subjects: Biology, Biology, other, Mathematics, Applied Mathematics, Mathematics, General, Physics, Physics, other

Jin Li

Pin Zhong

Published online: 17 March 2025

Submitted: 26 Oct. 2024

Accepted: 08 Feb. 2025

DOI: https://doi.org/10.2478/amns-2025-0295

Keywords: Deep forest model, Temporal attention mechanism, Adaptive weighting, User portrait

© 2025 Jin Li et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
