Optimization and Recommendation System Design of Digital Resources for Civic and Political Education for College Students
Published Online: Sep 25, 2025
Received: Jan 12, 2025
Accepted: Apr 30, 2025
DOI: https://doi.org/10.2478/amns-2025-1007
© 2025 Song Du, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
Teaching the Civic and Political Theory Course well requires, most fundamentally, fully implementing the Party’s education policy and answering the basic questions of what kind of people to cultivate, how to cultivate them, and for whom. The Civic and Political Theory class is an important platform for civic and political education, an important channel for publicizing socialist core values, and an important position for strengthening the study of socialist thought with Chinese characteristics in the new era [1-2]. Its level and effect affect the effectiveness of Civic and Political Education in colleges and universities. Enhancing its teaching effectiveness and promoting its innovative reform depend on information technology as a medium, on building an integrated team of teachers for teaching and research, and on optimizing the digital resources of Civic and Political Education, so as to strengthen the ideological, theoretical and practical effectiveness of the class [3-6]. Building on theoretical research into the efficient integration of information technology with civic and political education and teaching, practical teaching experience in colleges and universities can be summarized and explored to provide a new development path for improving the effectiveness of that teaching [7-9].
At present, most college civic education is taught collectively in large or open classes. This approach suffers from a single form, poor relevance, a lack of synergy, and the inability to form a personalized, collaborative nurturing mechanism. In view of these shortcomings and the new need to optimize the digital resources of college civic education, a course recommendation system for college civic education can be designed [10-13]. The system follows the general design ideas of software engineering. Its development and application provide a new path for college civic education, help improve students’ overall performance in civic education courses, and contribute to an atmosphere of civic education in colleges and universities characterized by teacher-student collaboration, diversified forms, and personalized innovation [14-17].
This paper improves the similarity calculation of the collaborative filtering algorithm by adding a popular resource penalty factor and a time decay penalty factor. At the same time, a content-based recommendation algorithm is used to extract features from the labels of the course Civics resources. The similarities of the content-based and user-based recommendation algorithms are fused with certain weights to give the similarity of the hybrid recommendation algorithm, which is used to predict the recommendation effect for the course Civics and Political Science resources. Finally, the accuracy and adaptability of the hybrid recommendation algorithm are analyzed.
This paper uses a behavioral dataset of college students taking online courses on an educational platform. The dataset can serve as a data source for big-data competitions and can also be used by researchers studying ideological and political education. From it, the learning records of student users who chose the Civic and Political Education course are selected for this study. To recommend Civic and Political Education course resources accurately, the content of each table in the dataset must first be understood, which in turn facilitates data preprocessing.
The raw dataset contains many feature columns that cannot be used in later training, as well as missing rating values, both of which hinder training of the improved algorithmic model, so the original dataset needs to be cleaned. Where the format of some data cannot be consumed by the model, the format must also be converted. After these steps the data may still be scattered across tables without a clear logical connection, so some feature columns are merged through data integration to obtain the final experimental dataset.
Data cleaning. The first step of most dataset preprocessing is data cleaning, which addresses missing values, invalid feature columns, duplicate data, and similar problems. So that the algorithmic model can use the dataset more efficiently, the following operations are applied:
Deletion of useless feature columns. Some datasets contain feature columns with null or duplicate values, which carry no valid information. This paper therefore deletes the following feature columns to simplify subsequent operations: registration time, user password, user name, login time, and number of followed courses in the user personal information dataset; course entry time and instructor name in the course resource information dataset; and the flag indicating whether the course was evaluated in the user course selection dataset.
Filling of missing values. The user course evaluation dataset contains a large amount of missing rating data. Common ways of handling missing values are filling in the missing content, deleting the feature columns that contain missing values, and training directly on the dataset with missing values. The latter two cause information loss, which affects the final experimental results. This paper therefore mainly fills in the missing content.
Data transformation. Normalization, discretization and feature coding are common data conversion operations, which aim to put the data into a format the algorithmic model can consume. This paper mainly standardizes the course length and study duration feature columns into a unified format.
Data integration. This paper studies user course rating behavior data, which is spread over four separate datasets, so the four datasets must be integrated. The correspondence between them is shown in Figure 1.
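The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the paper: all column names (`user_id`, `course_id`, `rating`, and the entries of `DROP_USER_COLS`) are hypothetical, the fill strategy (user mean with a global-mean fallback) is one plausible choice that the paper does not specify, and only three of the four tables are joined; the fourth would follow the same pattern.

```python
from statistics import mean

# Hypothetical column names; the real schema is not given in the paper.
DROP_USER_COLS = {"register_time", "password", "user_name",
                  "login_time", "followed_courses"}


def drop_columns(rows, unwanted):
    """Return copies of the row dicts without the unwanted keys."""
    return [{k: v for k, v in row.items() if k not in unwanted} for row in rows]


def fill_missing_ratings(ratings):
    """Fill None ratings with the user's mean rating (assumed strategy),
    falling back to the global mean for users with no known rating."""
    known = [r["rating"] for r in ratings if r["rating"] is not None]
    global_mean = mean(known)
    by_user = {}
    for r in ratings:
        if r["rating"] is not None:
            by_user.setdefault(r["user_id"], []).append(r["rating"])
    filled = []
    for r in ratings:
        value = r["rating"]
        if value is None:
            user_known = by_user.get(r["user_id"])
            value = mean(user_known) if user_known else global_mean
        filled.append({**r, "rating": value})
    return filled


def to_minutes(duration):
    """Normalize an 'HH:MM:SS' string or a number of seconds to minutes."""
    if isinstance(duration, str):
        h, m, s = (int(part) for part in duration.split(":"))
        return h * 60 + m + s / 60
    return duration / 60


def integrate(users, courses, ratings):
    """Attach user and course attributes to every rating record by ID."""
    user_by_id = {u["user_id"]: u for u in users}
    course_by_id = {c["course_id"]: c for c in courses}
    return [{**user_by_id[r["user_id"]], **course_by_id[r["course_id"]], **r}
            for r in ratings]
```

Filling before integrating keeps the rating logic independent of the other tables, so the join only has to look records up by ID.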

Schematic diagram of data integration
To improve the timeliness, accuracy and applicability of the Civic Education recommendation model, this paper proposes a hybrid recommendation algorithm based on collaborative filtering and resource content. The flow of the algorithm is shown in Figure 2. Because the college students using the Civic and Political Education Learning System form a relatively fixed group at the same level and share a large number of commonly rated resources, Pearson similarity is used as the similarity measure of the recommendation model. Pearson similarity alone, however, copes poorly with sparse student and resource data, so penalty factors are introduced into the Pearson correlation coefficient to compensate for data sparsity. To address the system’s cold-start problem, relevant labels are assigned to Civics resources when they are uploaded. A content-based recommendation algorithm extracts features from these labels, analyzes the similarity between the target resource and other resources, predicts ratings for the new resource, and uses the predicted ratings to populate the original student-Civics resource matrix. In this way the cold-start problem [18-19] can be addressed effectively.

Hybrid recommendation algorithm
To improve recommendation accuracy, this paper introduces two penalty factors into the original Pearson similarity calculation [20-21]: the popular resource penalty factor and the time decay penalty factor. The similarity calculation is improved in the following steps:
The first step is to construct the user-resource scoring matrix. The construction of the matrix is completed according to the number of users and resources in the dataset and the rating information of users on resources.
In the second step, the Pearson correlation coefficient is used to calculate the similarity between the Civic resources, as shown in equation (1). In the formula,
In the third step, the weight of the popular resource penalty factor is added. The course Civics learning system in this paper aims to give personalized recommendations of high-quality Civics resources that students are interested in, so the influence of popular resources on the similarity must be reduced and the weight of ratings on non-popular resources appropriately increased to achieve accurate recommendation. The introduced weights are shown in Equation (2), where
In the fourth step, a time decay penalty factor is added to account for how users’ browsing of resources changes over time. Taking the Civics course as an example, as teaching progresses the content and technical points that students learn become deeper, and the Civics elements of each chapter are closely tied to specific knowledge points. A penalty should therefore be applied to resources whose ratings are separated by too long an interval. The time decay weights are shown in equation (3), where
Combining the two penalty factors with the Pearson correlation coefficient not only reduces the rating weight of popular resources but also reflects how user preferences shift over time. The similarity formula after combining the two penalty factors is shown in equation (4).
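The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the formulas of equations (1)-(4): only the Pearson correlation is standard. The logarithmic popularity weight, the exponential time decay, the multiplicative combination, the parameter `alpha`, and all function names are hypothetical choices made for this sketch. Ratings are assumed to be stored as `ratings[user][resource] = (rating, timestamp)`.

```python
import math


def pearson(ratings, i, j):
    """Pearson correlation between resources i and j over users who rated both."""
    common = [u for u, r in ratings.items() if i in r and j in r]
    if len(common) < 2:
        return 0.0
    xi = [ratings[u][i][0] for u in common]
    xj = [ratings[u][j][0] for u in common]
    mi, mj = sum(xi) / len(xi), sum(xj) / len(xj)
    num = sum((a - mi) * (b - mj) for a, b in zip(xi, xj))
    den = (math.sqrt(sum((a - mi) ** 2 for a in xi))
           * math.sqrt(sum((b - mj) ** 2 for b in xj)))
    return num / den if den else 0.0


def popularity_weight(ratings, i, j):
    """Assumed penalty: shrinks as the more popular of the two resources
    collects more ratings; equals 1 when a resource has a single rating."""
    n_i = sum(1 for r in ratings.values() if i in r)
    n_j = sum(1 for r in ratings.values() if j in r)
    return 1.0 / (1.0 + math.log(max(n_i, n_j, 1)))


def time_weight(ratings, i, j, alpha=0.1):
    """Assumed penalty: mean exponential decay of the gap between the times
    at which each common user rated the two resources."""
    gaps = [abs(r[i][1] - r[j][1]) for r in ratings.values() if i in r and j in r]
    if not gaps:
        return 0.0
    return sum(math.exp(-alpha * g) for g in gaps) / len(gaps)


def improved_similarity(ratings, i, j, alpha=0.1):
    """Pearson similarity scaled by both (assumed) penalty factors."""
    return (pearson(ratings, i, j)
            * popularity_weight(ratings, i, j)
            * time_weight(ratings, i, j, alpha))
```

With these forms, two resources rated identically by the same users score lower when they are widely rated or when the ratings are far apart in time, which is the qualitative behavior the text describes.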
To solve the cold-start problem that arises when new college students enter the system, and still produce accurate recommendations, the content information of the resources must also be optimized and quantified. Labels describe the content and form of resources in a generalized way. Labeling resources helps users quickly filter and browse the resources they want and also facilitates the system’s mining and analysis of resource data.
Firstly, the weight value of the label on the resource is defined and the resource-label matrix is constructed as shown in equation (5). Where
After constructing the resource-label matrix, the weight value of the label to the user is calculated as shown in equation (6). Where
Then the user’s interest features and the label attributes of the resources are vectorized using the method of spatial vector representation, where the user’s interest features can be denoted as
Combining the improved Pearson similarity
According to the improved hybrid algorithm similarity calculation to find the set of nearest neighbors of the target resource, the predicted score of user
Finally, for the target user, all items are traversed, the resources with no behavioral record for that user are selected, and the to-be-recommended list is generated by sorting these resources in descending order of predicted score. The top N resources in this list form the Top-N recommendation list.
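The content-based similarity, the weighted fusion, the rating prediction and the Top-N selection described above can be sketched together as follows. This is a hedged illustration, not the paper's equations: cosine similarity over label-weight vectors, the fusion weight `lam`, the neighbourhood size `k`, and the similarity-weighted average used for prediction are all assumptions of this sketch, and every function name is hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse label-weight vectors (dicts)."""
    dot = sum(w * v.get(label, 0.0) for label, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def hybrid_similarity(cf_sim, content_sim, lam=0.5):
    """Weighted fusion of collaborative and content-based similarity."""
    return lam * cf_sim + (1.0 - lam) * content_sim


def predict(user_ratings, target, sim, k=2):
    """Similarity-weighted average of the user's ratings on the k rated
    resources most similar to `target` (assumed prediction rule)."""
    neighbours = sorted(
        ((sim(target, j), rating) for j, rating in user_ratings.items() if j != target),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    positive = [(s, rating) for s, rating in neighbours if s > 0]
    weight_sum = sum(s for s, _ in positive)
    if not weight_sum:
        return 0.0
    return sum(s * rating for s, rating in positive) / weight_sum


def top_n(user_ratings, all_items, sim, n=5, k=2):
    """Rank resources the user has no record for by predicted score."""
    candidates = [item for item in all_items if item not in user_ratings]
    scored = [(item, predict(user_ratings, item, sim, k)) for item in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in scored[:n]]
```

Passing the similarity as a callable keeps the prediction and ranking code independent of how the collaborative and content-based parts are weighted.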
Depending on the recommendation task, Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) are commonly used to evaluate the accuracy of predicted user ratings of items, while Precision and Recall are used to evaluate the quality of the recommendation lists generated by recommendation systems.
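For reference, the two rating-prediction metrics named above follow their standard definitions and can be computed as in this short Python sketch; the function names and the sample values in the usage note are illustrative only.

```python
import math


def mae(predicted, actual):
    """Mean Absolute Error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root Mean Square Error between predicted and actual ratings."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))
```

For example, with predictions `[4, 3, 5]` and actual ratings `[5, 3, 3]` the absolute errors are 1, 0 and 2; RMSE penalizes the larger error more heavily than MAE does.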
Precision indicates what proportion of the items in the recommendation list the user actually likes, and is defined as shown in Equation (10).
where
The Recall metric is used in recommender systems to measure the proportion of the number of user’s favorite items included in the recommendation list, which is defined as shown in Equation (11), i.e., the number of user’s favorite items in the recommendation list is divided by the number of all items that the user actually likes. Therefore, a higher Recall value indicates that the recommender system can better cover the user’s preferences and provide more recommended items that match the user’s interests.
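The two list-based metrics can be sketched in Python as below, following the verbal definitions above. The F1 value, which the evaluation later reports, is included using its standard definition as the harmonic mean of Precision and Recall; the function names are illustrative.

```python
def precision_recall(recommended, liked):
    """Precision and Recall of a recommendation list against the set of
    items the user actually likes."""
    hits = len(set(recommended) & set(liked))
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(liked) if liked else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of Precision and Recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0
```

A longer recommendation list can only keep or raise Recall, while Precision may fall, which is why the two are usually reported together.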
The online resources of the learning resource recommendation system commonly include test questions, documents, PPT slides, video, audio, source code, pictures, graphics, links to experimental platforms and other types. Each resource is labeled by type through the “sourceType” field in the database. The categories stored in the sourceType field for the digital resources of Civic and Political Education are shown in Table 1 below. Because a running system is involved, the table name is not provided, and only some examples of field information are given. Each type of resource in the table also carries detailed labeling fields, such as name, number, original file name, and storage file name. This annotation, categorization and automatic marking supports better management and utilization of learning resources.
Learning resource classification
Resource type | sourceType value |
---|---|
Documentation | SourceType=6 |
Video | SourceType=1 |
Audio | SourceType=3 |
Source code | SourceType=5 |
Picture | SourceType=2 |
Text | SourceType=4 |
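
In application code, the type codes of Table 1 could be mirrored by an enumeration such as the following Python sketch. The class name `SourceType` and the helper `source_type_label` are hypothetical and are not taken from the system described here; only the numeric values come from the table.

```python
from enum import IntEnum


class SourceType(IntEnum):
    """Resource type codes as listed in Table 1."""
    VIDEO = 1
    PICTURE = 2
    AUDIO = 3
    TEXT = 4
    SOURCE_CODE = 5
    DOCUMENTATION = 6


def source_type_label(value):
    """Human-readable label for a stored sourceType value."""
    return SourceType(value).name.replace("_", " ").capitalize()
```

Using an `IntEnum` keeps the stored integers unchanged while rejecting codes that are not in the table.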
In the process of video labeling, in addition to common test attributes such as the video identifier, question number and title, difficulty and similarity attribute fields are also labeled in this paper for better management and utilization of learning resources. The collaborative filtering hybrid recommendation algorithm for Civic and Political Education digital resources designed in this paper targets intelligent video recommendation based on learners’ video learning records. On this basis, this paper further carries out the requirement analysis and functional design for learning resource recommendation in the intelligent assessment system.
This paper draws on the ideas of student stratification and resource partitioning, and uses statistical methods to categorize the learning styles and growth patterns of students in a class based on the historical click-through rates of Civic Education videos in a university.
The students were categorized into “Class A students-active learners, Class B students-potential learners, and Class C students-inactive learners” according to the active degree of clicking on the videos. The data curves of the student categories are shown in Figure 3. We can determine the learner identity labels of the three groups and summarize their group characteristics:

Historical data curve
Compared with the other two groups, learners in Group A tend to watch more educational video resources, and their learning ability values are always at a high level; we define learners in this group as active learners. In general, they are not only highly engaged in learning but also more capable of learning. Compared with Group A, learners in Group B watched significantly fewer educational video resources. However, their proficiency is still high and almost equal to that of Group A. We refer to the learners in this group as potential learners: although they have studied only a small number of educational video resources, they are highly competent and have great potential to be tapped. Learners in Group C interact with the fewest educational video resources and have a relatively low competency value. Compared with the other two groups, they are inactive throughout the learning process and their learning outcomes are poor; we define this group as inactive learners. They seldom take part in learning from educational video resources and their own ability is weak, so it is difficult to generalize a regular learning pattern from their behavior.
The three types of students differ in how often they click on videos, with category A above category B and category B above category C; their average click counts are 75.89, 66.11 and 58.17, respectively. For category A students, the main aim is to spur and stimulate interest, so less difficult test questions are recommended less often, which matches the challenge that their growth in knowledge and ability requires and stimulates learning motivation. For category B students, who have a moderate foundation, test questions of moderate difficulty are recommended according to the “three-zone theory” proposed by Prof. Nordic, to ensure the continuity of their learning. For category C students, encouragement is the main aim, and difficult questions are recommended less often to protect their learning motivation.
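The stratification and the per-category difficulty strategy described above can be illustrated with the following Python sketch. It is a simplified reading under stated assumptions: the click thresholds `high` and `low` are hypothetical values placed between the reported group averages, difficulty is assumed to lie in [0, 1], and the linear weighting functions are illustrative rather than the strategy implemented in the system.

```python
def learner_class(clicks, high=70.0, low=60.0):
    """Assign a learner category from video click activity.
    The thresholds are hypothetical, not taken from the paper."""
    if clicks >= high:
        return "A"
    if clicks >= low:
        return "B"
    return "C"


def difficulty_weight(category, difficulty):
    """Assumed weighting of a resource with difficulty in [0, 1]:
    A favours harder items, B moderate ones, C easier ones."""
    if category == "A":
        return difficulty
    if category == "C":
        return 1.0 - difficulty
    return 1.0 - 2.0 * abs(difficulty - 0.5)


def rerank(category, scored):
    """Re-rank (item, score, difficulty) triples for a learner category."""
    ranked = sorted(
        scored,
        key=lambda triple: triple[1] * difficulty_weight(category, triple[2]),
        reverse=True,
    )
    return [item for item, _, _ in ranked]
```

Applying the weight as a multiplier on the predicted score leaves the underlying similarity model untouched and only changes the final ordering.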
In this paper, 2597 logs with almost zero engagement were removed from the 25000 available learning behavior logs, leaving 22403. The experiment involves 277 active learners, 534 potential learners and 358 inactive learners, with 13458, 7848 and 1097 behavioral logs respectively. The performance analysis has three parts: (1) analyzing the basic performance of the three groups of learners on precision metrics and investigating how some of the metrics relate to the recommendation list length; (2) comparing the performance of the three groups on non-precision indicators, highlighting the effectiveness of the personalized exploration strategy; and (3) taking active learners as the target group and comparing the collaborative filtering hybrid recommendation algorithm proposed in this paper with existing recommendation algorithms in various respects.
The results of video similarity at different recommendation list lengths (N) are shown in Figure 4 below. At N=20, the video similarity coefficient is initially lower than at N=10, but once the number of video clicks reaches about 1100 it exceeds that at N=10 and rises above 0.9. At N=5, the video similarity of the hybrid algorithm’s recommendations stays around 0.4, always lower than at N=10 and N=20. Active learners, who are highly engaged and capable, perform better in the learning process, and for them the similarity predictions of the improved hybrid recommendation algorithm are closer to the real results. The results also suggest that the longer the recommendation list, the more accurate the hybrid algorithm’s video similarity results.

Video similarity results for different recommendation list lengths
In this paper, the cumulative hit rate of students on videos is investigated under different recommendation list lengths. Specifically, in the recommendation experiments a student is selected, and his or her own historical learning data are combined with the historical data curve of the student’s Civic Education learner category. The hybrid recommendation algorithm is then used to recommend exercises for the student. Pearson similarity is used in the recommendation process to predict the student’s correct response rate in the next round of tests and to optimize the Top-N recommendation list, and the response experiment is conducted after the recommendation is completed.
The results of resource recommendation for the three groups of learners are shown in Figure 5 below. The accuracy of the recommendation results increases with the length of the recommendation list (N=5, N=10, N=20). The cumulative video resource hit rate is very high for both active and potential learners. Potential learners had the highest cumulative hit rates at N=5, N=10 and N=20 (96.28%, 98.65% and 99.03%), followed by active learners (95.44%, 96.97% and 98.53%), whereas inactive learners had very low cumulative hit rates (39.62%, 43.41% and 47.08%). The reason is that the first two types of learners provide sufficient behavioral logs from which their learning patterns can be mined, which greatly increases the likelihood of a successful recommendation. By contrast, inactive learners are much less engaged and capable than the other two groups, so it is difficult to summarize their learning patterns and accurately recommend Civic Education videos for them.
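A hit rate of this kind can be computed as in the following Python sketch. The exact definition of the cumulative hit rate is not restated here, so the sketch assumes a common one: a recommendation round counts as a hit if the video the learner actually watched appears among the top N recommended videos, and the rate is the share of rounds that are hits. The function name and the data layout are hypothetical.

```python
def cumulative_hit_rate(rounds, n):
    """Share of rounds whose actually watched video is in the top-n list.

    rounds: list of (ranked_recommendations, watched_video) pairs.
    """
    if not rounds:
        return 0.0
    hits = sum(1 for ranked, watched in rounds if watched in ranked[:n])
    return hits / len(rounds)
```

Under this definition the rate can never decrease as N grows, because every hit within the top 5 is also a hit within the top 10 and the top 20.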

Resource recommendation results for the three groups of learners
In this paper, active learners, potential learners and inactive learners are explored in terms of accuracy, recall and

Accuracy of the algorithm’s recommendation results
The zone of proximal development (ZPD) theory states that in the next stage of learning, learners should be recommended challenging content that motivates them and helps them reach a higher cognitive level. Therefore, to verify the rating prediction effect of the collaborative filtering hybrid recommendation algorithm model proposed in this paper, the difficulty level of the recommended content is investigated. The students’ adaptability results for the recommendation algorithm are shown in Figure 7. In light of the ZPD theory, we analyze how the hybrid recommendation algorithm, with its similarity improved by the penalty factors, affects rating prediction for the three groups of learners:
For active learners, the average difficulty of the educational video resources recommended to them is 0.1163-0.1399 higher than the average difficulty of the videos they actually watched. This result is reasonable for highly capable active learners: the proposed personalized exploration strategy provides more difficult learning content to stimulate their potential and explore the upper bound of their ZPD. For potential learners, whose ability is also high, the recommended videos are likewise slightly more difficult than the videos they have actually studied, but the difference is only about 0.0662-0.0935, smaller than that for active learners. The reason is that potential learners watch fewer educational videos, and a slight increase in difficulty can effectively motivate them and encourage them to participate more in learning. Increasing the difficulty slowly is therefore conducive to maximizing their potential. For inactive learners, a very small number of recommended educational videos were slightly more difficult, and most were still within their ability. Overall, the difference in difficulty ranges from -0.0173 to 0.0191, which is consistent with their weaker abilities. The difficulty of the learning resources should not differ too much from the learners’ ability, otherwise it may cause cognitive overload.
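The difficulty gap discussed above is simply the mean difficulty of the recommended videos minus the mean difficulty of the videos actually watched. A minimal Python sketch follows; the function name is hypothetical and the sample difficulty values in the usage note are made up for illustration, not taken from the experiment.

```python
from statistics import mean


def mean_difficulty_gap(recommended, watched):
    """Mean difficulty of recommended videos minus that of watched videos.
    A positive value means the recommendations are, on average, harder."""
    return mean(recommended) - mean(watched)
```

For instance, recommended difficulties of `[0.6, 0.7]` against watched difficulties of `[0.5, 0.55]` give a positive gap, meaning the recommended content is slightly more challenging than what the learner has already studied.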

Students’ adaptability to the recommendation algorithm
Finally, to verify the adaptive effect of the recommendation algorithm in this paper, we select the active learners, who have the highest accuracy rate among the three groups, and further analyze how well the recommended content adapts to them, in order to show that our recommendation scheme can serve learners with different abilities. Figure 8 below compares the difficulty of the Civics education videos actually viewed with that of the recommended videos at recommendation list length N=20 in the cumulative hit rate experiment. The difficulty of the educational videos recommended by the proposed algorithm is mostly higher than or equal to the difficulty of the videos actually studied. The absolute value of the difficulty difference between the two lies between 0.001 and 0.016 (see the shaded part in the figure), which keeps the difficulty of the learning content well within the learners’ ability while still providing some challenging content. Thus, the improvement brought by the personalized exploration strategy makes our proposed recommendation scheme well-adapted.

Difficulty of viewed and recommended educational videos
This paper improves the collaborative filtering recommendation algorithm by introducing two penalty factors, for popular resources and for time decay, into the similarity calculation of the traditional algorithm. The resource similarity of the hybrid recommendation algorithm is then used for rating prediction, addressing the decline in recommendation quality that the traditional algorithm suffers when cold start leaves the rating data too sparse.
Students are categorized into “A active learners, B potential learners and C inactive learners” according to the degree of their video clicking activity. The frequency of video hits of the three types of students is A (75.89 hits) > B (66.11 hits) > C (58.17 hits). The cumulative hit rate of students increased with the length of the recommendation list (N=5, N=10, N=20). Cumulative hit rates for active learners and potential learners were greater than 95% for N=5, N=10, and N=20, while cumulative hit rates for inactive learners were less than 50%. Active learners consistently had the highest accuracy (97.43%), followed by potential learners (83.77%) and inactive learners (63.16%). The more videos active, potential and inactive learners watch, the larger the F1 value (88.26%, 77.26% and 43.71%, respectively) and the better the model performs. The average difficulty of the educational video resources recommended to active, potential and inactive learners exceeds the average difficulty of the videos they actually watched by 0.1163-0.1399, 0.0662-0.0935 and -0.0173 to 0.0191, respectively; the near-zero gap for inactive learners is consistent with their weaker ability. The difficulty of the educational videos recommended by the algorithm proposed in this paper is mostly higher than or equal to the difficulty of the videos actually studied, and the collaborative filtering hybrid recommendation algorithm model has good adaptability.