Learner Behavior Analysis and Optimization Strategies for Computer Network-Based Distance Education Platforms
Published Online: Mar 19, 2025
Received: Oct 15, 2024
Accepted: Feb 09, 2025
DOI: https://doi.org/10.2478/amns-2025-0485
© 2025 Xuan Lin, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
More than 20 years after reform and opening up, China’s new education system serving in-service workers throughout their working lives has taken shape. It has trained more than 20 million college and university degree holders and provided job-related technical training and continuing education for 40 million workers [1–3]. This rapid growth owes much to the rapid development of information technology: modern distance education, built on information and network technology, emerged after the 1990s in response to the needs of the times. In its basic form it combines personalized learning with centralized learning, and real-time teaching with the delivery of educational resources. With the emergence of micro-courses, MOOCs, SPOCs, mobile learning and related concepts, teaching has moved from open courseware and open learning resources to an open teaching and learning process, which makes learning more flexible and personalized. However, learners differ, and different types of learners tend to show different behavioral characteristics when they learn [4–6]. To improve the service level of network education and public satisfaction with it, more personalized teaching methods are needed to meet learners’ differentiated needs [7–8].
Network education, also known as modern distance education, is a form of adult academic education. It is a concept that arose from the application of modern information technology to education: teaching is carried out through remote network technology and environments. In contrast to face-to-face education, teachers and students are separated and teaching activities are organized without face-to-face contact. It is both an education system and a teaching mode that crosses schools and regions. Its main features are that students and teachers are separated, that a specific transmission system and communication media are used for teaching, that information is transmitted in varied ways, and that the place and form of learning are flexible [9–12]. Its advantages are that it can break through the limits of time and space, provide more learning opportunities, expand the scale of teaching, improve teaching quality, and reduce teaching costs [13–14].
Other routes to adult education include open education, the adult college entrance examination and the adult self-study examination [15]. In some policy documents issued in China, the term “modern distance education” has been replaced by “network education”. Distance education has developed for about 150 years, but it started late in China, where it grew out of the correspondence education of the early 20th century over nearly a hundred years. In the first half of that century China was in deep crisis, beset by internal and external troubles. On the one hand, large numbers of talented people were urgently needed to explore ways of saving the nation; on the other hand, the country had yet to be developed, and a century-long plan of revitalization could not do without teaching and educating people [16–19].
However, although distance education is convenient, the very richness of its educational resources makes it harder to give learners targeted learning services during the learning process. Analyzing learners by technical means on the basis of their learning behavior data has therefore become an important topic in current educational data mining [20–23].
This paper analyzes the constituent elements of online learning behavior as well as the mode of operation, constructs an analysis model of learners’ online learning behavior, and visualizes the output of learning behavior through data collection and statistical analysis. In addition, this paper utilizes cluster analysis to divide learners according to the average value of behavioral characteristics, and uses lag sequence analysis to evaluate the differentiation of learning sequences among learners from different class groups. Finally, the link between learning behaviors and performance is examined, and strong association rules between learners’ online learning behaviors are uncovered.
Drawing on behavioral science and activity theory, this paper analyzes the four major components of learners’ online learning behavior system.
Behavioral subject of online platform learning
The subject of online platform learning behavior is the learner. In the online learning space, learners build on their existing knowledge and experience, actively take in, digest and understand new knowledge and new things from the Internet, and reconstruct their own knowledge structure.
Object of online platform learning behavior
The object of learners’ online platform learning behavior is the set of learning resources available on the online learning platform. E-learning platforms now offer many kinds of multimedia learning resources, including but not limited to video courses, electronic courseware and problem sets, which are presented to learners in different ways and basically meet their learning needs.
Learning behavior tools on online platforms
Learners generally use three types of learning tools to accomplish e-learning: cognitive tools, efficacy tools and communication tools. Cognitive tools, such as paper, pens and models, raise the learner’s cognitive level and support thinking. Efficacy tools, such as search engines and learning software, help learners accomplish specific learning tasks. Communication tools, such as discussion forums, e-mail and QQ, help learners communicate and interact with others.
Behavioral community and its division of tasks
Learners, teachers and other learners constitute a behavioral community in online platform learning. The members of the community are bound to one another by rules, and learning tasks maintain the relationships between them.
When the community conducts online learning, members exchange information and share resources with each other, follow the standards and rules formulated by the group, and collaborate to complete specific learning tasks after a clear division of labor.
Information query behavior
It refers to the behavior of learners who, when they encounter problems during e-learning, use Google, Baidu or other search engines on their own to look up what they need, obtain learning information and seek answers.
Information browsing and processing behavior
It refers to the behavior of learners browsing website information, checking the learning resources provided by the learning platform, filtering and selecting, saving and collecting, organizing and annotating the acquired information.
Information application and publishing behavior
It refers to the behavior that learners organize and process the information, apply it to specific practices and internalize it into their own knowledge and ability, and finally release their learning thoughts and tips and suggestions on the learning platform.
Information sharing and communication behavior
It refers to the behavior of learners sharing what they have learned through the learning platform, conducting real-time discussions and non-real-time exchanges with teachers or learners on course-related issues.
The analytical model of learners’ online platform learning behavior is shown in Figure 1:

Analysis model of college students’ online learning behavior
First, the information provided by the online learning platform is stored in the learning behavior characteristics database. After sufficient data mining, learners’ information logging and query behaviors, information browsing and processing behaviors, information application and publishing behaviors, and information sharing and communication behaviors are analyzed in order to summarize learners’ motivation and predict their learning styles. Finally, the results of the analysis are presented visually.
The flow of data analysis of learners’ e-learning behavior is shown in Figure 2.
Data collection
Data collection in this study is divided into two main parts: static information collection and dynamic information collection. Static information is fixed data such as learners’ basic information, for example name, class, gender and other unchanging attributes. Dynamic information changes constantly, is updated in real time and covers a wider range. Static information reflects the beginning and end of things, while dynamic information reflects the process of events. After the data are collected, they are stored in the learning behavior characteristics database.
Data statistics and analysis
First, the collected data are extracted by category, invalid information is automatically filtered out, and the valid information is roughly organized, analyzed and processed. Then data mining techniques are used to analyze the retained valid data in depth. From the platform access, course learning, quiz and test, and discussion and interaction behavior data, the rules and characteristics of online learning behavior are summarized, and learners’ learning styles and psychology are predicted.
Visualization output
The results of the analysis of learners’ online learning behaviors are output using visualization tools and presented to learners, teachers and platform builders. Learners can reflect on the results of the behavioral analysis, adjust their learning methods and pace, and set clear learning goals to work towards.

Statistical analysis flow of learners online learning behavior data
The process of clustering is to divide the data objects into different clusters based on whether they are similar or not, so that the data in the same clusters have maximum similarity and the data between different clusters have maximum difference.
The K-means algorithm [24] is a partition-based clustering algorithm. Its basic idea is that, for a given sample data set, the sample points are divided into several clusters according to the distances between them, so that points within a cluster are as close together as possible and points in different clusters are as far apart as possible.
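The K-means algorithm usually uses the Euclidean distance to measure the distance between sample points. It measures the absolute distance between two points A and B in space and depends on their coordinate values. For $A=(a_1,\dots,a_n)$ and $B=(b_1,\dots,b_n)$, the calculation formula is shown in equation (1):
$$d(A,B)=\sqrt{\sum_{i=1}^{n}(a_i-b_i)^2} \tag{1}$$
Assuming that the sample set $D=\{x_1,\dots,x_m\}$ is divided into $k$ clusters $C=\{C_1,\dots,C_k\}$, the objective function is the sum of squared errors
$$E=\sum_{i=1}^{k}\sum_{x\in C_i}\lVert x-\mu_i\rVert_2^2,\qquad \mu_i=\frac{1}{|C_i|}\sum_{x\in C_i}x$$
where $\mu_i$ is the mean vector of cluster $C_i$.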
Finding the exact minimum of the sum-of-squared-errors objective function would require examining every possible cluster partition of the sample set D. The K-means algorithm therefore adopts a greedy strategy and approximates the minimum through iterative optimization. The main steps of the K-means algorithm are as follows:
1) Randomly select a certain number of sample points as the initial centers of the clusters.
2) For each remaining sample point, calculate its distance from the center of each cluster and assign it to the nearest cluster.
3) Calculate the mean of each cluster as the new cluster center.
4) Repeat steps 2) and 3) until the objective function converges.
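The steps above can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not the implementation used in the study: the toy points, the fixed random seed and the function names are assumptions, every point (including the initial centers) is assigned in each pass, and "the objective function converges" is approximated by stopping once the assignments no longer change.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, k, max_iter=100, seed=0):
    """Return (centers, labels) after iterating assignment and update steps."""
    rng = random.Random(seed)
    # Step 1: pick k distinct sample points as the initial cluster centers.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest center.
        new_labels = [
            min(range(k), key=lambda j: euclidean(p, centers[j])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the squared error no longer changes
        labels = new_labels
        # Step 3: move each center to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


def sse(points, centers, labels):
    """Sum of squared distances from each point to its assigned center."""
    return sum(euclidean(p, centers[lab]) ** 2 for p, lab in zip(points, labels))


# Hypothetical two-feature data with two visibly separate groups.
toy = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centers, labels = kmeans(toy, k=2)
print(labels, round(sse(toy, centers, labels), 3))
```

Because the result depends on the random initial centers, a practical analysis would usually repeat the run with several seeds and keep the partition with the smallest squared error.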
Lagged sequence analysis [25], also known as sequence analysis, is a statistical method for studying the sequential relationships between behaviors. It is mainly used to test the probability that one behavior occurs after another and whether that probability is statistically significant.
The lagged sequence analysis process used in this study is:
1) Define codes for the different activities.
2) Code the filtered behavioral data according to these definitions.
3) Generate coded sequences in time order.
4) Run the sequence analysis in software to obtain the frequency conversion table and the residual table.
5) Draw behavioral transformation diagrams based on the residual table.
In this study, the tool used for sequence analysis is GSEQ5.
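To make the frequency conversion table concrete, the sketch below counts lag-1 transitions (one behavior immediately followed by another) in a coded sequence. It only illustrates the idea and is not GSEQ5 output: the example sequence is hypothetical, and the nine single-letter codes are the ones the paper assigns in Table 1.

```python
from collections import Counter

CODES = "WDSKTNFHJ"  # behavior codes from the paper's coding table (Table 1)


def transition_counts(sequence, codes=CODES):
    """Count lag-1 transitions (previous behavior -> next behavior)."""
    counts = Counter(zip(sequence, sequence[1:]))
    # Return a dense table so that unobserved transitions appear as 0.
    return {a: {b: counts.get((a, b), 0) for b in codes} for a in codes}


# Hypothetical coded sequence for one learner, already ordered by time.
example = "SWWDWTNKJ"
table = transition_counts(example)
print(table["W"]["W"], table["W"]["D"], table["S"]["S"])
```

Each row of `table` is the "previous" behavior and each column the "next" behavior, which matches the direction described for the frequency conversion table.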
The participants in this study were 100 second-year university students learning on a computer-based online education platform.
The data used in this study comes from course and learner behavior data recorded in LMS (Learning Management System) and CMS (Course Management System: e.g. online courses, learning forums) databases.
The steps of behavioral path transformation analysis are shown in Figure 3. The research process can be divided into three main parts: identifying the target of analysis, generating the sequence of learning behaviors, and visualizing the learning path.

Behavioral path transformation analysis step
The target learning behaviors include two types of activities: videos and assignments, and their attributes include the course sections to which the videos and assignments belong and the start and end times of the behaviors. Generating learning behavior sequences requires transforming the raw data into an analyzable time series. The analysis of learning path visualization includes the presentation of frequency, connectivity, and centrality of behavioral variables, and the interpretation of educational and pedagogical significance on this basis. This study further analyzes the special phenomena in the path diagram and makes relevant pedagogical suggestions for the design of courses and platforms.
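The transformation from raw records to an analyzable time series can be sketched as follows. The record layout (learner id, activity code, start time, end time) follows the attributes named in the text, but the concrete rows, the timestamp format and the function name are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical raw log rows: (learner_id, activity_code, start_time, end_time).
# Timestamps in one fixed ISO-8601 format sort chronologically as plain text.
records = [
    ("u1", "W", "2024-03-01T09:00", "2024-03-01T09:20"),
    ("u2", "W", "2024-03-01T10:00", "2024-03-01T10:15"),
    ("u1", "K", "2024-03-01T09:30", "2024-03-01T09:50"),
    ("u1", "W", "2024-03-01T08:00", "2024-03-01T08:30"),
]


def build_sequences(rows):
    """Group rows by learner and order each learner's events by start time."""
    grouped = defaultdict(list)
    for learner_id, code, start, _end in rows:
        grouped[learner_id].append((start, code))
    return {
        learner: "".join(code for _start, code in sorted(events))
        for learner, events in grouped.items()
    }


sequences = build_sequences(records)
print(sequences)
```

Here `W` stands for a video activity and `K` for an assignment, so learner `u1` watched two videos and then submitted an assignment.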
The data processing approach of this study included four main steps:
1) Data selection. First, the data relevant to the two major categories of learning behavior, video and homework, were screened, including learner ID, activity number, start time and end time. A table of behavioral objects was generated, and data extraction was completed according to the mapping relationship on the right side of the figure.
2) Coding.
3) Sorting.
4) Frequency conversion and residual analysis. The frequency conversion table records the order of two activity time nodes in the direction from the previous action to the next action; the number of conversions reflects how often an activity occurs and how activities relate to one another. The standardized value of the residuals is used to show the significance of the transitions between two behaviors in either direction.
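The residual formula itself is not reproduced in this copy of the text. As a hedged illustration, the sketch below uses the adjusted residual that is standard in lag sequential analysis, $z_{ij} = (x_{ij} - e_{ij}) / \sqrt{e_{ij}(1 - p_{i+})(1 - p_{+j})}$ with $e_{ij} = x_{i+}x_{+j}/N$; whether the study used exactly this form is an assumption, and the counts are hypothetical. Values above 1.96 are conventionally read as significant at the 0.05 level.

```python
import math


def adjusted_residuals(table):
    """Adjusted residual z for every cell of a square transition table.

    table[a][b] is the observed count of behavior a followed by behavior b.
    Assumes at least one observed transition in the table.
    """
    codes = list(table)
    row = {a: sum(table[a][b] for b in codes) for a in codes}
    col = {b: sum(table[a][b] for a in codes) for b in codes}
    total = sum(row.values())
    z = {a: {} for a in codes}
    for a in codes:
        for b in codes:
            expected = row[a] * col[b] / total
            variance = expected * (1 - row[a] / total) * (1 - col[b] / total)
            # Cells with zero variance carry no information; report 0.0.
            z[a][b] = (
                (table[a][b] - expected) / math.sqrt(variance) if variance > 0 else 0.0
            )
    return z


# Hypothetical 2x2 transition counts between video (W) and homework (K).
counts = {"W": {"W": 30, "K": 10}, "K": {"W": 10, "K": 30}}
z = adjusted_residuals(counts)
significant = [(a, b) for a in z for b in z[a] if z[a][b] > 1.96]
print(significant)
```

Only the transitions in `significant` would be drawn as arrows in a behavioral transformation diagram.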
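An association rule [26–27] is generally written as an implication of the form $X \Rightarrow Y$, where $X$ and $Y$ are itemsets with $X \cap Y = \varnothing$.
Support and confidence are two important metrics in association rules. Support refers to the probability that itemsets $X$ and $Y$ appear together in a transaction of the database $D$:
$$\mathrm{support}(X \Rightarrow Y) = P(X \cup Y) = \frac{\mathrm{count}(X \cup Y)}{|D|}$$
Confidence is a measure of the strength of an association rule. It refers to the probability that a transaction containing itemset $X$ also contains itemset $Y$:
$$\mathrm{confidence}(X \Rightarrow Y) = P(Y \mid X) = \frac{\mathrm{count}(X \cup Y)}{\mathrm{count}(X)}$$
Where $\mathrm{count}(\cdot)$ is the number of transactions in $D$ that contain the given itemset and $|D|$ is the total number of transactions.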
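Support and confidence can be computed directly from a list of transactions. The sketch below is illustrative only: the transactions and the behavior labels in them are hypothetical, and the function names are not taken from the paper.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent): support of both over support of antecedent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical learners, each described by a set of discretized behavior labels.
transactions = [
    {"video_high", "homework_high"},
    {"video_high", "homework_high", "signin_high"},
    {"video_high"},
    {"signin_high"},
]
rule_support = support(transactions, {"video_high", "homework_high"})
rule_confidence = confidence(transactions, {"video_high"}, {"homework_high"})
print(rule_support, round(rule_confidence, 3))
```

In this toy data two of the four learners show both behaviors, and two of the three learners with `video_high` also have `homework_high`.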
Association rule mining consists of two main phases: generating frequent itemsets and generating strong association rules. The first phase finds all itemsets in the database whose support reaches the minimum support; it can be divided into two sub-problems, generating candidate itemsets and generating frequent itemsets from the candidates. The second phase selects, from the frequent itemsets obtained, the strong association rules that satisfy the minimum confidence. Of the two phases, generating association rules is straightforward, whereas finding frequent itemsets is more laborious because it requires searching all possible itemsets in the database. An itemset that satisfies the minimum support and contains $k$ items is called a frequent $k$-itemset.
The Apriori algorithm exploits an important a priori property of frequent itemsets: if an itemset is frequent, all of its non-empty subsets must also be frequent. The contrapositive of this property is equally important: any superset of an infrequent itemset must also be infrequent. The property is illustrated in Figure 4. Assuming that {A,B,C} is a frequent itemset, all its subsets {A}, {B}, {C}, {A,B}, {A,C}, {B,C} must also be frequent itemsets; assuming that {D} is an infrequent itemset, all its supersets {A,D}, {B,D}, {C,D}, {A,B,D}, {A,C,D}, {B,C,D}, {A,B,C,D} are also infrequent itemsets.

Frequent itemset prior properties
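When generating frequent itemsets, the Apriori algorithm builds on the results of the previous pass and searches for frequent itemsets layer by layer: the frequent $(k-1)$-itemsets $L_{k-1}$ are used to generate the candidate $k$-itemsets $C_k$, from which the frequent $k$-itemsets $L_k$ are obtained. Each layer consists of a connection step and a pruning step.
Connection step: first, the transaction database is scanned. Every item is a member of the candidate 1-itemset $C_1$, and the items whose support reaches the minimum support form the frequent 1-itemset $L_1$. Thereafter, $L_{k-1}$ is joined with itself to produce the candidate set $C_k$.
Pruning step: before generating $L_k$ from $C_k$, any candidate that has an infrequent $(k-1)$-subset is deleted, because by the a priori property such a candidate cannot be frequent. This reduces the number of candidates whose support has to be counted.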
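The layer-by-layer search with its connection and pruning steps can be sketched as below. This is a compact teaching version under stated assumptions, not the implementation used in the study: the transactions reuse the letters A to D from the a priori property example, the minimum support of 0.5 is arbitrary, and the join is a simple pairwise union rather than the prefix-based join of optimized implementations.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support} for all frequent itemsets."""
    n = len(transactions)

    def frequent(candidates):
        # Keep only the candidates whose support reaches the threshold.
        out = {}
        for c in candidates:
            s = sum(c <= t for t in transactions) / n
            if s >= min_support:
                out[c] = s
        return out

    # Level 1: every single item is a candidate.
    items = {item for t in transactions for item in t}
    level = frequent({frozenset([i]) for i in items})
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Connection step: union pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Pruning step: drop candidates that have an infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = frequent(candidates)
        result.update(level)
        k += 1
    return result


# Hypothetical transaction database over the items A, B, C, D.
transactions = [frozenset("ABC"), frozenset("AB"), frozenset("AC"), frozenset("BD")]
freq = apriori(transactions, min_support=0.5)
print(sorted("".join(sorted(s)) for s in freq))
```

In this toy run {A,B,C} is pruned without counting its support, because its subset {B,C} is not frequent.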
This section provides a diagnostic analysis of online behavioral data of learners in online platforms from a learning behavior analysis perspective. By counting the frequency of learner behaviors occurring in the online learning platform, it explores how learners engage in the online learning process. Table 1 displays the description and coding of the learning behaviors of learners on the educational platform.
Description and coding of learners’ learning behaviors on the teaching platform
Behavior name | Behavior description | Coding |
---|---|---|
Video viewing | Check the learning section video | W |
Resource download | Download learning resources | D |
Sign in | Complete the online check-in | S |
Submit homework | Submit multiple assignments | K |
Chapter test | Complete chapter test | T |
Reading notice | Notice of visit | N |
Posting | Students publish personal opinions | F |
Reply | Reply to peer posts | H |
Check the teacher’s reply | Review the teacher’s comments on homework and post content | J |
Figure 5 shows the frequency of learner behavior sequences on the distance education platform. As the figure shows, five behavioral sequences have a frequency of 0 in practice, namely “SS”, “KK”, “FS”, “FN” and “JN”, so a total of 76 effective learner behavior sequences are extracted. Among them, watching learning videos (WW) on the distance education platform is the most frequent, with 1,264 occurrences.

The learner behavior sequence frequency
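As a quick check on this count, assume that each behavior sequence is an ordered pair of two consecutive behaviors drawn from the nine codes in Table 1, which the two-letter labels such as WW suggest. The number of effective sequences then follows from simple arithmetic:

```latex
\[
\underbrace{9 \times 9}_{\text{possible ordered pairs}} = 81,
\qquad
81 - \underbrace{5}_{\text{pairs with frequency } 0} = 76 .
\]
```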
In order to increase the efficiency of independent learning and improve the learning effect, on the basis of the above model, this paper carries out clustering analysis for learning groups. In this study, the K-means clustering algorithm was used to analyze nine behavioral features after screening. Five clusters were obtained, and the results of learner behavior clustering are shown in Table 2.
Learner behavior clustering results
Categories | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 | Cluster 5 |
---|---|---|---|---|---|
W | 1437.225 | 1330.869 | 1225.081 | 1121.005 | 1014.221 |
D | 254.136 | 225.058 | 209.972 | 198.936 | 184.114 |
S | 38.996 | 32.829 | 23.771 | 21.509 | 11.756 |
K | 328.712 | 319.071 | 312.51 | 301.033 | 282.053 |
T | 191.837 | 183.081 | 178.475 | 176.841 | 171.4 |
N | 84.579 | 79.429 | 72.129 | 67.772 | 60.564 |
F | 72.308 | 66.948 | 57.793 | 48.347 | 40.028 |
H | 87.997 | 79.689 | 72.843 | 61.814 | 53.632 |
J | 62.821 | 57.78 | 51.925 | 44.821 | 37.643 |
To better compare the different clusters of learners, this paper analyzes the learning effects of learners using the distance education platform. Figure 6 compares the learning effects of the clusters. As the figure shows, the upper limit, median and mean of the learning effect decrease in the order Cluster 1, Cluster 2, Cluster 3, Cluster 4 and Cluster 5. Therefore, the e-learning groups in this paper are categorized into the following five types: excellence learners (Cluster 1), positive learners (Cluster 2), average learners (Cluster 3), procrastination learners (Cluster 4), and negative learners (Cluster 5). The excellence learners are analyzed in detail as an example.

Comparison of learning effects across clusters
There are 13 learners of the excellence type, accounting for 13% of the total number of learners, and their learning effect is higher than that of other types. As can be seen from Table 2, the values of several behavioral characteristics of the excellence type learners are higher than those of other types, which shows that the e-learning behaviors of the excellence type learners are positive and active, especially in the system interaction dimension. Excellent learners have good self-discipline, can actively complete various e-learning tasks, and obtain excellent learning results through their own efforts in learning. For excellent learners, e-learning resources should be enriched, the depth of learning should be broadened, and learners should be guided to a broader and deeper level of development.
After the overall analysis of the correlation between learning behavior sequences and academic performance, lagged sequence analysis was used to explore how learners’ behavior changes across the learning stages and to identify the differing behavioral patterns of the excellent learning group and the intermediate learning group.
The study divided the course into three learning stages, pre-course D1 (12 weeks), mid-course D2 (5-10 weeks) and post-course D3 (12-16 weeks), and calculated the adjusted residuals of the behavioral sequences of the excellent learning group and the intermediate learning group. To present the behavioral transition patterns more intuitively, the significant behavioral sequences were plotted as transition diagrams. Figures 7 and 8 show the diagrams for the three stages of the excellent learning group and the intermediate learning group, respectively.
Pre-course D1 stage
Both groups showed the sequences WW (watching videos) and NW (watching a video after reading an announcement), indicating that both groups tended to learn new knowledge and actively complete learning tasks early in the course. The excellent learners showed clear sequences of submitting assignments (NK) and taking chapter quizzes (NT) after reading announcements; downloading resources (WD), taking chapter quizzes (WT) and checking announcements (WN) after watching videos to make sure they had completed the learning tasks; taking quizzes after checking the teacher’s reply (JT); and retaking quizzes to improve their quiz scores. These sequences indicate that this group was actively constructing knowledge. Checking in after downloading resources (DS), after watching videos (WS) and after chapter quizzes (TS; teachers often post pre-class quizzes) indicates that learners enter the class early and prepare for teaching. Checking the teacher’s reply after submitting assignments (KJ), checking announcements after signing in (SN) and checking announcements after chapter quizzes (TN) indicate that learners have good study habits.
Mid-course D2 stage
Compared with the D1 stage, the variety of behavioral sequences in both groups increased significantly in the D2 stage. WK (submitting homework after watching a video), WT (completing the chapter quiz after watching a video) and NT (taking the quiz after reading announcements) serve to complete learning tasks; WH (replying to posts after watching a video) and SH (replying to posts after checking in) serve communication and interaction; and TT (retaking the quiz after a chapter quiz) and TN (viewing announcements after a chapter quiz) serve to consolidate knowledge proactively. These sequences indicate that the vast majority of learners have good learning habits and are in a good learning state. Meanwhile, the HS value increased significantly, indicating that the intermediate learning group held some irrelevant discussions before the lesson.
Late-course D3 stage
By the D3 stage, learners’ overall frequency of watching videos (WW) decreases, which is consistent with the relatively high attrition rate of online distance learning. Meanwhile, SF and SH increase significantly in both groups, indicating that the teacher organizes more discussion and interactive communication activities late in the course. Learners’ KS and TS frequencies are also higher, which indicates good continuity of course learning.

Three-stage behavioral transformation of the excellent learning group

Three-stage behavioral transformation in the medium learning group
Correlation analysis uses correlation coefficients to assess how closely variables are related. Since learning behaviors and grades are both treated as ordinal variables, Spearman’s rank correlation coefficient, which measures the strength of the monotonic relationship between two ranked variables, is used to measure the correlation between each behavioral variable and academic performance. The results are shown in Table 3.
Learning behavior variables and learning results
Behavior name | Correlation | Significance |
---|---|---|
Video viewing | 0.689*** | 0.000 |
Resource download | 0.405** | 0.004 |
Sign in | 0.075 | 0.137 |
Submit homework | 0.616*** | 0.000 |
Chapter test | 0.561*** | 0.000 |
Reading notice | 0.151** | 0.007 |
Posting | 0.231* | 0.023 |
Reply | 0.216* | 0.019 |
Check the teacher’s reply | 0.225* | 0.031 |
Note: *, ** and *** indicate significance at the 0.05, 0.01 and 0.001 levels, respectively.
Table 3 shows that all nine learning behaviors are positively correlated with examination results, which suggests that e-learning can help learners improve their course grades to some extent. The three indicators most strongly correlated with learning achievement are, in order, video viewing (0.689), homework submission (0.616) and chapter tests (0.561). Their significance values are all reported as 0.000, so they pass the significance test at the 0.001 level. This gives teachers who teach online some guidance for understanding learners’ learning accurately and evaluating it more precisely. The correlation between platform check-in and learning achievement is only 0.075 with p greater than 0.05, so this indicator shows a weak positive correlation with achievement that is not statistically significant.
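For readers who want to reproduce this kind of coefficient, Spearman’s rank correlation is the Pearson correlation of the rank-transformed data, with tied values given the mean of their ranks. The sketch below uses only the standard library; the view counts and scores are hypothetical and are not the study’s data.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one averaged rank
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: video-viewing counts and final scores for six learners.
views = [12, 30, 45, 45, 60, 80]
scores = [55, 62, 70, 68, 90, 85]
rho = spearman(views, scores)
print(round(rho, 3))
```

A significance value such as those in Table 3 would additionally require a test on the coefficient, which statistical packages report alongside it.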
In association analysis, the data used are mostly discrete values or symbols with specific meanings, whereas the behavioral data exported from the platform are all quantitative attributes and therefore need to be discretized. With the minimum support set to 0.3 and the minimum confidence set to 0.3, the dependency graph of the learning behaviors shown in Figure 9 is obtained.

Dependency plot between each learning behavior
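The discretization step that precedes association analysis can be sketched as follows. The bin edges are assumptions: only intervals such as "50~59" and "≥60" appear in the rule table, so the lower bin and the attribute names used here are hypothetical.

```python
def discretize(value, thresholds, labels):
    """Return the label of the first bin whose upper bound exceeds value."""
    for upper, label in zip(thresholds, labels):
        if value < upper:
            return label
    return labels[-1]  # value is at or above the last threshold


# Hypothetical bin edges; labels has one more entry than thresholds.
THRESHOLDS = [50, 60]
LABELS = ["<50", "50~59", ">=60"]


def to_transaction(counts):
    """Turn one learner's numeric counts into a set of discrete items."""
    return {
        f"{name}:{discretize(v, THRESHOLDS, LABELS)}" for name, v in counts.items()
    }


# Hypothetical learner with 57 submitted assignments and 63 sign-ins.
learner = {"assignments": 57, "sign_in": 63}
items = sorted(to_transaction(learner))
print(items)
```

Each learner then becomes one transaction of discrete items, which is the input form that frequent itemset mining expects.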
Figure 9 shows that “watching course teaching videos” and “checking the teacher’s reply” are associated with “downloading learning resources”, that is, these two behaviors have an effect on resource downloading. “Posting personal opinions” has a direct impact on “chapter test”. The rules related to “submitting course assignments” involve “signing in to the platform”, “checking the teacher’s reply”, “downloading resources” and “watching teaching videos”, indicating that learners tend to submit assignments when they sign in to the online course and, to a certain extent, also complete assignments when they watch the teaching videos. The strong association rules between e-learning behaviors are shown in Table 4.
Strong association rules between online learning behaviors
Antecedent | Consequent | Confidence |
---|---|---|
Viewing course teaching Video number≥20 | Number of assignments=50~59 | 0.943 |
Number of signing in≥50 | Number of assignments≥60 | 0.784 |
Number of signing in≥50 | Number of publish personal views≥60 | 0.861 |
Number of publish personal views≥60 | Number of signing in≥50 | 0.886 |
As shown in Table 4, learners who watched the course teaching videos 20 or more times have a 94.3% probability of completing 50~59 assignments. Learners who check in to the platform 50 or more times have a 78.4% probability of completing 60 or more assignments and an 86.1% probability of publishing personal views 60 or more times. Conversely, learners who publish personal views 60 or more times have an 88.6% probability of checking in 50 or more times. This suggests that if teachers purposefully assign more homework and ask learners to publish their learning experiences and summaries during online lectures, learners are likely to access the online course and watch the course teaching videos more often, gradually develop the habit of independent online learning, and improve their learning efficiency and results.
This study utilizes multiple learner behavior mining algorithms to generate learning behavior sequences, as well as visual analysis of learning paths based on behavioral analysis models. It aims to explore the behavioral features and optimization strategies of learners on online platforms.
In this paper, the nine recorded behavioral features are combined into 76 effective behavioral sequences. Based on their behavioral characteristics, learners are classified as excellent, positive, average, procrastinating or negative. Learners in different groups tend to watch videos in the early stage; the variety of behavioral sequences increases significantly in the middle stage; and more discussion behaviors occur in the later stage. The correlations of video viewing, homework submission and chapter tests with academic performance pass the significance test at the 0.001 level, with correlation coefficients between 0.561 and 0.689. Learning behaviors also depend on one another: for example, the rule linking “watching course videos 20 or more times” to “submitting assignments 50-59 times” has a confidence of 0.943.