With the popularisation of education, the number of college students is increasing day by day, and so is the number of students with psychological health problems. Whether students’ psychological abnormalities can be detected in time is one of the main problems currently faced by colleges and universities. Using digital technology to mine, collect and analyse the data generated by psychological health education in colleges makes it possible to track the dynamic development of students’ psychological health problems. Therefore, in this paper, the psychological health problems of college students are identified and classified by establishing an improved logistic regression model. Behavioural characteristics are quantified from students’ relationships with their classmates, regularity of life and economic conditions, and their differences are combined. The test results show that the model performs well: it can identify college students’ psychological health problems and help educators intervene in and treat students’ psychological problems.
Keywords
- logistic regression model
- psychological health problems
- data mining
With the rapid development of the knowledge economy and the popularisation of higher education, the number of college students is increasing day by day, and so is the number of students with psychological problems. At present, colleges mainly investigate students’ psychological health problems through paper or online questionnaires. For the convenience of later tracking, students are generally required to leave detailed personal information. However, out of concern for personal privacy, many students worry that they will be treated differently or labelled, so their answers do not objectively reflect their true psychological status, and much of the feedback contains false information [1,2,3]. In psychological research, big data technology has had a profound impact on research logic, research methods and research tools. In traditional research on psychological health education, the difficulty of collecting and analysing data has limited the in-depth development of related research to a certain extent [4]. The arrival of big data technology has broadened thinking, renewed the research platform and eased the problem of difficult data collection. Therefore, it is urgent to mine, collect and utilise the data generated by psychological health education in colleges, and to combine big data technology with psychological health education in colleges and universities.
At present, researchers have made in-depth analyses of the sources, causes and countermeasures of college students’ psychological health problems, which is of vital significance for detecting abnormal psychological problems in time and for providing scientific theoretical support for psychological health education [5,6,7]. At the same time, colleges and universities are trying to appoint full-time psychology teachers to teach psychological health courses and provide psychological consultation. Although colleges attach great importance to psychological health education, the lack of professionals means that digital technology cannot be used effectively, and thus psychological health education cannot be implemented efficiently [8, 9]. In addition, students’ psychological health develops dynamically. However, most colleges suffer from problems such as insufficient attention to evaluation, insufficient teaching staff and an imperfect evaluation system [10], so it is difficult to capture the dynamic psychological development and changes of students during their studies. With the help of digital technology, and provided that the indicators remain scientifically sound, stable and sensitive, establishing a dynamic identification and evaluation system for students’ psychological problems can effectively promote the development of college students’ psychological health.
The psychological health identification model analyses and summarises large amounts of student data to extract deeper information and features. Its aim is to identify, from students’ behavioural and other data in college, whether they show a tendency towards psychological abnormality, thereby providing guidance for college teachers and psychological education staff.
Data mining is a complicated process. Through a series of calculations on a large amount of data, it finds the potential and hidden internal relations in the data and extracts valuable information. The specific process is shown in Figure 1:
- Selection and preparation of data set. According to the actual demand, select the initial data set and collect as much relevant data as possible. The more relevant the data, the higher the accuracy, but the amount of calculation will also increase [11].
- Data preprocessing. The collected data may contain some isolated points and discrete points, so it is necessary to preprocess the data to make it useful. Methods of data preprocessing include data cleaning, transformation, integration and reduction [12].
- Data mining. According to the data characteristics after preprocessing and the actual needs of users, select appropriate methods and models, find the intrinsic relationships in the data and extract hidden valuable information.
- Analysis and application of results. Analyse the obtained results in connection with the actual situation; if they are not applicable to the actual situation, the above steps need to be repeated.
Fig. 1
Data mining process

Logistic regression is a kind of data mining technology. For a given data set, it obtains a nonlinear model by transforming a linear model, so that the predicted output is as close as possible to the actual value [13, 14]. In practice, it is a classification model. Compared with other algorithms, logistic regression has a simple form, strong interpretability and fast training speed; its main idea is to fit the decision boundary as closely as possible and output the predicted value. On the basis of linear regression, that is, for multivariate input
Sigmoid function is usually selected as the mapping function, and its expression is:
Introduce Formula (1) so that:
The value of
It can be seen that through sigmoid the function can define the domain as
Regarding
Due to the serious imbalance between normal and abnormal samples, classifying the data without any further processing would distort the results of the model [15]. Therefore, before binary classification, it is necessary to balance the data and to combine the features according to their differences.
The samples of normal students are marked as negative examples and coded as 0, and the samples of abnormal students are marked as positive examples and coded as 1. The features selected in this experiment have different units and a large numerical span. To keep these differences from influencing the classification model, the data are normalised before training: every feature is mapped to [0, 1] by a linear transformation.
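The text does not state how the data are balanced. As a minimal sketch, assuming random undersampling of the normal (negative) samples down to a chosen number of negatives per positive, the step could look as follows. The function name `undersample`, its parameters and the toy data are illustrative assumptions, not the authors' procedure.

```python
import random

def undersample(samples, labels, neg_per_pos=2, seed=0):
    """Randomly keep at most `neg_per_pos` negatives (label 0)
    for every positive (label 1). Assumed balancing strategy."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(neg), neg_per_pos * len(pos)))
    keep = sorted(pos + keep_neg)
    return [samples[i] for i in keep], [labels[i] for i in keep]

# Toy data with 10 negatives for every positive (made-up values)
labels = [1] * 10 + [0] * 100
samples = [[float(i)] for i in range(len(labels))]
bal_x, bal_y = undersample(samples, labels, neg_per_pos=2)
```

Changing `neg_per_pos` allows different positive–negative ratios to be compared on the same data.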
For sample
According to this formula, the range of sample data can be compressed to [0,1], so as to eliminate the differences among different feature samples and avoid the influence of numerical dimension among feature samples.
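A linear mapping of each feature to [0, 1] is commonly done with min-max scaling. The sketch below assumes that this is the transformation intended; the function names and the sample values are hypothetical.

```python
def min_max_scale(column):
    """Map one feature column linearly onto [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant feature carries no information; map it to 0
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def scale_features(rows):
    """Scale every feature (column) of a row-major table independently."""
    columns = list(zip(*rows))
    scaled = [min_max_scale(list(c)) for c in columns]
    return [list(r) for r in zip(*scaled)]

# Two made-up features on very different scales
rows = [[120.0, 3.0], [480.0, 1.0], [300.0, 2.0]]
scaled_rows = scale_features(rows)
```

Each column is scaled with its own minimum and maximum, so features measured in different units end up on the same range.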
The logistic regression model has an important influence on the results of sample classification, so the parameters of the logistic regression model are optimised. According to the transformation of sigmoid function, the assumed function is:
Among them,
To prevent the model from over-fitting, a regularisation term is added to the loss function and is represented as:
Introducing features into the model:
In this paper, the model updated with the above parameters is used to predict students’ psychological problems, in which
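The model described in this section can be illustrated with a small sketch of logistic regression trained by batch gradient descent. It assumes a cross-entropy loss with an L2 regularisation term; the text does not specify the penalty, the optimiser or the hyperparameters, so `lr`, `lam`, `epochs` and the toy data are all assumptions.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function, mapping z to (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(theta, x):
    """theta[0] is the intercept; theta[1:] weights the features in x."""
    z = theta[0] + sum(t * v for t, v in zip(theta[1:], x))
    return sigmoid(z)

def train(X, y, lr=0.5, lam=0.01, epochs=2000):
    """Batch gradient descent on the L2-regularised cross-entropy loss."""
    n, d = len(X), len(X[0])
    theta = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = predict_proba(theta, xi) - yi
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        for j in range(d + 1):
            # Assumed choice: penalise the weights but not the intercept
            reg = lam * theta[j] if j > 0 else 0.0
            theta[j] -= lr * (grad[j] / n + reg)
    return theta

# Toy, already normalised features; label 1 marks an abnormal sample
X = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.3], [0.9, 0.1]]
y = [0, 0, 1, 1]
theta = train(X, y)
preds = [1 if predict_proba(theta, x) >= 0.5 else 0 for x in X]
```

A sample is assigned to the positive class when its predicted probability is at least 0.5; that threshold is also an assumption and could be lowered when recall matters more than precision.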
In the mining of college education data, experts and scholars often pay attention to the characteristics related to students’ achievements and ignore students’ psychological activities [16]. Therefore, based on the five-factor model [17,18], this paper extracts and quantifies the behavioural characteristics related to students’ psychological health problems from a large amount of original data. By referring to psychology, pedagogy and other related knowledge, indicators are established to extract students’ characteristics, and a relatively complete set of behavioural characteristics is constructed to measure students’ psychological health problems. Based on the responsibility and extraversion traits of the five-factor model, the behaviour of students is quantified from three perspectives: student–classmate relationship, regularity and economic situation.
Most students in colleges and universities study in a closed or semi-closed environment, so their psychological health problems can be reflected in their relationships with classmates. The number of times two students co-occur is directly proportional to the probability that they become friends; therefore, association rules can be used to determine the classmate relationship.
In order to obtain students’ friends at college, it is necessary to calculate the co-occurrence data set between pairs of students, which is represented as:
In order to eliminate contingency, if the co-occurrence times of student A and student B are greater than T, the two students are considered as friends. Considering that the threshold value of each location is related to the total number of times, it is defined as:
Among them, for students
Form to explore the relationship, that is
where
Confidence is defined as:
Among them
Where
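The co-occurrence counting and the confidence measure can be sketched as below. The record format `(student, location, timestamp)`, the 60-second window and a single global threshold are assumptions made for illustration; the text relates the threshold to the total count at each location, and that definition is not reproduced here. Confidence follows the usual association-rule form, the share of one student's records that co-occur with another student.

```python
from collections import Counter
from itertools import combinations

def co_occurrence(events, window=60):
    """events: (student, location, timestamp_seconds) tuples.
    Two students co-occur when both appear at the same location
    within `window` seconds (assumed definition)."""
    pair_counts = Counter()
    single_counts = Counter(s for s, _, _ in events)
    by_location = {}
    for student, loc, ts in events:
        by_location.setdefault(loc, []).append((ts, student))
    for records in by_location.values():
        records.sort()
        for (t1, s1), (t2, s2) in combinations(records, 2):
            if s1 != s2 and abs(t2 - t1) <= window:
                pair_counts[frozenset((s1, s2))] += 1
    return pair_counts, single_counts

def confidence(pair_counts, single_counts, a, b):
    """conf(a -> b): co-occurrences of a and b over a's record count.
    Simplification: one record may pair with several others."""
    return pair_counts[frozenset((a, b))] / single_counts[a]

def friends(pair_counts, threshold):
    """Pairs whose co-occurrence count exceeds the threshold T."""
    return {tuple(sorted(p)) for p, c in pair_counts.items() if c > threshold}

# Made-up card-swipe records
events = [
    ("A", "canteen", 0), ("B", "canteen", 30),
    ("A", "canteen", 3600), ("B", "canteen", 3620),
    ("C", "canteen", 3900),
    ("A", "library", 100), ("C", "library", 500),
]
pair_counts, single_counts = co_occurrence(events)
```

With these records, students A and B co-occur twice in the canteen, so `friends(pair_counts, 1)` keeps only that pair.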
The regularity of students’ behaviour reflects their self-discipline and orderliness in college. Students with strong regularity have better self-restraint, and the regularity of students’ behaviour is closely related to their academic performance [19,20,21]. Therefore, it can be considered that students with strong behavioural regularity are able to plan their own lives. In this paper, the regularity of students’ campus behaviours such as eating and bathing is quantified, and whether students with abnormal psychology differ significantly from normal students in this regularity is explored. Shannon entropy is used as the measure, and its definition is as follows:
Here
where
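As a minimal sketch of the entropy measure, assume each behaviour (for example, eating) is recorded as the hour of day in which it occurs, and that entropy is computed over the empirical distribution of those hours. The base-2 logarithm and the hourly binning are assumptions; the text does not state them.

```python
import math
from collections import Counter

def shannon_entropy(time_slots):
    """Entropy (in bits) of the empirical distribution of time slots.
    Lower entropy means more regular behaviour."""
    counts = Counter(time_slots)
    total = len(time_slots)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Made-up lunch hours for two students
regular = [12, 12, 12, 12]      # always in the same hour
irregular = [11, 12, 13, 14]    # a different hour every day
h_regular = shannon_entropy(regular)
h_irregular = shannon_entropy(irregular)
```

A student who always eats in the same hour has zero entropy, while one whose meals are spread evenly over four different hours reaches 2 bits.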
Studies [23] have shown that the psychological status of college students is influenced by the family economic situation: many students from poor families suffer from long-term feelings of inferiority and depression, and they tend to be unwilling to communicate with others. However, owing to practical limitations, it is impossible to know the economic status of students’ families directly. In order to explore whether psychologically normal and abnormal students differ significantly in their economic behaviour at college, this paper measures students’ financial situation from their financial aid and their consumption at college.
Given the time and amount of each of the students’ purchases, the annual consumption data of 2,000 students (436,044 items in total) were first processed and analysed. Statistics of the average consumption of the two kinds of samples over one semester and over 1 year showed no obvious difference in data distribution. A Wilcoxon hypothesis test is then applied. The null hypothesis is that there is no significant difference in consumption level between normal students and psychologically abnormal students. The alternative hypothesis is that there is a significant difference in consumption level between normal students and students with psychological disorders. It is found that at the 0.05 significance level, the consumption level of the two groups is
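The text does not say which Wilcoxon variant is used. Since the two groups of students are independent, the sketch below assumes the two-sided rank-sum test with a normal approximation and no tie or continuity correction. The consumption values are made up and do not reproduce the study's data.

```python
import math

def rank_sum_test(a, b):
    """Two-sided Wilcoxon rank-sum test (normal approximation).
    Returns (z, p_value)."""
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank shared by tied values
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    n1, n2 = len(a), len(b)
    w = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p

# Identical samples: no evidence against the null hypothesis
z_same, p_same = rank_sum_test([1, 2, 3, 4], [1, 2, 3, 4])
# Clearly separated samples: the null hypothesis is rejected at 0.05
z_diff, p_diff = rank_sum_test([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

The null hypothesis is rejected when the returned p-value falls below 0.05.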
According to the above characteristics, the quantitatively extracted sample data is introduced into the improved logistic regression model obtained before, which can effectively reflect the psychological health problems of college students. The evaluation index of this model is defined as:
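Accuracy, that is, the proportion of correctly predicted samples to all samples. In the confusion matrix, $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$.
Precision, which refers to the proportion of correctly predicted positive samples ($TP$) among all samples predicted as positive: $\text{Precision} = \frac{TP}{TP + FP}$.
Recall represents the proportion of correctly predicted positive samples ($TP$) among all samples that are actually positive: $\text{Recall} = \frac{TP}{TP + FN}$.
In this model, for a positive sample in the original data tag, if it is predicted as a positive example, the prediction result of this sample is a True Positive example, or $TP$ in short. On the contrary, if it is predicted as a negative example, the result is a False Negative example ($FN$). Similarly, a False Positive example ($FP$) is a negative sample predicted as positive, and a True Negative example ($TN$) is a negative sample predicted as negative.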
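The three evaluation indexes follow directly from the four confusion-matrix counts. A minimal sketch, with made-up labels in which 1 marks an abnormal (positive) sample and 0 a normal (negative) one:

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 is positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def evaluate(y_true, y_pred):
    """Accuracy, precision and recall; a ratio with a zero denominator is 0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
accuracy, precision, recall = evaluate(y_true, y_pred)
```

Here one abnormal student is missed and one normal student is flagged, giving an accuracy of 0.75 and a precision and recall of 2/3 each.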
The original sample ratio of this model is about 10:1. As shown in Figures 2–4, if the original samples are used directly to train the model, the logistic regression performs poorly, but as the positive and negative proportions approach each other, the accuracy, precision and recall improve. When the positive–negative ratio of the sample is 1:2, the model regression effect is the best, with the accuracy, precision and recall reaching 76.2%, 83.4% and 86.6%, respectively. At the same time, when the positive–negative ratio of the sample is less than 1:2, the regression effect of the model is somewhat reduced.
Fig. 2
Comparison of accuracy

Fig. 3
Comparison of precision

Fig. 4
Comparison of recall

In addition, it is necessary to identify students with psychological disorders in the shortest possible time. For this reason, data sets covering one semester, 1 year and 3 years are selected for comparison: the second semester of sophomore year (one semester), sophomore year (1 year) and the college stage (freshman to junior year; there is no data for senior students, because senior students in our college have few examined classes and most of them are away on internships). The features and model described previously are used to classify these data.
The recall rate indicates the proportion of samples correctly predicted as positive among all samples that are actually positive. The purpose of the experiment in this paper is to identify students with abnormal psychology among all students, that is, to identify as many positive examples as possible, so the recall rate is an important evaluation index for this model. The higher the recall rate, the more students with abnormal psychology are identified by the model. The results show that the improved logistic regression model performs well when all 3 years of data are used. In addition, the recall rate obtained by using 1 year's data is slightly lower than that obtained by using all the data. Combined with the classification accuracy, it can be considered that at least 1 year of student behaviour data is needed to identify students with psychological disorders, and the more data used, the better the robustness of the logistic regression model.
Research on the dynamic evaluation of college students’ psychological health based on digital technology can provide new research methods and theoretical practice for subsequent psychological health education in colleges and universities. In this paper, based on the logistic regression model in data mining and the responsibility and extraversion traits of personality theory, student–classmate relationship, regularity of life and economic condition are quantified. Then, through the differential combination of data about students’ behaviour characteristics, an improved logistic regression model is constructed to classify and predict students’ psychological health problems. Finally, accuracy, precision and recall are used as the evaluation indexes of the model. The results show that when the positive–negative ratio of the sample is 1:2, the regression effect of the model is best, with the accuracy, precision and recall reaching 76.2%, 83.4% and 86.6%, respectively. Meanwhile, more than 1 year of student behaviour data is more conducive to identifying students’ psychological abnormalities, and the more data adopted, the better the robustness of the logistic regression model.