Exploring the use of big data technology in mental health monitoring and intervention

With intensified social competition, increased occupational pressure, and changes in the living environment, people face multiple psychological pressures, leading to an upward trend in the incidence of psychological problems such as depression, anxiety, and obsessive-compulsive disorder [1-3]. Effective mental health monitoring and intervention can help people understand their mental health status more comprehensively, including their emotions, stress, and interpersonal relationships, and it provides effective ways to help them seek mental health help and support [4-7].

Mental health problems are often complex and multifaceted, and mental health professionals have long struggled to diagnose and treat them quickly and accurately. With the continuous development of technology, big data has gradually become an important part of today’s society [8-11], and its application in mental health has become a research hotspot. With its greater data processing capacity and stronger analytical ability, big data can greatly improve the efficiency of diagnosis and treatment in the field of mental health [12-15]. Through big data analysis, aspects of people’s behavior, emotions, and thinking can be monitored and analyzed in real time. For example, a large amount of psychological data can be collected from people’s search history, social media interactions, and phone records [16-19]. These data reflect people’s interests, preferences, and mood swings, and analyzing them when symptoms arise helps professionals diagnose and treat quickly and accurately [20-23]. In addition, big data can be used for epidemiological research in mental health. By collecting and analyzing large amounts of data, mental health disorders can be studied and response strategies determined, which enriches research in the field and makes treatment more precise [24-27].

Literature [28] suggests that increased access to and use of digital technologies can effectively facilitate research and interventions in mental health, demonstrating that digital technologies have changed the way mental disorders are detected, treated, and prevented, advancing measurable improvements in mental health outcomes. Literature [29] examines Aml in conjunction with AI for mental health tracking and intervention, describes the basic concepts of Aml and its application to mental health monitoring and proactive intervention, and emphasizes the need to interpret AI models to ensure ethical use and fairness in AI diagnosis. Literature [30] discusses the current status of e-mental health systems for the prevention and treatment of major depressive disorder, proposes e-mental health software for this purpose, describes the construction of the software system, its functionality, and other aspects, and reveals its technical limitations. The above studies examine the application of information technology in mental health and its positive impact, and also lay the foundation for the application of big data.

Mental health is a social issue that deserves great attention. Using big data technology to strengthen the mental health service system with effective prevention and intervention measures can help more people maintain good mental health and improve their quality of life. Literature [31] analyzes a big-data-based method for monitoring and intervening in the mental health of the elderly and proposes personalized interventions, showing that the method is effective: it can identify the social activities, emotional state, and other aspects of the elderly, providing a basis for the prediction of and intervention in mental health problems. Literature [32] incorporates a fuzzy augmented predictive neural system into the assessment of and intervention in college students’ mental health; the system can handle the complexity of mental health data in the context of big data analytics, effectively identifies college students’ mental health problems, and performs well in assessment and intervention. Literature [33] considers the application of big data technology to the early warning of and intervention in college students’ psychological crises, tracking students’ mental health status dynamically through effective early warning and intervention strategies in order to carry out mental health education effectively. Literature [34] points out the inadequacy of current mental health intervention platforms and systems in colleges and universities and proposes applying big data technology to mental health, which enables in-depth mining of student data and effectively improves the efficiency and effect of psychological intervention in colleges and universities.
Literature [35] analyzes the application of big data in the field of mental health, exploring the accuracy, diversity, and security of the collected data, and indicates that the application of big data to mental health disorders is currently immature and still requires a great deal of effort.

The frequent occurrence of psychological crisis events exposes the inadequacy of the existing monitoring system, while traditional methods that rely on manual interviews or scale measurements have many shortcomings, such as poor timeliness. In this paper, we use big data technology to study new methods of mental health analysis. Based on information entropy and D-S evidence theory, we construct a combined keyword extraction model for Chinese psychological text, breaking through the granularity limitation of traditional word segmentation tools. We design an automated sentiment lexicon expansion method to improve the coverage of sentiment words in the psychological text corpus. Through empirical analysis and comparative experiments, we verify the practical value of the model and the lexicon in monitoring and intervening in college students’ mental health problems, and assess the performance of the model.

2

Combined keyword extraction and sentiment dictionary construction

In this section, a dual-path model integrating Chinese combined keyword extraction and dynamic expansion of the sentiment lexicon is constructed to address the limitations of mental health text analysis. The combined keyword extraction model quantifies intra-word cohesion through average mutual information, evaluates boundary independence using left and right neighbor entropy, and uses a differential correlation matrix to explore deep semantic associations between words, breaking through the granularity limitation of traditional word segmentation. The automated expansion of the sentiment lexicon uses the TF-IDF algorithm to screen seed words and the SO-PMI algorithm to calculate the difference in pointwise mutual information between a word and the positive and negative sentiment words, dynamically expanding the sentiment lexicon for the psychological domain and addressing the limited sentiment coverage of traditional lexicons. Together, the two improve the recognition accuracy of professional terms, enhance the coverage of sentiment analysis, and provide technical support for the subsequent quantitative analysis of college students’ mental health problems and the generation of intervention strategies.

2.1

Combined keyword extraction model based on information entropy and D-S theory

Current research on mainstream keyword extraction algorithms focuses mainly on English, where word segmentation is relatively easy because words are delimited by spaces. Chinese words are not separated by spaces, and the relationships between words are complex and varied. Existing word segmentation tools often segment too finely, whereas Chinese keywords are often combinations of two or more words and usually include technical terms. For these reasons, the accuracy of Chinese keyword extraction is low. In this paper, we use mutual information to measure the degree of dependence between two consecutive words, and left and right information entropy to measure the uncertainty of the boundary between consecutive words, and on this basis propose a combined keyword extraction model based on information entropy and D-S theory. Figure 1 shows the structure of the combined keyword extraction model. The model extracts new words by calculating the left-right information entropy and mutual information between two neighboring words, which addresses the overly fine granularity of current word segmentation tools, and screening patterns can be added to filter the new words. The deep structure between words is mined through the differential correlation matrix, and the original word set is expanded with the new words. A score function is then constructed to derive the candidate keyword set of a given document. Finally, D-S evidence combination is used to resolve the problem of combined and spliced words coexisting in the candidate keyword set, and to determine the final set of combined keywords.
The model mainly consists of the following three submodules: the mutual information and left-right information entropy layer, the differential correlation matrix layer, and the D-S evidence combination layer.

2.1.1

Mutual information

A word is generally composed of two or more characters, and there is usually some association between these characters or strings. This association can be described by correlation: as the correlation increases, the characters are more likely to form a word. To judge whether a word is formed, we need to judge how tightly the characters or strings combine, which can be measured by the degree of cohesion within the word. The greater the internal cohesion, the closer the combination between the characters and the greater the likelihood that they form a word.

Mutual information (MI) is a statistic commonly used to express the degree of intra-word cohesion, which is defined as follows: for two discrete random variables X and Y, their degree of correlation is measured by MI(X, Y). (1) $\begin{array}{rcl} MI (X, Y) & = & \sum_{x \in X} \sum_{y \in Y} p (x, y) \log \frac{p (x, y)}{p (x) p (y)} \\ & = & E_{p (x, y)} \left[ \log \frac{p (x, y)}{p (x) p (y)} \right] \end{array}$

where p(x, y) is the joint probability distribution function of random variables X and Y, p(x) and p(y) are the marginal probability distribution functions of X and Y, respectively, and MI(X, Y) is the mutual information of X and Y, i.e., the degree of association of X and Y.

Mutual information MI(X, Y) indicates how much the uncertainty of X decreases when the value of Y is known, i.e., how much information about X is contained in Y. MI(X, Y) ≫ 0 indicates that X and Y are highly correlated, and MI(X, Y) = 0 indicates that X and Y are independent of each other, i.e., they provide no information about each other. The expectation in Equation (1) is never negative; only the pointwise term log(p(x, y)/(p(x)p(y))) for a particular pair (x, y) can fall below 0, in which case x and y co-occur less often than independence would predict and are said to be in “complementary distribution”.

In neologism recognition, mutual information is often used to measure the tightness of the combination between two characters. The larger MI(X, Y) is, the closer the combination between the two characters and the higher the probability that they form a word. The same holds for the binding between strings.

In practice, the length of a candidate word affects its mutual information: the longer the candidate word, the larger the mutual information tends to be. Therefore, to better measure the degree of word cohesion, we introduce average mutual information.

Average Mutual Information (AMI), which is the average result of mutual information MI(X, Y) in two probability spaces X and Y, reflects the information size as a whole, and its formula is defined as follows: (2) $A M I (W) = \frac{M I (W)}{n} = \frac{p (W)}{n} \log (\frac{p (W)}{p (s_{1}) \dots p (s_{n})})$

where W is the candidate word, n is the length of candidate word W, and MI(W) is the amount of mutual information of the candidate word.

In neologism recognition, to prevent the amount of mutual information from being affected by the length of the candidate word, we use the average mutual information to measure the internal cohesion of a candidate. The larger the average mutual information, the higher the internal cohesion of the word, i.e., the more likely it is to form a neologism.
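As an illustration of Equation (2), the following minimal sketch computes the average mutual information of a candidate word from probabilities that are assumed to have been estimated beforehand. The function name `average_mutual_information` and the probability values are hypothetical and are not taken from the paper's data.

```python
import math

def average_mutual_information(p_word, p_parts):
    """AMI(W) = p(W)/n * log(p(W) / (p(s_1) ... p(s_n))), following Eq. (2).

    p_word  -- estimated probability of the whole candidate word W
    p_parts -- estimated probabilities of its n components s_1..s_n
    """
    n = len(p_parts)
    if n == 0 or p_word <= 0 or any(p <= 0 for p in p_parts):
        raise ValueError("probabilities must be positive and parts non-empty")
    independent = math.prod(p_parts)  # probability expected under independence
    return (p_word / n) * math.log(p_word / independent)

# Hypothetical probabilities for two two-part candidates.
# "tight": the parts co-occur far more often than independence predicts.
tight = average_mutual_information(0.01, [0.02, 0.02])
# "loose": the parts co-occur exactly as often as independence predicts.
loose = average_mutual_information(0.0004, [0.02, 0.02])
```

Under these assumed numbers the tightly bound candidate receives a clearly positive score, while the candidate whose parts behave independently scores approximately zero, which matches the intended reading of AMI as a cohesion measure.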

2.1.2

Information entropy and neighborhood entropy

To determine whether two Chinese characters can form a word, in addition to the need to evaluate the degree of aggregation within the word, the diversity of neighboring characters of the word has to be one of the criteria we consider. In the following, we consider the concept of information entropy.

“Entropy” first appeared in thermodynamics and was introduced into information theory in 1948, which made it possible to quantify information. Information entropy has since been formally recognized and gradually applied.

In information theory, information entropy is the average amount of information contained in a message. It reflects the mean uncertainty of a random variable, i.e., the uncertainty that is removed once the outcome of an event is known. Take the familiar coin toss: we know the possible outcomes, heads or tails, but before the toss we do not know which will occur. Once the result is determined, the uncertainty of the event immediately becomes 0, and the amount of uncertainty removed is what information entropy measures.

Information entropy can be used to measure the degree of uncertainty of an event or a random variable. The larger the entropy, the higher the uncertainty, the harder it is to predict the outcome accurately, and the more information is needed to assist the prediction. Conversely, the greater the probability of a random event, the smaller its uncertainty and the smaller the amount of information its occurrence brings, although this amount is always non-negative.

For discrete random variable X, the information entropy is calculated as follows: (3) $H (X) = - \sum_{x} p (x) \log p (x)$

Equation (3) shows that, because entropy is a sum over outcomes, the more values the random variable X can take, the greater the information entropy H(X) tends to be, indicating a greater degree of disorder. For a fixed number of values, H(X) reaches its maximum when the distribution is uniform, which is the most disordered case.
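A small numerical check of Equation (3) may help here. The sketch below is illustrative only: the function name `entropy` and the choice of base 2 (bits) are assumptions, since the paper leaves the base of the logarithm open.

```python
import math

def entropy(probs, base=2.0):
    """Shannon entropy H(X) = -sum p(x) log p(x) of a discrete distribution."""
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be non-negative and sum to 1")
    # Terms with p = 0 contribute nothing, by the usual convention 0 log 0 = 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)

fair_coin = entropy([0.5, 0.5])        # uniform over two outcomes
biased_coin = entropy([0.9, 0.1])      # same outcomes, less uncertainty
uniform_four = entropy([0.25] * 4)     # uniform over four outcomes
```

The fair coin has higher entropy than the biased coin, and the uniform distribution over four values has higher entropy than the one over two, in line with the two observations made about Equation (3).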

Suppose we do not know the formula for information entropy and would like to infer the functional form of H(X) from its properties. To do so, we first set information entropy aside and focus only on the amount of information (also known as self-information). In this paper, we use I(x) to denote the amount of information conveyed when a random event x occurs.

From the above, the amount of information (self-information) should increase monotonically with uncertainty, so it can be expressed as a function of the probability p(x) of the event: (4) $I (x) = f (p (x))$

We would like to derive a formula for the amount of information from its properties. It should have the following properties:

1) Non-negativity: the amount of information is never less than 0;

2) Monotonicity: the amount of information decreases monotonically as the probability of the random event increases;

3) Additivity: for two independent random events X and Y, the amount of information contained in X and Y occurring together should equal the sum of the information contained in X and Y occurring alone.

From Property 3), additivity, it follows that: (5) $I (x, y) = I (x) + I (y)$

And because: (6) $p (x, y) = p (x) p (y)$

So: (7) $f (p (x) p (y)) = f (p (x)) + f (p (y))$

Based on the form of the above equation, it is natural to expect I(x) to contain a logarithm. Let (8) $I (x) = q (x) \log p (x)$

where q(x) is an unknown function. The base of the logarithm is not yet fixed, but the logarithm is monotonically increasing for any base greater than 1, so the base is omitted for now for brevity.

Expanding the above formula, we get (9) $\begin{array}{rcl} I (x, y) & = & q (x y) \log (p (x) p (y)) \\ & = & q (x) \log p (x) + q (y) \log p (y) \end{array}$

For the above equation to hold for any independent random events X and Y, it can only be that (10) $q (x) = q (y) = q (x y)$

So: (11) $q (x) = α$

where α is an arbitrary constant.

From this we get the expression for the amount of information (12) $I (x) = α \log p (x)$

Again, by the non-negativity of the amount of information, and since log p(x) ≤ 0, it follows that α < 0. The magnitude of this coefficient does not materially affect the measure, because rescaling α is equivalent to changing the base of the logarithm, so it may as well be set to -1.

From this we get the expression for self-information (13) $I (x) = - \log p (x)$

Thus, self-information is the negative logarithm of the probability of a random event. Suppose a random variable can take multiple values (random events) and we want to know the amount of information it brings. Since we do not know in advance which value will occur, we can only take the expectation of the self-information over all values, and this expectation is the information entropy. The relationship between the amount of information (self-information) and information entropy is (14) $H (X) = E (I (x_{i}))$
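The model applies this entropy to the neighbors of a candidate word, but the text does not spell out that computation. The sketch below shows one common way to do it, under stated assumptions: the left (right) neighbor entropy is taken to be the entropy of the empirical distribution of tokens immediately before (after) each occurrence of the candidate. The function name `neighbor_entropy` and the English placeholder tokens are hypothetical.

```python
import math
from collections import Counter

def neighbor_entropy(tokens, candidate, side):
    """Entropy of the tokens adjacent to `candidate` on the given side.

    tokens    -- list of segmented tokens
    candidate -- tuple of consecutive tokens forming the candidate word
    side      -- "left" or "right"
    """
    n = len(candidate)
    neighbors = Counter()
    for i in range(len(tokens) - n + 1):
        if tuple(tokens[i:i + n]) == tuple(candidate):
            j = i - 1 if side == "left" else i + n
            if 0 <= j < len(tokens):
                neighbors[tokens[j]] += 1
    total = sum(neighbors.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in neighbors.values())

# Toy token sequence (placeholder for segmented text).
tokens = ["student", "mental", "health", "problem",
          "about", "mental", "health", "course",
          "student", "mental", "health", "problem"]

# "mental" is always followed by "health": its right boundary is not free.
right_of_mental = neighbor_entropy(tokens, ("mental",), "right")
# The pair "mental health" has varied neighbors on both sides.
left_of_pair = neighbor_entropy(tokens, ("mental", "health"), "left")
right_of_pair = neighbor_entropy(tokens, ("mental", "health"), "right")
```

In this toy sequence the single token has zero right entropy while the combined candidate has positive entropy on both sides, which is the kind of evidence the model uses to prefer the combined word.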

2.1.3

Differential correlation matrix layer

Keywords often reflect the central theme of an article and are highly relevant to its content. In this paper, we use the degree of differential correlation between words instead of the similarity between words and the article, since the differential correlation between words can eliminate the semantic redundancy between two words to a certain extent. We use LTP for word embedding to obtain the differential correlation matrix, and extend the matrix with the new words obtained from the mutual information and left-right information entropy layer. The differential correlation matrix layer is divided into three parts as follows:

1) Word differential correlation

Each word and its structural relationships play an important role in document analysis. The stronger the structural relationship between words, the more the words contribute to the document. The correlation between two words is determined by the distribution of their frequencies across paragraphs: the cosine similarity r is computed first, and the differential correlation is then $R = r (1 - r)$ . In this paper, we propose the following formula for calculating the correlation $R (w_{i}, w_{j})$ between the ith word and the jth word: (15) $\begin{array}{rcl} R (w_{i}, w_{j}) & = & \frac{\sum_{p = 1}^{m} (w_{i}^{p} \times w_{j}^{p})}{\sqrt{\sum_{p = 1}^{m} {(w_{i}^{p})}^{2}} \times \sqrt{\sum_{p = 1}^{m} {(w_{j}^{p})}^{2}}} \\ & & \times \left( 1 - \frac{\sum_{p = 1}^{m} (w_{i}^{p} \times w_{j}^{p})}{\sqrt{\sum_{p = 1}^{m} {(w_{i}^{p})}^{2}} \times \sqrt{\sum_{p = 1}^{m} {(w_{j}^{p})}^{2}}} \right) \end{array}$

Where m is the number of paragraphs in the document, $w_{i}^{p}$ and $w_{j}^{p}$ indicate the frequency of the ith and jth words respectively in the pth paragraph. The higher the value of $R (w_{i}, w_{j})$ , the stronger the relationship between the ith and jth words, and on the contrary, the lower the value of $R (w_{i}, w_{j})$ , the relatively weaker relationship between them.

Thus, the word differential correlation matrix measures the coupling between words throughout the document. Cosine similarity is a measure of similarity between two non-zero vectors in an inner product space; this paper uses $R = r (1 - r)$ instead of r alone in order to take into account the complementarity between two words and to mitigate the redundancy caused by absolute semantic similarity.

2) Combining the differential correlation matrix of new words

After obtaining the original differential correlation matrix R and the new word set n_w, this paper extends the original differential correlation matrix to obtain the new differential correlation matrix R₁. The differential correlation formula proposed for a new word is as follows: (16) $R (w_{k}, w_{j}) = \frac{l_{1}}{l_{1} + l_{2}} \times R (w1_{k}, w_{j}) + \frac{l_{2}}{l_{1} + l_{2}} \times R (w2_{k}, w_{j})$

Here l₁ is the length of the first component word w1_k of the new word w_k, and l₂ is the length of its second component word w2_k. Figure 2 shows how the differential correlation matrix R is transformed into R₁.

3) Selecting the candidate keyword set

A score function is constructed from the new differential correlation matrix R₁ and the differential correlation values between w_i and the other words, with a weight adjustment parameter. The score function in this paper is as follows: (17) $s_{w_{i}} = λ \frac{\sum_{j = 1}^{k} R (w_{i}, w_{j})}{\sum_{h = 1}^{k} \sum_{j = 1}^{k} R (w_{j}, w_{h})} + (1 - λ) \frac{b (w_{i})}{\sum_{j = 1}^{k} b (w_{j})}$

where $b (w_{i})$ is the total number of words related to w_i and λ is the weight adjustment parameter. The words are sorted by score from high to low, and the first Top_n words are taken as the candidate keyword set.
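The three parts of this layer can be sketched together as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the paragraph-frequency vectors are made up, the helper names are hypothetical, and "related to w_i" in b(w_i) is assumed to mean that the differential correlation exceeds a small threshold (`related_threshold`), which the text does not specify.

```python
import math

def cosine(u, v):
    """Cosine similarity r between two paragraph-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def differential_correlation(u, v):
    """R = r * (1 - r), as in Eq. (15)."""
    r = cosine(u, v)
    return r * (1.0 - r)

def combined_word_correlation(r_first, r_second, len_first, len_second):
    """Length-weighted correlation of a new combined word, as in Eq. (16)."""
    total = len_first + len_second
    return (len_first / total) * r_first + (len_second / total) * r_second

def keyword_scores(freqs, lam=0.5, related_threshold=0.05):
    """Score of Eq. (17); b(w) counts words whose R with w exceeds the
    (assumed) threshold."""
    words = list(freqs)
    R = {a: {b: differential_correlation(freqs[a], freqs[b])
             for b in words if b != a} for a in words}
    row_sum = {w: sum(R[w].values()) for w in words}
    related = {w: sum(1 for v in R[w].values() if v > related_threshold)
               for w in words}
    total_r = sum(row_sum.values()) or 1.0   # guard against division by zero
    total_b = sum(related.values()) or 1.0
    return {w: lam * row_sum[w] / total_r + (1 - lam) * related[w] / total_b
            for w in words}

# Hypothetical frequencies of three words in m = 3 paragraphs.
freqs = {"emotion": [3, 1, 0], "stress": [2, 1, 1], "exam": [0, 0, 4]}
scores = keyword_scores(freqs)
top_n = sorted(scores, key=scores.get, reverse=True)[:2]
merged = combined_word_correlation(0.2, 0.1, 2, 2)
```

With these made-up vectors, "stress" ranks first because it is partially, but not fully, similar to both other words; this reflects how r(1 − r) rewards complementary rather than near-duplicate terms.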

2.1.4

Feature fusion based on D-S evidence theory

The label path features in the label path feature family are independent of each other: for a label path p, whether the label path feature TPL identifies p as a content path does not interfere with whether any other label path feature does so. This satisfies the requirement of D-S evidence theory that pieces of evidence be mutually independent.

The frame of discernment $U = \{\{p_{1}\}, \{p_{2}\}, \dots, \{p_{n}\}\}$ is selected, where n denotes the number of label path types and $\{p_{i}\}$ $(1 \leq i \leq n)$ denotes a set of label paths. Since label path features are computed per label path, each set contains exactly one label path. The key problem in label path feature fusion is how to determine the basic probability assignment (mass) function for each label path feature. This paper adopts simple probability statistics. Taking the label path length feature as an example, its basic probability assignment function m₁ is calculated as follows: (18) $\left\{ \begin{array}{l} m_{1} (\emptyset) = 0 \\ m_{1} (\{p_{i}\}) = \frac{T P L (p_{i})}{\sum_{1 \leq j \leq n} T P L (p_{j}) + e} \quad (1 \leq i \leq n) \\ m_{1} (U) = \frac{e}{\sum_{1 \leq j \leq n} T P L (p_{j}) + e} \end{array} \right.$
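To make the fusion step concrete, the sketch below builds mass functions in the form of Equation (18) and combines two of them with Dempster's standard rule of combination, which normalizes the products of masses over non-conflicting intersections. The text does not state its combination formula explicitly, so the use of Dempster's rule here is an assumption, as are the feature values, the value of e, and the function names.

```python
from itertools import product

def mass_from_scores(scores, e=1.0):
    """Basic probability assignment in the form of Eq. (18).

    scores -- feature value per hypothesis, e.g. {"p1": TPL(p1), ...}
    e      -- mass reserved for the whole frame U (uncertainty)
    """
    if len(scores) < 2:
        raise ValueError("the frame needs at least two hypotheses")
    total = sum(scores.values()) + e
    frame = frozenset(scores)
    mass = {frozenset([k]): v / total for k, v in scores.items()}
    mass[frame] = e / total
    return mass

def dempster_combine(m1, m2):
    """Dempster's rule: sum products over intersections, renormalize by 1 - K."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb   # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: evidence cannot be combined")
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Two hypothetical, independent features scoring the same two hypotheses.
m1 = mass_from_scores({"p1": 3.0, "p2": 1.0}, e=1.0)
m2 = mass_from_scores({"p1": 2.0, "p2": 2.0}, e=1.0)
fused = dempster_combine(m1, m2)
```

In this toy case the fused masses again sum to 1, and the mass left on the whole frame U shrinks after combination, so agreement between independent features reduces the remaining uncertainty.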

2.2

A method for constructing an affective lexicon based on a corpus of psychological texts

In the field of sentiment analysis, selecting appropriate seed words is a critical step. These seed words clearly reflect the sentiment tendency of the text and provide an important foundation for model training and subsequent sentiment analysis. However, manual screening of seed words is both time-consuming and labor-intensive. Therefore, this paper adopts a more efficient strategy: automatic screening of seed words using the TF-IDF algorithm. Figure 3 shows the construction method of the SO-PMI extended dictionary. This method can effectively identify keywords with high weight in sentiment expression, thus improving the accuracy and efficiency of sentiment analysis. The seed sentiment words in the text are first screened using TF-IDF, then new sentiment words in the Twitter text are screened using SO-PMI and added to the dictionary to complete the expansion of the sentiment dictionary.

TF-IDF, i.e., Term Frequency-Inverse Document Frequency, is a commonly used text mining technique for assessing the importance of a word in a document or corpus. The TF-IDF value is obtained by multiplying TF (term frequency) by IDF (inverse document frequency). Because both the frequency of the word in the document and its specificity in the corpus are taken into account, the importance of a word for sentiment analysis can be assessed more accurately. With this approach, seed words that are important for sentiment analysis can be filtered automatically, which greatly improves the efficiency and accuracy of the screening. (19) $\mathrm{TF\text{-}IDF} = \mathrm{TF} \times \mathrm{IDF}$
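Equation (19) leaves the exact definitions of TF and IDF open. The sketch below assumes one common variant: TF is the relative frequency of a term within a document and IDF is log(N / df), where N is the number of documents and df is the number of documents containing the term. The toy documents and the function name `tf_idf` are hypothetical.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs -- list of token lists
    TF   -- count of the term in the document / document length
    IDF  -- log(N / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
                        for term, count in counts.items()})
    return weights

# Toy corpus: "exam" appears everywhere, so it carries no weight.
docs = [["sad", "exam", "sad"],
        ["happy", "exam"],
        ["anxious", "exam", "sleep"]]
weights = tf_idf(docs)
seed_candidates = sorted(weights[0], key=weights[0].get, reverse=True)
```

In the first toy document, "sad" outranks "exam" because "exam" occurs in every document and so receives an IDF of zero; this is the property that makes TF-IDF usable for screening seed words.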

After obtaining the seed words, an extended sentiment lexicon is created using the SO-PMI algorithm. The core of SO-PMI is pointwise mutual information (PMI), a widely used technique in textual sentiment analysis for assessing the semantic association between words. The basic idea of PMI is to infer the correlation between two words from the probability that they occur together in a text. The higher the probability that two words word1 and word2 co-occur, relative to what their individual frequencies would predict, the more strongly they are associated and the more semantically similar they are taken to be. The PMI value is calculated as shown in Equation (20). (20) $\mathrm{PMI} (\mathit{word1}, \mathit{word2}) = \log_{2} \left( \frac{P (\mathit{word1} \,\&\, \mathit{word2})}{P (\mathit{word1}) P (\mathit{word2})} \right)$

$P (\mathit{word1} \,\&\, \mathit{word2})$ is the probability of the two words word1 and word2 appearing together, and $P (\mathit{word1})$ and $P (\mathit{word2})$ are the probabilities of each word appearing on its own. The greater the probability of the two words co-occurring within a small window of the dataset, the stronger the correlation between them; conversely, the smaller that probability, the weaker the correlation.

According to Equation (20), the pointwise mutual information can be calculated between words of unknown sentiment and words of known sentiment. The difference between a word's mutual information with the positive sentiment words and with the negative sentiment words then gives its sentiment orientation (SO), which indicates whether the word is positive or negative. The SO-PMI formula is shown in Equation (21). (21) $\begin{array}{rcl} \mathrm{SO\text{-}PMI} (\mathit{word1}) & = & \sum_{\mathit{Pword} \in \mathit{Pwords}} \mathrm{PMI} (\mathit{word1}, \mathit{Pword}) \\ & & - \sum_{\mathit{Nword} \in \mathit{Nwords}} \mathrm{PMI} (\mathit{word1}, \mathit{Nword}) \end{array}$

where Pword denotes a positive seed word in the set Pwords and Nword denotes a negative seed word in the set Nwords. The PMI values between the candidate sentiment word word1 and the two groups of seed words are calculated separately, and the difference between the two sums gives the sentiment orientation value used to expand the sentiment lexicon. If the difference exceeds the set threshold, word1 is regarded as a positive sentiment word; if it does not exceed the threshold, it is regarded as a negative sentiment word. In this way, the sentiment orientation of a candidate word can be judged, and its sentiment intensity can be further obtained.
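Equations (20) and (21) can be sketched as follows. Several choices here are assumptions rather than details given in the text: probabilities are estimated as document-level co-occurrence frequencies, a pair that never co-occurs contributes 0 instead of negative infinity, and the decision threshold is taken to be 0. The toy documents, seed words, and function names are hypothetical.

```python
import math

def pmi(word1, word2, docs):
    """PMI of Eq. (20), with probabilities estimated per document (a set of words)."""
    n = len(docs)
    c1 = sum(word1 in d for d in docs)
    c2 = sum(word2 in d for d in docs)
    c12 = sum(word1 in d and word2 in d for d in docs)
    if c1 == 0 or c2 == 0 or c12 == 0:
        return 0.0  # assumed convention: no co-occurrence means no evidence
    return math.log2((c12 / n) / ((c1 / n) * (c2 / n)))

def so_pmi(word, pos_seeds, neg_seeds, docs):
    """SO-PMI of Eq. (21): PMI with positive seeds minus PMI with negative seeds."""
    positive = sum(pmi(word, seed, docs) for seed in pos_seeds)
    negative = sum(pmi(word, seed, docs) for seed in neg_seeds)
    return positive - negative

# Toy corpus: each document is the set of words it contains.
docs = [{"relaxed", "happy"}, {"relaxed", "glad"},
        {"anxious", "sad"}, {"anxious", "afraid", "sad"},
        {"happy", "glad"}, {"sad"}]
pos_seeds = ["happy", "glad"]
neg_seeds = ["sad", "afraid"]

so_relaxed = so_pmi("relaxed", pos_seeds, neg_seeds, docs)
so_anxious = so_pmi("anxious", pos_seeds, neg_seeds, docs)
threshold = 0.0  # assumed threshold
label_relaxed = "positive" if so_relaxed > threshold else "negative"
label_anxious = "positive" if so_anxious > threshold else "negative"
```

On this toy corpus "relaxed" co-occurs only with positive seeds and "anxious" only with negative seeds, so the two words land on opposite sides of the assumed threshold.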

3

Application and analysis of models to mental health

Based on the combined keyword extraction model and dynamic sentiment dictionary constructed in Section 2, this section takes the discussion texts of college students’ mental health courses as the research object and systematically verifies the practical utility of the proposed method in identifying mental health problems and generating intervention strategies, through word segmentation optimization, sentiment annotation, model performance validation, and intervention effect evaluation.

3.1

Segmentation and Word Frequency Organization

A total of 400 freshmen were selected from a university. Over a full-semester mental health course, the students were asked to write down their own confusions around four topics: self-awareness, interpersonal relationships, love and sex, and stress and emotions. A total of 500 valid questions were collected. The AMI algorithm of the model was used to process the discussion text of the course and generate a semantically complete statistical table of high-frequency topic words.

First, the data are preprocessed: the text format is unified and the content is filtered, for example by removing frequently occurring meaningless words and single-character words. The “Segmentation” function of the ROSTCM6 software is used to segment the text. After segmentation, words that are weakly related to the content of the study, such as “what to do” and “what”, are deleted, and the result is saved as a new segmentation file.

Second, word frequency statistics are computed on the filtered segmentation file. Because the resulting table is long and the frequencies decrease steadily, only the first 10 keywords are listed. Table 1 shows the high-frequency words (top 10) for the discussion topics of college students’ mental health courses. As can be seen from Table 1, among the high-frequency words reflecting the mental health issues of concern to college students, the nouns are mainly emotion, love, self, relationship, friend, and others, while the verbs are cognition, influence, interaction, and getting along. According to Table 1, the four keywords with the highest frequency, namely emotion, love, cognition, and self, each appear more than 50 times, and the most frequent word, “emotion”, appears 82 times, which shows that college students are more concerned about emotion, self, and various problems in relationships.

Table 1.

High-frequency words in college students’ mental health course discussion topics (top 10)

Rank	High-frequency word	Frequency	Rank	High-frequency word	Frequency
1	Emotion	82	6	Friend	23
2	Love	68	7	Influence	19
3	Cognition	59	8	Interaction	19
4	Self	53	9	Getting along	19
5	Relationship	41	10	Others	18
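The filtering and counting steps that produce a table of this kind can be sketched as follows. The study itself used ROSTCM6 for segmentation; the snippet only illustrates the post-segmentation steps, and the token list, stop-word set, and function name `top_keywords` are hypothetical.

```python
from collections import Counter

def top_keywords(tokens, stop_words, top_n=10):
    """Drop stop words and single-character tokens, then rank by frequency."""
    kept = [t for t in tokens if len(t) > 1 and t not in stop_words]
    return Counter(kept).most_common(top_n)

# Hypothetical segmented tokens standing in for the course discussion text.
tokens = ["emotion", "what", "love", "emotion", "I",
          "self", "emotion", "love", "what to do"]
stop_words = {"what", "what to do"}
top = top_keywords(tokens, stop_words, top_n=2)
```

Only the highest-ranked entries are kept, mirroring the decision to report the first 10 keywords of a much longer frequency list.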

3.2

Analysis of potential triggers and coping patterns of psychological problems based on high-frequency words

The high-frequency keywords in Section 3.1 indicate the main mental health problems that concern college students. To quantify these problems better, we further examine the sources of students’ psychological problems, the ways in which they handle psychological crises, and the ways in which they seek help. Based on the SO-PMI extended affective dictionary, high-frequency words are labeled with affective polarity and combined with the differential correlation matrix to quantify the semantic association between words, and the potential triggers and coping modes of college students’ psychological problems are mined. Figure 4 shows the words that college students perceive as sources of psychological problems, Figure 5 shows the words describing how they handle psychological crises, and Figure 6 shows the words describing how they seek help. These three word maps show that most students believe that psychological problems such as loss of emotional control stem from various kinds of pressure, including study pressure and interpersonal pressure. When encountering psychological problems, most students choose to digest them on their own, and fewer choose to seek help from friends and relatives. In handling psychological problems, most students relieve psychological pressure by eating, sleeping, or crying. Summarizing college students’ perception of and response to psychological crises, it can be judged that in the long run both their psychological condition and their physical health are likely to worsen, which can easily lead to stomach problems, weight gain, and other negative consequences.

3.3

Comparison of model accuracy and analysis of results

D-S evidence theory is invoked to fuse the multi-source features, and the classification performance of the combined keyword model is compared with that of MFN and GMFN to validate its anti-interference ability and accuracy advantage in sentiment classification. The constructed emotion lexicon data are split into training data and test data, which are used to train and test the combined keyword model, the MFN multimodal temporal model, and the GMFN multimodal temporal model. Acc2 (binary accuracy) and Acc7 (seven-class accuracy) are selected to evaluate performance across the models.
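The fusion step can be illustrated with Dempster's rule of combination, the core operation of D-S evidence theory. The sketch below is a minimal example under stated assumptions: the two evidence sources, their mass values, and the names `dempster_combine`, `m_keywords`, and `m_lexicon` are hypothetical and do not reproduce the feature set or the numbers of this paper.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses to a mass,
    and the masses of each function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Mass assigned to contradictory hypotheses is treated as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("Sources are in total conflict; the rule is undefined.")
    # Renormalize by the non-conflicting mass.
    return {h: m / (1.0 - conflict) for h, m in combined.items()}


POS = frozenset({"positive"})
NEG = frozenset({"negative"})
EITHER = POS | NEG  # mass left on the whole frame expresses uncertainty

# Hypothetical evidence from two feature sources for one text.
m_keywords = {POS: 0.6, NEG: 0.1, EITHER: 0.3}
m_lexicon = {POS: 0.5, NEG: 0.2, EITHER: 0.3}

fused = dempster_combine(m_keywords, m_lexicon)
for hypothesis, mass in fused.items():
    print(sorted(hypothesis), round(mass, 4))
```

When both sources lean toward the same hypothesis, the fused mass on that hypothesis is larger than either source assigns alone, which is why agreement between sources sharpens the classification.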

Figure 7 shows the differential correlation matrix of the combined keyword model. According to this matrix, in binary classification the combined keyword model reaches an accuracy of 0.85 on the positive class and 0.81 on the negative class. In seven-class classification, recognition accuracy is highest for “happiness” at 0.86, followed by “fear” at 0.62; accuracy for the remaining categories is moderate, below 0.59. As for the main confusions, “surprise” is identified as “fear” with probability 0.16 and as “neutral” with probability 0.10; “fear” is recognized as “surprise” with probability 0.15; “disgust” is recognized as “anger” with probability 0.15; and “sadness” is recognized as “neutral” with probability 0.17.
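Per-class accuracies and misrecognition probabilities of this kind can be obtained by normalizing each row of a matrix of true versus predicted labels. The following sketch assumes that the differential correlation matrix is read like a row-normalized confusion matrix, which is our interpretation rather than something the figures confirm. The labels and predictions are toy values, and the function names are hypothetical.

```python
from collections import Counter


def row_normalized_matrix(y_true, y_pred, labels):
    """Return matrix[t][p]: fraction of samples with true label t predicted as p."""
    counts = Counter(zip(y_true, y_pred))
    totals = Counter(y_true)
    return {
        t: {p: (counts[(t, p)] / totals[t] if totals[t] else 0.0) for p in labels}
        for t in labels
    }


def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy example with three of the seven emotion categories.
labels = ["happy", "surprise", "fear"]
y_true = ["happy"] * 4 + ["surprise"] * 4 + ["fear"] * 2
y_pred = [
    "happy", "happy", "happy", "surprise",   # true: happy
    "surprise", "surprise", "fear", "happy",  # true: surprise
    "fear", "surprise",                       # true: fear
]

matrix = row_normalized_matrix(y_true, y_pred, labels)
accuracy = overall_accuracy(y_true, y_pred)
print(matrix["happy"]["happy"])    # per-class accuracy for "happy"
print(matrix["surprise"]["fear"])  # chance that "surprise" is taken for "fear"
print(accuracy)
```

The diagonal entries correspond to the per-class recognition accuracies, and the off-diagonal entries correspond to the probabilities that one category is recognized as another.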

Figure 9 shows the differential correlation matrix for the GMFN model. According to this matrix, in binary classification the GMFN model reaches an accuracy of 0.76 on the positive class and 0.78 on the negative class. In seven-class classification, recognition accuracy is highest for “happiness” at 0.79, followed by “disgust” at 0.55; accuracy for the remaining categories is moderate, all below 0.50. As for the main confusions, “surprise” is identified as “fear” with probability 0.23 and as “neutral” and “happiness” with probability 0.08; “fear” is recognized as “surprise” with probability 0.22; “disgust” is recognized as “anger” with probability 0.22; and “sadness” is recognized as “neutral” with probability 0.19.

The differential correlation matrices of the three models show that, in binary classification, the combined keyword model predicts class labels more accurately than both GMFN and MFN, and GMFN is less accurate than MFN. In seven-class classification, the combined keyword model likewise outperforms GMFN and MFN. The combined keyword model is therefore superior to the GMFN and MFN models in sentiment classification: by mining the association between high-frequency words in mental health texts and students’ psychological problems, it classifies sentiment more reliably.

3.4

Loss value simulation test and result analysis

Using the emotion dictionary training data together with the D-S feature fusion method, we compare the feature extraction loss of three methods for college students’ mental health: a method based on a Long Short-Term Memory (LSTM) network, a method based on an RBF neural network, and the model in this paper. The comparison is used to validate the prediction accuracy of the model. During training, the loss value indicates the prediction accuracy of each method and is defined as the mean square deviation between the extracted value and the true value. In the test, the training dataset is fed into each model for 4000 training iterations, and the loss is analyzed as the number of iterations increases.
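The loss defined above, and the way it falls and then levels off over training iterations, can be sketched in a few lines of Python. The example below is a toy under stated assumptions: it fits a one-parameter linear model by gradient descent and is not the LSTM, RBF, or combined keyword model. The data, the learning rate, and the function names are hypothetical.

```python
def mse_loss(predicted, target):
    """Mean square deviation between predicted (extracted) and true values."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)


def train(xs, ys, learning_rate=0.05, iterations=30):
    """Fit y = w * x by gradient descent and record the loss at each iteration."""
    w = 0.0
    losses = []
    n = len(xs)
    for _ in range(iterations):
        predictions = [w * x for x in xs]
        losses.append(mse_loss(predictions, ys))
        # Gradient of the mean squared error with respect to w.
        gradient = (2.0 / n) * sum((p - y) * x for p, y, x in zip(predictions, ys, xs))
        w -= learning_rate * gradient
    return w, losses


# Toy data generated by y = 2x.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w, losses = train(xs, ys)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.2e}, w: {w:.4f}")
```

Plotting `losses` against the iteration index gives the kind of curve used to compare methods: a steep initial decline followed by a plateau, where a lower plateau indicates a closer fit to the training data.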

Figure 10 shows the prediction results of the iterative training, and Figure 11 shows the statistics of the training loss values. Figures 10 and 11 show that the loss values of all three methods decline rapidly during the first 700 training iterations and tend to stabilize by about 1100 iterations. The loss values of the two comparison methods stabilize at 0.37 and 0.32, respectively, whereas the loss value of the method proposed in this paper stabilizes at 0.16. Moreover, during feature extraction, the feature fusion result based on D-S evidence theory agrees more closely with the training data, from which it can be concluded that the model in this paper has higher prediction accuracy.

3.5

Implementation of the intervention response

After the model in this paper has been used to extract the high-frequency words of college students’ mental health problems and to analyze their emotional polarity, intervention strategies are designed and implemented in a targeted manner. 1) Comprehensive collection and analysis of multidimensional data: in addition to students’ discussion texts in mental health courses, school administrators proactively obtain college students’ behavioral, emotional, and social data from a variety of sources to help assess students’ mental health levels. Targeted mental health lectures are held to help students correctly perceive their current mental health status. 2) Targeted intervention for stressors: in response to the high-frequency words “learning pressure” and “interpersonal relationship”, school administrators and mental health experts set up stress management workshops and communication skills training and, given the strong “stress-eating” correlation in the differential correlation matrix, push dietary health guidance resources. 3) Negative emotion counseling: based on negative words such as “crying” and “venting” marked in the emotional dictionary, cognitive behavioral therapy (CBT) group counseling is carried out, and the students’ psychological characteristics data fused by D-S evidence theory are used to match personalized counseling programs.

3.6

Analysis of intervention results

Table 2 shows the SCL-90 scores of college students before and after the psychological interventions. The sample consists of 400 college students who showed some degree of negative mental health symptoms in the mental health course. After targeted psychological intervention, their SCL-90 scores on the somatization, obsessive-compulsive, interpersonal sensitivity, depression, anxiety, hostility, phobic anxiety, paranoid ideation, and psychoticism factors were significantly lower than before the intervention (P < 0.01). For example, the “anxiety” factor score decreased from 1.94±0.61 to 1.34±0.38, a decrease of about 31%, and the “depression” factor score decreased from 1.82±0.56 to 1.36±0.41, a decrease of 25.3%. This indicates that the intervention effectively cut off the vicious circle between students’ mental health symptoms and the eventual formation of psychological problems and provided a buffer for students with adverse psychological symptoms. The reason is that the intervention strategy uses text analysis to locate the root causes of college students’ psychological problems, such as multiple stressors and improper coping methods, and combines this with the data-driven advantages of the combined keyword model to provide targeted interventions for students’ mental health symptoms, which significantly improves their mental health status.

Table 2.

Scores of college students before and after psychological intervention

Factor	Pre-intervention	Post-intervention	T-value	P
Number	400	400	-	-
Somatization	1.61±0.51	1.24±0.36	24.809	<0.01
Obsessive-compulsive	1.93±0.62	1.70±0.49	12.027	<0.01
Interpersonal sensitivity	1.98±0.64	1.58±0.45	21.306	<0.01
Depression	1.82±0.56	1.36±0.41	27.595	<0.01
Anxiety	1.94±0.61	1.34±0.38	35.253	<0.01
Hostility	1.87±0.46	1.49±0.44	24.611	<0.01
Phobic anxiety	1.64±0.39	1.19±0.36	35.133	<0.01
Paranoid ideation	1.73±0.40	1.40±0.42	23.510	<0.01
Psychoticism	1.62±0.45	1.29±0.33	24.747	<0.01
Other	1.59±0.47	1.19±0.31	29.996	<0.01

Note: SCL-90, Symptom Checklist-90 (a symptom self-rating scale).
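The percentage decreases quoted for the anxiety and depression factors follow directly from the pre-intervention and post-intervention means in Table 2. The short Python check below reproduces that arithmetic; the function name `percent_decrease` is ours, and only the two factors discussed in the text are included.

```python
def percent_decrease(pre, post):
    """Relative decrease from pre to post, as a percentage of the pre value."""
    return (pre - post) / pre * 100.0


# Mean factor scores (pre-intervention, post-intervention) taken from Table 2.
table2_means = {
    "Anxiety": (1.94, 1.34),
    "Depression": (1.82, 1.36),
}

decreases = {
    factor: round(percent_decrease(pre, post), 1)
    for factor, (pre, post) in table2_means.items()
}
for factor, value in decreases.items():
    print(f"{factor}: {value}% decrease")
```

The anxiety factor falls by 30.9%, which rounds to the 31% given in the text, and the depression factor falls by 25.3%.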

4

Conclusion

Traditional monitoring means such as questionnaires are inefficient and narrow in coverage, which makes rapid mental health monitoring and intervention difficult. This paper constructs a combined keyword extraction model for Chinese psychological text and combines it with a dynamic sentiment dictionary extension method to improve the accuracy and timeliness of mental health monitoring and intervention and to provide data-driven optimization support for psychological crisis early warning and intervention.

Based on the AMI algorithm and the differential correlation matrix, the high-frequency word list of college students’ mental health problems was obtained through word segmentation and word frequency collation. The word “emotion” appeared more than 80 times, the highest frequency, indicating that college students’ mental health problems are concentrated mainly in difficulty channeling emotions. Such problems arise because college students are subject to multiple pressures and find it difficult to obtain effective support beyond self-digestion. In the comparison experiment, the model’s binary accuracy is 0.85 for the positive class and 0.81 for the negative class, and in the seven-class setting the accuracy for every emotion category is higher than 0.5. Its loss value, about 0.16, is the smallest among the compared methods. The model thus achieves higher accuracy and a smaller loss value in the emotional classification of college students’ mental health problems. Based on the analysis results, personalized interventions such as stressor-targeted intervention and negative emotion channeling were applied to college students with poor mental health symptoms. After the intervention, the SCL-90 scores for every mental health symptom decreased significantly (P<0.01), students’ mental health levels improved, and the interventions based on the model’s monitoring results proved effective.

This study provides a reusable technical model for mental health monitoring and intervention that supports the design of personalized intervention programs by accurately identifying the characteristics of psychological problems. Future work can integrate additional types of mental health data, such as voice, to build a dynamic monitoring system and improve the response speed of mental health intervention.

Language: English

Publication frequency: 1 issue per year
Journal subjects: Biology, Biology (other), Mathematics, Applied Mathematics, General Mathematics, Physics, Physics (other)

Exploring the use of big data technology in mental health monitoring and intervention

Linghua Jin

Yuxuan Hou

Published online: 29 Sept. 2025

Submitted: 04 Jan. 2025

Accepted: 26 Apr. 2025

DOI: https://doi.org/10.2478/amns-2025-1125

Keywords: Information entropy, D-S evidence theory, TF-IDF, Affective lexicon, Mental health intervention

© 2025 Linghua Jin and Yuxuan Hou, published by Sciendo.

This work is licensed under the Creative Commons Attribution 4.0 International License.
