
Sentiment Analysis of Korean Modern Novel Texts Applying Deep Learning Models

  

Introduction

Sentiment analysis, an important discipline of natural language processing, plays an increasingly important role in the Web 2.0 era and has attracted wide attention [1-2]. Sentiment analysis, also known as opinion mining, refers to the process of analyzing, processing, summarizing, and reasoning about subjective texts with emotional overtones [3]. In the era of big data, the public has ever more channels for voicing opinions on products, services, and so on, such as apps and social media, on which reviews of a praising or disparaging nature are left [4-5]. In the field of literature, some scholars have already begun to use sentiment analysis techniques to study the evaluation of books, and then extract and summarize information of practical significance, such as book popularity and predictions of future development [6-8]. Sentiment analysis techniques are applied to large numbers of book reviews to quantify readers’ emotional attitudes and to analyze popular reviews and the characteristics of novels that readers care about most [9-11].

In today’s era, cultural influence has become an important indicator of a country’s comprehensive national power. Great powers whose economic and political strength has developed to a certain level inevitably seek worldwide appeal, and the attraction of a country’s culture to the people of other countries is an extremely important part of this [12-15]. The Korean modern novel opens a discussion on the position of the individual in a rapidly changing society and on the conflict between tradition and modernity. Modern Korean novels arose against many backdrops that are closely related to the historical changes in Korean society. With the rise of the democratization movement in Korea, deeper critical thinking on political and social issues emerged in literary creation [16]. Since the turn of the century, modern Korean novels have continued to develop, not only with increasingly diversified content but also with innovative forms, including the emergence of the web novel as a new literary form, reflecting the complexity and diversity of contemporary Korean society [17-18]. Readers’ acceptance of a work is the key to realizing the intended meaning of a literary work, and a lack of research on readers may leave the interpretation of a work incomplete [19-20]. Therefore, using sentiment analysis techniques to collect and analyze readers’ positive or negative evaluations of Korean novels provides a reference for improving the cultural interpretation and dissemination of modern Korean novels.

Traditional sentiment analysis pipelines that generate text word vectors with models such as Word2Vec cannot effectively represent polysemous words, and classical models such as the RNN cannot adequately extract semantic features. To address these problems, a BERT-BGRU text sentiment analysis model for modern Korean novels combined with an attention mechanism is proposed. After the text of modern Korean novels is preprocessed, it is vectorized using the BERT pre-trained language model; the BGRU network then extracts textual features in both the forward and backward directions so that the features integrate contextual information; the attention mechanism then weights the extracted feature information so that the model pays more attention to key information; and the final output is the sentiment classification result. After the performance of the model is verified on the relevant datasets, Park Wan-su’s novels are used as an example to trace the emotional tone of her novel texts and perform sentiment analysis.

A Model for Sentiment Analysis of Modern Korean Fiction Texts

The basic process of text sentiment analysis of modern Korean novels using deep learning methods is shown in Fig. 1: the text data is preprocessed, the organized text is converted into word vectors, the constructed neural network model extracts the text features, and the final output is the sentiment classification result. The pipeline covers natural language processing, text vectorization, and related steps.

Figure 1.

Basic flow chart of emotion analysis

Natural Language Processing
Text pre-processing

In a long novel text, words are the basic units of the text. When preprocessing a long text, it must be encoded and converted, tokenized, filtered of stop words, part-of-speech tagged, lemmatized, and subjected to named entity recognition, word sense disambiguation, and other text processing.

Text Segmentation

The first preprocessing step in text processing is tokenization. Because languages differ in composition, each word in an English sentence is separated by a space, so English text can be split into words on spaces. English tokenization is therefore relatively simple; modern tokenizers are statistically based, and most of their statistical samples come from standard corpora. The main goals of tokenization are to separate words from punctuation, to split words that end in a comma or an apostrophe, to split words delimited by quotation marks, to treat most punctuation marks as separate tokens, and so on.
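As a concrete illustration, the following is a minimal tokenization sketch in Python using NLTK’s word_tokenize; the sample sentence is made up for illustration only.

```python
# Minimal tokenization sketch using NLTK (illustrative; sample text is made up).
import nltk
from nltk.tokenize import word_tokenize

nltk.download("punkt", quiet=True)  # tokenizer models, downloaded once

text = "The novel's ending moved me deeply, and I couldn't put it down."
tokens = word_tokenize(text)
print(tokens)
# Punctuation and clitics such as "'s" and "n't" become separate tokens.
```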

Stop-word filtering

After tokenization, the long text has been transformed into separate words for subsequent processing. In text analysis it can be observed that many meaningless function words, or words that do not help subsequent analysis and reduce processing efficiency, appear in the text, so stop-word filtering is needed. Stop words are words that are filtered out when processing natural language data in order to reduce their impact on the results and on errors while improving retrieval efficiency. When filtering, the words that appear in a stop-word list, built from lists for the relevant domain and from manual input, are removed; this greatly improves the accuracy and efficiency of later processing and purifies the analysis results.
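A minimal sketch of stop-word filtering with NLTK’s English stop-word list follows; the extra domain-specific stop words are hypothetical.

```python
# Stop-word filtering sketch using NLTK's English stop-word list (illustrative).
import nltk
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize

nltk.download("stopwords", quiet=True)
nltk.download("punkt", quiet=True)

stop_words = set(stopwords.words("english"))
stop_words.update({"chapter", "page"})  # hypothetical domain-specific additions

tokens = word_tokenize("The mother of the family was the strongest person in the story.")
filtered = [t for t in tokens if t.lower() not in stop_words and t.isalpha()]
print(filtered)  # ['mother', 'family', 'strongest', 'person', 'story']
```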

Lexical annotation

Lexical annotation, also known as part-of-speech tagging, refers to assigning the correct part of speech to each word produced by tokenization. It is the main task of lexical analysis. The part of speech is the basic grammatical attribute of a word; labeling it for each word provides support for further analysis.
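A short part-of-speech tagging sketch with NLTK’s pos_tag is shown below; the sentence and the tagger resource used are illustrative.

```python
# Part-of-speech tagging sketch using NLTK (illustrative).
import nltk
from nltk import pos_tag, word_tokenize

nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

tokens = word_tokenize("She quietly left the quiet village at dawn.")
print(pos_tag(tokens))
# Each token is paired with a Penn Treebank tag, e.g. ('quietly', 'RB'), ('quiet', 'JJ').
```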

Word Form Reduction

In English text, because of inflection, one root may correspond to several different surface forms, which need to be handled appropriately in lexical analysis, for example prefixes, suffixes, and endings. Lemmatization is the reduction of a word in any form to its base form. It is primarily used in text mining and natural language processing to provide more precise and finer-grained text analysis and representation.
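A minimal lemmatization sketch with NLTK’s WordNetLemmatizer follows; the part-of-speech hints are supplied manually here for illustration.

```python
# Lemmatization sketch using NLTK's WordNet lemmatizer (illustrative).
import nltk
from nltk.stem import WordNetLemmatizer

nltk.download("wordnet", quiet=True)
nltk.download("omw-1.4", quiet=True)

lemmatizer = WordNetLemmatizer()
print(lemmatizer.lemmatize("stakes"))           # -> 'stake'  (noun by default)
print(lemmatizer.lemmatize("better", pos="a"))  # -> 'good'   (adjective hint)
print(lemmatizer.lemmatize("running", pos="v")) # -> 'run'    (verb hint)
```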

Named entity identification

Named entity recognition is a common word-level task in natural language processing and has a wide range of applications in many text analysis tasks. Named entities usually refer to entities that have special significance or strong referential value in the text, typically including names of people, places, times, proper nouns, and so on. A named entity recognition system extracts the above entities from unstructured text and can also recognize further categories of entities [21].

In a novel text, the named entities it contains and the relationships between them are of great value for analyzing the text; this information can be used to present relevant details about a specific character, which is very useful for text analysis and text visualization. The information obtained through entity extraction expresses the characters and other identifiers in the novel most intuitively, has great practical value, and can also be used in the analysis of key characters, character relationships, and character emotions. Since most novel texts are unstructured, the number of named entities is huge and there are no fixed rules to follow. The commonly used methods for this task are rule-based and model-based.

Rule-based entity extraction is the easiest to implement. For entities with distinctive contexts, or texts where the entities themselves have many characteristic attributes, rule-based extraction of named entities is simple and effective. With a small corpus, the rules can be expressed as regular expressions, but if the entities to be extracted are complex, the rules must be revised to cover all possible cases. As the corpus grows, the situations to be handled become more complex and the rules may conflict with each other. The rule-based approach is therefore better suited to extraction from semi-structured or standardized text.

In the model-based approach, named entity recognition is treated as a sequence labeling problem: the input of the model is a sequence, and a specific label is output for each unit of the input sequence. Common sequence labeling models include the HMM, CRF, and RNN.

A CRF (conditional random field) is a discriminative model that satisfies the pairwise, local, and global Markov properties. For sequence labeling problems, linear-chain conditional random fields are generally used [22].

A conditional random field is an undirected graphical model that predicts the probability of an output sequence given an input sequence. Assume that $X=\{x_1,x_2,x_3,\ldots,x_T\}$ denotes the observed sequence and $Y=\{y_1,y_2,y_3,\ldots,y_T\}$ denotes the sequence of label states to be predicted.

Let an undirected graph $G=(V,E)$ be given, where $V$ denotes the set of vertices and $E$ the set of edges. The label sequence is indexed by the vertices, i.e., $Y=\{Y_v \mid v \in V\}$, so that each node in $V$ corresponds to one variable of the label sequence, denoted by the random variable $Y_v$. When each random variable $Y_v$ satisfies the Markov property, $(X,Y)$ is called a conditional random field, with the probability of the random variable given by $p(Y_v \mid X, Y_u, u \neq v, \{u,v\} \subset V)$.

According to graph theory, the structure of the graph $G$ can be arbitrary, as long as it expresses the conditional independence assumptions in the label sequence. Here a simple first-order chain structure is used as an example, in which each node corresponds to one element of the label sequence, as shown in Figure 2.

Figure 2.

The graphic structure of the chain CRF

The main steps of the statistical approach are: building the statistical model, extracting features, annotating the data, and training the model. Machine learning methods are used to obtain the corresponding statistical model, the test corpus is labeled in a uniform way, and the corresponding CRF model is obtained. The labeled test sequences are then post-processed to obtain the concrete named entity recognition results.
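The following is a minimal sketch of a linear-chain CRF sequence labeler, assuming the third-party sklearn-crfsuite package; the toy training sentence, features, and BIO labels are made up for illustration and are not the paper’s actual setup.

```python
# Linear-chain CRF sequence-labeling sketch (assumes the sklearn-crfsuite package).
import sklearn_crfsuite

def word_features(sent, i):
    """Hand-crafted features for the i-th token of a tokenized sentence."""
    word = sent[i]
    return {
        "word.lower": word.lower(),
        "word.istitle": word.istitle(),
        "prev.lower": sent[i - 1].lower() if i > 0 else "<BOS>",
        "next.lower": sent[i + 1].lower() if i < len(sent) - 1 else "<EOS>",
    }

# Toy training data: one tokenized sentence with BIO entity labels (made up).
sentence = ["The", "mother", "moved", "to", "Seoul", "with", "her", "children"]
labels   = ["O",   "O",      "O",     "O",  "B-LOC", "O",    "O",   "O"]

X_train = [[word_features(sentence, i) for i in range(len(sentence))]]
y_train = [labels]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=100)
crf.fit(X_train, y_train)

print(crf.predict(X_train)[0])  # predicted BIO tags for the training sentence
```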

Vector representation of text
Discrete representation based on TF-IDF

One-hot encoding has several shortcomings; one optimization is to give each word in the text a weight. It is natural to use the frequency of a word’s occurrence as its weight, since this is the simplest approach, but a higher frequency does not mean that the word contributes more to the meaning of the sentence. For example, “the” in English, or “ah” and “of” in Chinese, appear very frequently in daily use yet have very little influence on the meaning of a sentence.

Term frequency–inverse document frequency (TF-IDF) is a more effective way of calculating frequency weights. TF (term frequency) represents how often a word is used in a document. IDF (inverse document frequency) reflects how widespread the word is: the fewer documents contain the word, the larger the IDF, meaning the word is good at distinguishing between documents. The overall TF-IDF formula is: $$\mathrm{TF\text{-}IDF}(w) = \frac{\text{number of occurrences of word } w \text{ in the document}}{\text{total number of words in the document}} \times \log\frac{\text{total number of documents in the corpus}}{\text{number of documents containing word } w + 1}$$

This method can effectively reduce the weight of irrelevant and interfering words while retaining important information. TF-IDF solves the weight allocation problem, but another important issue of vector-space representation remains: because a matrix is equivalent to itself under row permutation, the representation does not take into account that different word orders in the text may carry different emotional semantics.
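A minimal TF-IDF sketch using scikit-learn’s TfidfVectorizer is shown below; the three short review snippets are made up, and scikit-learn’s IDF smoothing differs slightly from the formula above.

```python
# TF-IDF sketch using scikit-learn (illustrative; corpus is made up,
# and scikit-learn's IDF smoothing differs slightly from the formula in the text).
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "a moving story about a mother and her children",
    "a dull story with flat characters",
    "the mother is the most moving character in the novel",
]

vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(corpus)      # sparse (3 x vocab) matrix

print(vectorizer.get_feature_names_out())     # learned vocabulary
print(tfidf.toarray().round(3))               # TF-IDF weights per document
```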

Distributed representation based on the Word2Vec model

The Word2Vec model was proposed by Google in 2013; its basic algorithmic idea follows the word vector training model proposed by Mikolov et al. Its algorithmic complexity is lower than that of traditional models, and it can learn word vectors efficiently. The Word2Vec model has two basic network structures: the continuous bag-of-words (CBOW) model and the Skip-gram model [23].

The structure of the CBOW model is shown in Fig. 3; the basic idea is to predict the probability of occurrence of a target word $w_t$ given its context as input, as shown in Eq. (3), where $S_j$ is the $j$-th sentence in text $T$, $w_{ij}$ denotes the $i$-th word in that sentence, $C_{ij}$ denotes its context, and $\theta$ denotes the model parameters. The weight matrix obtained at the end of training is the word vector matrix of the text. $$\underset{\theta}{\arg\max} \prod_{S_j \in T}\left[\prod_{i=1}^{|S_j|} p\left(w_{ij} \mid C_{ij};\theta\right)\right] \tag{3}$$

Figure 3.

CBOW model structure

The Skip-gram model can be viewed as the inverse structure of the CBOW model; its optimization objective is to maximize the posterior probability of the text: $$\underset{\theta}{\arg\max} \prod_{w_{ij} \in T}\left[\prod_{c \in C_{ij}} p\left(c \mid w_{ij};\theta\right)\right]$$
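As a concrete illustration of training such embeddings, here is a minimal sketch with the gensim library; the toy corpus and hyperparameter values are illustrative only.

```python
# Word2Vec training sketch using gensim (illustrative corpus and hyperparameters).
from gensim.models import Word2Vec

# Toy corpus: each sentence is a list of preprocessed tokens.
sentences = [
    ["mother", "moved", "to", "seoul", "with", "her", "children"],
    ["the", "daughter", "resented", "her", "mother"],
    ["war", "separated", "the", "mother", "and", "her", "son"],
]

# sg=0 selects CBOW, sg=1 selects Skip-gram.
model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, sg=1, epochs=50)

print(model.wv["mother"].shape)                 # (100,) embedding for one word
print(model.wv.most_similar("mother", topn=3))  # nearest neighbours in the toy space
```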

BERT-BGRU Sentiment Analysis Incorporating Attention Mechanisms
BERT pre-training models

The BERT model adopts the encoder part of the Transformer. Unlike earlier LSTM encoders that read from left to right or right to left, it uses a bidirectional encoding representation and is pre-trained on a large English corpus so that it fully learns the relevant semantic knowledge. The BERT model solves the problem of polysemy, which traditional word embedding models such as Word2Vec and GloVe cannot handle, so that the input sentence obtains a better word vector representation and semantic representation.

BERT uses the following two approaches for pre-training:

Masked language model (MLM)

As in a cloze task, the MLM randomly masks a certain percentage of the words and replaces them with the [MASK] token, forcing the model to learn which word should fill the [MASK] position from the global context, thereby realizing bidirectional encoding.

Next Sentence Prediction (NSP)

NSP can be viewed as a binary classification problem for sentences, where two sentences in a text are given, and the logical relationship between the sentences is mined by determining whether the latter sentence comes after the previous one.

Through the joint learning of these two pre-training tasks, the BERT pre-training model is able to capture the semantic information contained in the sentences more accurately and comprehensively, and also provide appropriate initial values of model training parameters for the subsequent model fine-tuning task.
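A minimal sketch of obtaining BERT token vectors with the Hugging Face transformers library follows; the checkpoint name bert-base-uncased and the sample sentence are illustrative assumptions, not necessarily the configuration used in this paper.

```python
# BERT text-vectorization sketch using Hugging Face transformers
# (the checkpoint and sample sentence are illustrative).
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("The mother cut her daughter's hair without asking.",
                   return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    outputs = model(**inputs)

token_vectors = outputs.last_hidden_state   # (1, seq_len, 768) contextual vectors
print(token_vectors.shape)
```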

BGRU layer

Unidirectional GRUs can only capture semantic information in one direction, whereas in text sentiment categorization, the output of the current moment is not only related to the previous information, but also to the information that comes after it. Therefore, this chapter uses a bidirectional gated recurrent neural network (BGRU) to capture semantic information from both directions.

BGRU solves the problem that unidirectional GRU networks cannot encode from backward to forward. BGRU combines forward and backward GRUs, as shown in Fig. 4, which can capture both forward and backward semantic information, and combine with the context to deeply extract text sentiment features.

Figure 4.

BGRU model

In the BGRU model, two GRU units operating in opposite directions coexist at each moment, where $\overrightarrow{h_t}$ represents the forward GRU state at moment $t$, $\overleftarrow{h_t}$ represents the backward GRU state at moment $t$, $h_t$ represents the output of the BGRU at moment $t$, and $x_t$ is the input at moment $t$. The computation of the state at each moment in the BGRU model is shown in Eqs. (5) and (6), and the final output is jointly determined by the GRU units of the two directions, as shown in Eq. (7). $$\overrightarrow{h_t} = \overrightarrow{\mathrm{GRU}}\left(x_t, \overrightarrow{h_{t-1}}\right) \tag{5}$$ $$\overleftarrow{h_t} = \overleftarrow{\mathrm{GRU}}\left(x_t, \overleftarrow{h_{t-1}}\right) \tag{6}$$ $$h_t = w_t \overrightarrow{h_t} + v_t \overleftarrow{h_t} + b_t \tag{7}$$

where $\mathrm{GRU}(\cdot)$ denotes the nonlinear transformation of the input that encodes the word vector into the corresponding hidden state; $w_t$ is the weight coefficient matrix of the forward direction; $v_t$ is the weight coefficient matrix of the backward direction; and $b_t$ is the bias at moment $t$.
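A minimal PyTorch sketch of such a bidirectional GRU layer is given below; the dimensions are illustrative.

```python
# Bidirectional GRU (BGRU) layer sketch in PyTorch (dimensions are illustrative).
import torch
import torch.nn as nn

batch_size, seq_len, embed_dim, hidden_dim = 2, 16, 768, 128

bgru = nn.GRU(input_size=embed_dim, hidden_size=hidden_dim,
              batch_first=True, bidirectional=True)

x = torch.randn(batch_size, seq_len, embed_dim)   # e.g. BERT token vectors
outputs, _ = bgru(x)

# Forward and backward hidden states are concatenated along the last dimension.
print(outputs.shape)   # torch.Size([2, 16, 256])
```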

Attention layer

To address the problem that the BGRU layer does not consider the sentiment weight of English words, this chapter adds an attention layer after the BGRU layer. The attention mechanism can mark and identify the contribution of different words to the sentiment polarity of the text, allowing the model to ignore unimportant information and focus on the key information, which improves the accuracy of classification. The attention mechanism uses weight coefficients to indicate the importance of the information, and larger weight coefficients indicate more important information.

The calculation process of the attention layer is as follows:

The feature representation $h_t$ obtained from BGRU learning is used as the input to the attention layer, $w$ is the weight matrix, and $b$ is the bias vector used to compute the target attention weight $u_t$, as shown in Eq. (8): $$u_t = \tanh(w h_t + b) \tag{8}$$

The target weights are normalized by the softmax function to obtain the weight coefficients $a_t$: $$a_t = \mathrm{softmax}(u_t) = \frac{\exp(u_t)}{\sum_{i=1}^{n}\exp(u_i)} \tag{9}$$

To carry out the weight assignment, the feature representations are combined in a weighted sum with the weight coefficients to obtain the feature representation $s_t$ that highlights the emotionally important information, as shown in Eq. (10): $$s_t = \sum_{i=1}^{n} a_i h_i \tag{10}$$
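A short PyTorch sketch of this attention layer, corresponding to Eqs. (8)-(10), follows; the layer sizes are illustrative.

```python
# Additive attention layer sketch corresponding to Eqs. (8)-(10) (illustrative sizes).
import torch
import torch.nn as nn

class Attention(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)     # w, b of Eq. (8)
        self.score = nn.Linear(hidden_dim, 1, bias=False)  # scalar score per step

    def forward(self, h):                 # h: (batch, seq_len, hidden_dim)
        u = torch.tanh(self.proj(h))                 # Eq. (8)
        a = torch.softmax(self.score(u), dim=1)      # Eq. (9), weights over time steps
        s = (a * h).sum(dim=1)                       # Eq. (10), weighted feature sum
        return s, a

attn = Attention(hidden_dim=256)
h = torch.randn(2, 16, 256)               # e.g. BGRU outputs
s, a = attn(h)
print(s.shape, a.shape)                    # torch.Size([2, 256]) torch.Size([2, 16, 1])
```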

Model structure

The structure of the BERT-BGRU model incorporating the attention mechanism proposed in this chapter is shown in Figure 5.

Figure 5.

The BERT-BGRU-ATT model structure

It is mainly divided into the following parts:

Input layer

Input preprocessed English text data; the text is denoted as $\{w_1, w_2, w_3, \ldots, w_n\}$, where $w_i$ denotes a word in the sentence.

BERT Layer

Through the training of the BERT model to obtain the vector representation of the text, i.e., the text is processed into a form that can be received by the BGRU.

BGRU Layer

It includes a forward GRU part and a backward GRU part. This layer traverses the obtained word vectors in both the forward and backward directions to extract feature information. The output is $h_t = [\overrightarrow{h_t}, \overleftarrow{h_t}]$, where $\overrightarrow{h_t}$ denotes the forward GRU state at moment $t$, $\overleftarrow{h_t}$ denotes the backward GRU state at moment $t$, and $h_t$ denotes the output of the BGRU at moment $t$.

Attention layer

The attention layer inputs the extracted feature information and performs weighting calculation to strengthen the feature extraction of important information.

Output Layer

The processed feature vector is input to the fully connected layer and sentiment classification is performed with a softmax classifier, as shown in Eq. (11): $$y = \mathrm{softmax}(w s_t + b) \tag{11}$$

where st denotes the upper output feature vector, w denotes the weight matrix, and b denotes the bias.

The output vector $y$ is a multidimensional vector whose dimension equals the number of classification categories; the value of each dimension corresponds to the probability of one sentiment class. The softmax function maps each value in $y$ into the range 0 to 1, and the category with the largest probability is selected as the prediction result. The softmax formula is shown in Eq. (12): $$f_i = \frac{\exp(y_i)}{\sum_{j=1}^{n}\exp(y_j)} \tag{12}$$
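Putting the layers together, the following is a compact PyTorch sketch of a BERT-BGRU model with attention in the spirit of Figure 5; the checkpoint name, layer sizes, and two-class output are illustrative assumptions, not the exact configuration used in this paper.

```python
# BERT-BGRU-attention sentiment classifier sketch in PyTorch
# (checkpoint, sizes, and 2-class output are illustrative assumptions).
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertBGRUAttention(nn.Module):
    def __init__(self, checkpoint="bert-base-uncased", hidden_dim=128, num_classes=2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(checkpoint)
        self.bgru = nn.GRU(self.bert.config.hidden_size, hidden_dim,
                           batch_first=True, bidirectional=True)
        self.attn_proj = nn.Linear(2 * hidden_dim, 2 * hidden_dim)
        self.attn_score = nn.Linear(2 * hidden_dim, 1, bias=False)
        self.classifier = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, input_ids, attention_mask):
        h_bert = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        h, _ = self.bgru(h_bert)                           # BGRU layer
        u = torch.tanh(self.attn_proj(h))                  # Eq. (8)
        a = torch.softmax(self.attn_score(u), dim=1)       # Eq. (9)
        s = (a * h).sum(dim=1)                             # Eq. (10)
        return torch.softmax(self.classifier(s), dim=-1)   # Eq. (11)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertBGRUAttention()
batch = tokenizer(["a deeply moving novel", "a dull and lifeless story"],
                  padding=True, return_tensors="pt")
probs = model(batch["input_ids"], batch["attention_mask"])
print(probs)   # per-class sentiment probabilities for each sentence
```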

Emotional Analysis of Korean Modern Fiction Texts
Sentiment Analysis Experiments and Analysis
Data sets

The SST-2 dataset and IMDB dataset were selected for the experiment. In the SST-2 dataset, the average length of the text is about 19.29, and the median length is 11. In the IMDB dataset, the average length of the text is about 585.26, and the median length is 159, which is used to check the performance of the model on text data of different lengths.

SST is a sentiment analysis dataset released by Stanford University. Its main content is movie reviews, and the statistics of the sample size of the dataset are shown in Table 1.

Table 1. SST-2 dataset

                  Sentences   Positive   Negative
Training set         6524       3472       3241
Validation set        785        393        382
Test set             1745        886        893

The BERT model reduces the vocabulary size by breaking uncommon words into smaller subword units, such as prefixes and suffixes, during tokenization. The sentence-length statistics reported here are therefore somewhat larger in value than the usual word-count lengths.

For the SST-2 and IMDB datasets, the distribution of sentiment words is also reported in this section. The distributions of sentiment words in the SST-2 and IMDB datasets are shown in Fig. 6 and Fig. 7, where the horizontal axis is the text serial number in the dataset and the vertical axis is the number of sentiment words contained in the corresponding text.

Figure 6.

Distribution of sentiment words in SST-2

Figure 7.

Distribution of sentiment words in IMDB

According to the statistics, the total number of words in the SST-2 dataset is 124,575, of which 11,251 are sentiment words, i.e., 9.03% of the total word count, while the total number of words in the IMDB dataset is 5,511,214, of which 381,251 are sentiment words, i.e., 6.92% of the total word count.

Evaluation indicators

The experiments evaluate model performance using precision (P), recall (R), and the F1 value. Precision is the proportion of samples predicted as positive that are actually positive, recall is the proportion of actual positive samples that are correctly identified, and the F1 value is the harmonic mean of precision and recall, which balances the distortion of precision and recall in extreme cases. The confusion matrix is shown in Table 2.

Table 2. Confusion matrix

                        Actual positive        Actual negative
Predicted positive      True positive (TP)     False positive (FP)
Predicted negative      False negative (FN)    True negative (TN)

Here TP and TN are the cases in which the model's classification matches the actual positive and negative labels, respectively, FP is the case identified as positive by the system but actually labeled as negative, and FN is the case identified as negative by the system but actually labeled as positive.

The calculation of precision, recall, and the F1 value is shown in Eqs. (13), (14), and (15): $$P = \frac{TP}{TP + FP} \tag{13}$$ $$R = \frac{TP}{TP + FN} \tag{14}$$ $$F1 = \frac{2PR}{P + R} \tag{15}$$

In order to reduce bias due to parameter initialization, each model is initialized with a different random seed, five runs are conducted, and the average is taken as the final result, as shown in Eq. (16): $$\mathrm{F1\_score\_mean} = \frac{\sum_{i=1}^{5} \mathrm{F1\_score}(i)}{5} \tag{16}$$

where F1_score_mean is the final average F1 value.
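A minimal sketch of computing these metrics with scikit-learn and averaging over five seeded runs is shown below; the train_and_predict function and its label arrays are hypothetical placeholders, not the experiments reported here.

```python
# Precision/recall/F1 evaluation sketch with scikit-learn, averaged over five seeds
# (train_and_predict and the label data are hypothetical placeholders).
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

def train_and_predict(seed):
    """Placeholder: train the model with this seed and return (y_true, y_pred)."""
    rng = np.random.default_rng(seed)
    y_true = rng.integers(0, 2, size=200)
    y_pred = np.where(rng.random(200) < 0.9, y_true, 1 - y_true)  # ~90% accurate
    return y_true, y_pred

f1_scores = []
for seed in range(5):
    y_true, y_pred = train_and_predict(seed)
    p, r, f1, _ = precision_recall_fscore_support(y_true, y_pred,
                                                  average="binary", pos_label=1)
    f1_scores.append(f1)

print("F1_score_mean =", np.mean(f1_scores))   # Eq. (16)
```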

Experimental results and analysis

The experimental results of the BHW_LEX_GA model proposed in this section are compared with those of existing models. In order to evaluate the overall performance of the models, the experiments use different parameter initialization values, the average of several runs is taken as the final result, and the performance of each model on the three metrics of precision, recall, and F1 value is recorded.

The experiments are described as follows:

Model (1): BERT, using only the BERT model as the word embedding method, using the [CLS] token in the last hidden layer, and directly accessing the linear layer afterward, this method serves as the baseline for comparing the effect of other models.

Model (2): BG, using the BERT model as the word embedding method, without any processing of the hidden layer, taking the last hidden layer results into the downstream neural network. Bidirectional GRU is chosen as the downstream neural network.

Model (3): BL, the word embedding part is the same as model (2), and the downstream neural network part is replaced from bi-directional GRU to bi-directional LSTM.

Model (4): BGA, the embedding layer and downstream neural network are constructed the same as model (2), and the attention mechanism is introduced in the model.

Model (5): BLA, the embedding layer and downstream neural network are constructed the same as model (3), and the attention mechanism is introduced in the model.

Model (6): BH, using the BERT model as the word embedding method, fusing some hidden layers in the model, and the hidden layer weights are all set the same and fixed values. Other parts of the structure are the same as model (4).

Model (7): BHW, using the BERT model as a word embedding method, fuses part of the hidden layers in the model with the same initial values of the hidden layer weights, and the weights participate in the training process as parameters that can be learned by the network. Other parts of the structure are the same as model (4).

Model (8): B_LEX, using the sentiment dictionary to filter the text and incorporate the sentiment word information into the model, the other parts are the same as model (1).

Model (9): BHW_LEX, using the BERT model as a word embedding method, dynamic weighting of the hidden layer of the BERT model, and incorporating emotion word information, other parts are the same as model (6).

Model (10): BHW_LEX_GA, the proposed new model, uses the BERT model as a word embedding method, adds emotion word information with the help of an emotion dictionary, performs weighted fusion on the hidden layer of the BERT model, and also participates in the computation of weights as model parameters. The downstream neural network uses the bidirectional GRU model and introduces the attention mechanism at the end.

The performance of each model on the SST-2 dataset is shown in Table 3, and on the IMDB dataset in Table 4. As the results in Tables 3 and 4 show, the BERT-BGRU model improves the F1 value by 0.849 on the SST-2 dataset compared with the BERT model with only linear layers, and by 0.960 on the IMDB dataset. On SST-2 the BERT-BGRU model achieves the best performance, while on IMDB it obtains the second-best performance, after model (8).

Table 3. Results of the experiment on SST-2

No.    Model        Precision (P)   Recall (R)   F1 value
(1)    BERT            87.289         95.586      91.137
(2)    BG              90.649         93.275      91.914
(3)    BL              87.533         94.926      91.071
(4)    BGA             91.323         92.212      91.748
(5)    BLA             91.533         92.432      91.931
(6)    BH              89.523         93.826      91.595
(7)    BHW             90.595         93.092      91.816
(8)    B_LEX           88.354         94.339      91.237
(9)    BHW_LEX         88.240         95.146      91.557
(10)   BERT-BGRU       89.935         94.082      91.976

Table 4. Results of the experiment on IMDB

No.    Model        Precision (P)   Recall (R)   F1 value
(1)    BERT            86.592         95.182      90.316
(2)    BG              89.986         92.395      91.099
(3)    BL              89.305         93.129      91.128
(4)    BGA             90.120         91.589      90.789
(5)    BLA             88.515         93.422      90.897
(6)    BH              89.249         91.258      89.978
(7)    BHW             92.815         88.985      90.844
(8)    B_LEX           91.276         91.295      91.281
(9)    BHW_LEX         90.166         91.699      90.861
(10)   BERT-BGRU       87.965         94.669      91.276

The BERT-BGRU model has the best combined performance on both the SST-2 and IMDB datasets, obtaining better results and stability compared to the other models.

Analysis of Emotional Channels in Park Wan-su’s Novels
Maternal Colors and Emotional Veins in Park Wan-su’s Novels

Korea is a typical oriental culture country, deeply influenced by Chinese Confucian feudalism, and although the society is highly developed economically, the patriarchal ideology is still very strong, and discrimination against women fills every corner of the society.

Park Wan-su’s debut and best-known novel, “Naked Wood”, takes as its background the Korean War, which profoundly influenced the modern history of Korea, and, through the story of people unable to live normal lives in wartime, describes their desire for love, their extravagant hopes for happiness, and their deep inner loneliness.

Delicate emotion runs throughout Park Wan-su’s fiction. In the novel, the mother feels that her life has lost its meaning because of the death of her son in the war; she is unable to do anything, does not dress up, and has no interest in household chores. As if her life had disappeared along with her son, she lives in her own imagination and fantasies, drained of vitality, like a walking corpse.

In her other masterpiece “Mom’s Stakes”, the image of motherhood is depicted as both paternal and selfless, reflecting the coexistence of fatherhood and motherhood. After the death of her husband, the mother in the novel has to become the head of the family, taking up the heavy responsibility of supporting the family and educating the children. In order to get rid of old habits and traditions, she leaves her hometown and comes to the city with her children, declaring war on the traditional patriarchal system with her own stubbornness and independence, trying to find women’s self-respect and self-reliance.

In previous literature, influenced by the strong patriarchal discourse, it is often the father who imposes his will on his children, but in “Mom’s Stakes”, due to the lack of fatherhood and the pressure of life, the mother plays this role as a matter of course, and the image of a strong mother is thus formed. In the text, the mother cuts off her daughter’s hair without regard to her daughter’s opinion. In addition, the mother decided to move to Seoul without consulting the child. In the daughter’s view, “Mother is a person who doesn’t consult with anyone and is completely arbitrary”.

However, no matter how strong the maternal figure is, the love for her children that she was born with is undeniable and inexhaustible. The novel tells how, because of the war, there is not enough food, so the mother, in order to feed her children, goes to beg for food despite the danger of being caught by the gendarmes and losing her life at any moment. This episode fully demonstrates that the image of motherhood in the text has both the paternal characteristic of imposing one’s own will on the children and the traditional maternal characteristic of a selfless, fearless, loving mother who sacrifices for her children.

In this paper, we will take “Mom’s Stakes” as an example to analyze the emotions in the novel and analyze the emotional curve of the characters’ story.

Generation of the emotional curve of the novel “Mom’s Stakes”

Figures 8 and 9 show the sentiment curves of the novel “Mom’s Stakes”. Figure 8 is a fixed-length sentiment curve generated according to previous work, and Figure 9 is a variable-length sentiment curve generated according to the methodology presented in this paper.

Figure 8.

Fixed-length emotional curve of “Mom’s Stakes”

Figure 9.

Variable-length emotional curve of “Mom’s Stakes”

In general, the locations of the local maxima of the two sentiment curves in this comparison are quite similar. However, the fixed-length curve is clearly smoother. Looking at the specific numbers, the text of “Mom’s Stakes” contains 27,538 words after preprocessing at the curve-plotting stage. Because the earlier curve-generation method fixes its target length, while the window size for sampling the text is fixed, different texts must be sampled the same number of times to obtain curves of the same length. With the fixed settings of a window length of 10,000 and a time-series length of 200, the actual window stride is (27538 − 10000 − 1)/200 ≈ 86 words, which is equivalent to applying a moving average whose window spans about 10000/86 ≈ 116 steps of the series. Compared with the 4-step moving-average window used to generate the variable-length curve in this paper, that window is far too large. Indeed, although the generated fixed-length curve has higher temporal resolution, it is visibly much smoother.
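The following is a minimal sketch of generating a variable-length sentiment curve by sliding a window over the text, scoring each window, and smoothing with a short moving average; the scoring function, the toy text, and the window sizes are illustrative assumptions, not the paper's exact procedure.

```python
# Variable-length sentiment-curve sketch: window the text, score each window,
# then smooth with a short moving average (scorer and window sizes are illustrative).
import numpy as np

def sentiment_score(words):
    """Placeholder scorer: fraction of positive words minus fraction of negative words."""
    positive = {"love", "happy", "warm", "hope"}
    negative = {"war", "death", "lonely", "fear"}
    pos = sum(w in positive for w in words)
    neg = sum(w in negative for w in words)
    return (pos - neg) / max(len(words), 1)

def sentiment_curve(words, window=500, stride=250, smooth=4):
    """Score overlapping windows, then apply a moving average of length `smooth`."""
    raw = [sentiment_score(words[i:i + window])
           for i in range(0, max(len(words) - window, 1), stride)]
    kernel = np.ones(smooth) / smooth
    return np.convolve(raw, kernel, mode="valid")   # length varies with text length

# Toy token sequence standing in for the preprocessed novel.
words = ("war death lonely fear " * 50 + "hope warm love happy " * 50).split()
curve = sentiment_curve(words, window=50, stride=25, smooth=4)
print(len(curve), curve.round(3))
```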

If the generated curve is too smooth, effective information in the original text is lost. Observing the two curves, in the first half the emotion of the text shows a gradual upward trend overall. However, the variable-length curve generated by this paper's method shows the emotional fluctuations more clearly, while the fixed-length method smooths most of them out; scholars believe that this smoothing, in effect, loses information from the original text.

The smoothing phenomenon of the original method is more apparent in the last 20% of the curve. In “Mom’s Stakes”, in order to emphasize the mother’s authoritarianism and stubbornness, the mother ignores her daughter’s opinion and cuts off her daughter’s hair without her permission, and the daughter experiences a lot of negative emotions. In the end, with the daughter’s understanding, the emotions begin to calm down.

The curves generated by this paper’s method reflect this change very well, while the fixed-length curves generated by the previous method only show the situation of falling back from the highest point to the mean value, and do not reflect the negative emotions that are concentrated at the end.

In conclusion, for the emotion curve of the novel “Mom’s Stakes” taken as a sample, the curve generated by this paper's method avoids the over-smoothing produced by the previous method under its configuration, which fails to reflect the characteristics of the text, and better fits the emotional changes of the original novel text.

Conclusion

This study uses deep learning methods to construct a sentiment analysis model for modern Korean novels and proposes a BERT-BGRU sentiment analysis method that incorporates an attention mechanism. The IMDB and SST-2 datasets are selected and BERT-BGRU is evaluated against existing methods. The experimental results show that, compared with the method that only uses the BERT model as a word embedding connected to a downstream linear layer, BERT-BGRU improves the F1 value by 0.849 and 0.960 on the SST-2 and IMDB datasets respectively, obtaining better results overall. Then, taking Park Wan-su’s novel “Mom’s Stakes” as an example, this paper compares a previously proposed method for generating a fixed-length emotion curve with the BERT-BGRU emotion model proposed here, and the experimental results show that the BERT-BGRU model better expresses the emotional-curve characteristics of the novel text.