Open Access

Pattern Recognition and Deep Semantic Network Analysis Techniques for Rhetorical Devices in English Literary Texts

  
19 Mar 2025


Introduction

Over time, English literature has matured and developed a distinctive style and set of techniques. Broadly, its characteristics include the divergence between spoken and written forms, a wide range of emotional expression, and ambiguity in language [1-4]. Fuzzy semantics is also widely applied in English literature and is one of its defining characteristics: it gives readers greater room for imagination and a richer reading experience, enhancing the appeal of literary works [5-8].

Meanwhile, rhetorical devices have been widely used and promoted in English literature, where they play an irreplaceable role and are highly valued by many authors [9-10]. An excellent literary work expresses the author's own thoughts and feelings so that readers can understand and feel them and resonate with the author [11-12]. Generally speaking, symbolism, metaphor, personification, humor and hyperbole are the important rhetorical devices; each has unique artistic characteristics and style and can truly convey the inner world of the characters [13-14]. Authors must therefore strengthen their use of rhetorical devices to enhance the cultural and intellectual depth of their works, which is important for raising the level of English literature, ensuring that it achieves broad public satisfaction, and lifting it to a new level and depth [15-17].

The article first selects three rhetorical devices, namely prose, metaphor and personification, for the construction of a rhetoric recognition model, then adopts a bidirectional multi-angle matching architecture built on a convolutional neural network, and sets up and trains the model. The article further proposes a deep learning sentiment analysis model based on long short-term memory (LSTM) and constructs a sentiment analysis model that integrates semantic knowledge from the perspectives of semantic extraction and deep semantic embedding, so as to achieve effective sentiment classification of English literary texts. Finally, the LSTM-based complex-network text sentiment analysis model proposed in this paper is evaluated experimentally and the results are reported.

Construction of a model for recognizing rhetorical devices
Convolutional Neural Network Modeling

The input layer chosen in this study is a fixed-length two-dimensional matrix. Three convolution kernels of different sizes are then applied simultaneously to extract local features from the input text, and a pooling layer applies max pooling to the three resulting feature maps to re-extract the local features [18-19]. The model's parameter settings are explained in detail below.

Input layer

A convolutional neural network requires the input layer to be a two-dimensional matrix, so the input layer of this experiment is a two-dimensional matrix of the word vectors of the metaphor and personification rhetoric data. To capture the meaning of the text as faithfully as possible during feature extraction, all word vectors are stacked vertically.

Convolutional layer

The convolutional layer is the core feature-extraction layer of a convolutional neural network, characterized by local receptive fields, parameter sharing and multiple convolution kernels. When the input layer is fed with the two-dimensional matrix of vertically stacked word vectors, and x denotes the word vectors corresponding to the input words, the output feature is given by Equation (1):

$$c_i = f(w \cdot x + \mathrm{bias}) \tag{1}$$

where w is the weight matrix, x is the input-layer feature representation, $c_i$ is the output, bias is the bias term and f is the nonlinear activation function.
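To make Equation (1) concrete, the following is a minimal NumPy sketch of a single convolution kernel sliding over a vertically stacked word-vector matrix. The sizes mirror the settings reported later in this section, but the random weights and helper names are purely illustrative assumptions.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

# Illustrative sizes only: 180 words, 128-dimensional word vectors.
sentence = np.random.randn(180, 128)   # vertically stacked word vectors (input layer)
w = np.random.randn(3, 128)            # one 3x128 convolution kernel
bias = 0.1

# c_i = f(w . x_i + bias) for each window x_i of 3 consecutive word vectors
feature_map = np.array([
    relu(np.sum(w * sentence[i:i + 3]) + bias)
    for i in range(sentence.shape[0] - 3 + 1)
])
print(feature_map.shape)               # (178,) local features for this kernel
```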

Pooling layer

The pooling layer compresses the output of the convolutional layer and extracts its maximum features. Two sampling methods are common: max pooling selects the maximum value of each block, while mean pooling selects the average value of each block. In this experiment max pooling is used; the principle of the pooling computation is shown in Fig. 1.

Figure 1.

Schematic diagram of pooling calculation

Fully connected layer

The fully connected layer feeds the 128 local features extracted by the preceding layers into a Softmax classifier, which considers all local features jointly for classification. The Dropout strategy is used during training, and the resulting classification model completes the recognition of rhetorical devices.

Parameters of the model

The dimension of the input layer in this experiment is 180×128, and the three convolution kernels in the convolutional layer are 3×128, 4×128 and 5×128, yielding three different types of feature maps. Max pooling is used, with pooling kernel sizes of 176×1, 177×1 and 178×1, respectively. The fully connected layer feeds the 128 local features into the Softmax classifier to complete the classification of the data into the three rhetorical device classes. The parameters of the convolutional neural network model are shown in Table 1.

Table 1. Convolutional neural network model parameters

Model layer | Size | Algorithm/method
Input layer | 180×128 | Vertical stacking
Convolutional layer | 3×128, 4×128, 5×128 | ReLU activation function
Pooling layer | 176×1, 177×1, 178×1 | Max pooling
Fully connected layer | 128 | ReLU, Dropout strategy
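Putting Table 1 together, the following is a hedged PyTorch sketch of the whole classifier. The framework choice, class names and the number of filters per kernel size are assumptions; the paper specifies only the input size, the kernel heights, max pooling, a Dropout rate of 0.5 and a three-class Softmax output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RhetoricTextCNN(nn.Module):
    """Sketch of the CNN summarized in Table 1: a 180x128 matrix of stacked
    word vectors, convolution kernels of heights 3, 4 and 5 spanning the full
    embedding width, max pooling over each feature map, Dropout(0.5), and a
    Softmax classifier over the three rhetorical classes. The number of
    filters per kernel size is an assumption; the paper only states that 128
    local features reach the fully connected layer."""

    def __init__(self, embed_dim=128, num_filters=42, num_classes=3):
        super().__init__()
        self.convs = nn.ModuleList(
            nn.Conv2d(1, num_filters, kernel_size=(k, embed_dim)) for k in (3, 4, 5)
        )
        self.dropout = nn.Dropout(0.5)
        self.fc = nn.Linear(3 * num_filters, num_classes)

    def forward(self, x):                        # x: (batch, 180, 128)
        x = x.unsqueeze(1)                       # add a channel dimension
        pooled = []
        for conv in self.convs:
            fmap = F.relu(conv(x)).squeeze(-1)   # (batch, filters, 180 - k + 1)
            pooled.append(F.max_pool1d(fmap, fmap.size(-1)).squeeze(-1))
        features = self.dropout(torch.cat(pooled, dim=1))
        return F.log_softmax(self.fc(features), dim=-1)

model = RhetoricTextCNN()
print(model(torch.randn(2, 180, 128)).shape)     # torch.Size([2, 3])
```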
Bidirectional Multi-Angle Matching Model Architecture

In this study, a bidirectional multi-angle matching model is used; its implementation is described in detail below.

The first step is the sentence representation layer. Pre-trained word vectors are used to represent the input sentence S1 as the word-vector sequence $[p_1, p_2, \ldots, p_M]$ and the sentence S2 to be matched in the citation repository as the sequence $[q_1, q_2, \ldots, q_N]$, where M and N are the lengths of S1 and S2, respectively.

The second step is the text representation layer. A bidirectional long short-term memory (BiLSTM) model is used to generate the hidden state sequences of sentences S1 and S2 as $(h_1^p, h_2^p, \ldots, h_M^p)$ and $(h_1^q, h_2^q, \ldots, h_N^q)$, respectively.

Step 3, the matching layer. The matching operation is the core of the model. Its inputs flow in two directions, S1→S2 and S1←S2, which is what "bidirectional" means here. Based on $(h_1^p, \ldots, h_M^p)$ and $(h_1^q, \ldots, h_N^q)$, the information interaction between S1 and S2 is computed, producing the matching vector sequences $(m_1^p, \ldots, m_M^p)$ and $(m_1^q, \ldots, m_N^q)$. Taking $m_i^p$ as an example ($m_i^q$ is obtained in the same way), the multi-perspective matching function is given by Equation (2):

$$m = f_m(v_1, v_2; W) = [m_1, \ldots, m_l] = [\cos(W_1 \circ v_1,\, W_1 \circ v_2), \ldots, \cos(W_l \circ v_1,\, W_l \circ v_2)] \in \mathbb{R}^l \tag{2}$$

where each $W_k$ is one row of the trainable weight matrix $W$, $\circ$ denotes element-wise multiplication, and $l$ is the number of matching perspectives.

First, the full matching approach. It selects the output of S2 at the last time step of the BiLSTM as $v_2$, so $v_2$ carries the full-text semantic information of S2. Since the BiLSTM has forward and backward outputs, two matching features are obtained, as in Equations (3) and (4):

$$\vec{m}_i^{\,full} = f_m(\vec{h}_i^{\,S_1}, \vec{h}_N^{\,S_2}; W) \tag{3}$$

$$\overleftarrow{m}_i^{\,full} = f_m(\overleftarrow{h}_i^{\,S_1}, \overleftarrow{h}_N^{\,S_2}; W) \tag{4}$$

Second, the max-pooling matching interaction. Max-pooling matching takes the output of S2 at every time step as $v_2$, computes $f_m$ for each, and finally selects the maximum value as the output, as in Equation (5):

$$\vec{m}_i^{\,max} = \max_{j \in (1,\ldots,N)} f_m(\vec{h}_i^{\,S_1}, \vec{h}_j^{\,S_2}; W), \qquad \overleftarrow{m}_i^{\,max} = \max_{j \in (1,\ldots,N)} f_m(\overleftarrow{h}_i^{\,S_1}, \overleftarrow{h}_j^{\,S_2}; W) \tag{5}$$

Third, the attention matching interaction. Attention matching first computes the cosine similarity between the output of S2 at each time step and $h_i^{S_1}$, as in Equation (6):

$$\vec{\alpha}_{i,j} = \cos(\vec{h}_i^{\,S_1}, \vec{h}_j^{\,S_2}), \qquad \overleftarrow{\alpha}_{i,j} = \cos(\overleftarrow{h}_i^{\,S_1}, \overleftarrow{h}_j^{\,S_2}) \tag{6}$$

Subsequently, using $\alpha_{i,j}$ as weights, the weighted average of all $h_j^{S_2}$ in S2 is computed as $v_2$ (and likewise in the backward direction), as in Equation (7):

$$h_i^{\,mean} = \frac{\sum_{j=1}^{N} \alpha_{i,j}\, h_j^{\,S_2}}{\sum_{j=1}^{N} \alpha_{i,j}} \tag{7}$$

$f_m$ is then computed as in Equation (8):

$$m_i^{\,att} = f_m(h_i^{\,S_1}, h_i^{\,mean}; W) \tag{8}$$

Fourth, the max-attention matching approach. It is similar to attention matching, but the final step selects the maximum instead of the weighted sum, as in Equation (9):

$$h_i^{\,max} = \max_{j \in (1,\ldots,N)} \alpha_{i,j}\, h_j^{\,S_2}, \qquad m_i^{\,attMax} = f_m(h_i^{\,S_1}, h_i^{\,max}; W) \tag{9}$$
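For concreteness, the sketch below gives a minimal, unvectorized NumPy transcription of the multi-perspective matching function of Equation (2) and of the four matching strategies of Equations (3)-(9) for one direction. The perspective count, the random weights and the reading of Equation (9) as selecting the most-attended hidden state are illustrative assumptions; a real implementation would vectorize these loops.

```python
import numpy as np

def cosine(a, b, eps=1e-8):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def f_m(v1, v2, W):
    # Multi-perspective cosine matching, as in Equation (2):
    # each row W[k] re-weights both vectors element-wise before the cosine.
    return np.array([cosine(W[k] * v1, W[k] * v2) for k in range(W.shape[0])])

def full_matching(h_s1_i, h_s2, W):
    # Equations (3)-(4): match against the last hidden state of S2
    return f_m(h_s1_i, h_s2[-1], W)

def maxpool_matching(h_s1_i, h_s2, W):
    # Equation (5): match against every step of S2, keep the element-wise maximum
    return np.max([f_m(h_s1_i, h_j, W) for h_j in h_s2], axis=0)

def attentive_matching(h_s1_i, h_s2, W):
    # Equations (6)-(8): cosine-weighted mean of S2's hidden states, then f_m
    alphas = np.array([cosine(h_s1_i, h_j) for h_j in h_s2])
    h_mean = (alphas[:, None] * h_s2).sum(axis=0) / (alphas.sum() + 1e-8)
    return f_m(h_s1_i, h_mean, W)

def max_attentive_matching(h_s1_i, h_s2, W):
    # Equation (9), read as: keep the S2 hidden state with the largest weight
    alphas = np.array([cosine(h_s1_i, h_j) for h_j in h_s2])
    return f_m(h_s1_i, h_s2[np.argmax(alphas)], W)

hidden_dim, num_perspectives = 64, 8                 # illustrative sizes
W = np.random.rand(num_perspectives, hidden_dim)
h_s1_i = np.random.randn(hidden_dim)                 # one hidden state of S1
h_s2 = np.random.randn(12, hidden_dim)               # hidden states of S2
print(maxpool_matching(h_s1_i, h_s2, W).shape)       # (8,)
```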

Step 4, the aggregation layer. A long short-term memory network is applied to each of the two matching sequences, and the last time-step vectors of these networks are concatenated to obtain the final matching vector.

The fifth layer is the prediction layer. The matching vectors obtained from the aggregation layer are transformed, through a fully connected layer and a softmax function, into a probability distribution $\Pr(y \mid S_1, S_2)$, and the cross entropy with the gold labels is computed [20-21].

Model Setup and Training

In this study, the model parameters are set and the model is trained to complete the final model for rhetorical device classification and recognition.

Activation function

In order to enhance the expressive power of the neural network, a nonlinear function must be introduced as the activation function. Commonly used activation functions include the Sigmoid, tanh and ReLU functions.

The Sigmoid function is a commonly used nonlinear activation function; its mathematical form is given in Equation (10):

$$f(z) = \frac{1}{1 + e^{-z}} \tag{10}$$

The mathematical form of the tanh function is given in Equation (11):

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{11}$$

Its advantage is that it solves the problem of the Sigmoid's non-zero-mean output. Its disadvantages are that the vanishing-gradient problem and the cost of the power operations remain.

The mathematical form of the ReLU function is given in Equation (12):

$$\mathrm{relu}(x) = \max(0, x) \tag{12}$$

Its advantages are that it avoids vanishing gradients in the positive interval and that it computes and converges faster than the Sigmoid and tanh functions. In this study, the ReLU function is used in the convolutional and fully connected layers.
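The three activation functions translate directly into code; a minimal NumPy comparison (the values in the comments are rounded):

```python
import numpy as np

def sigmoid(z):                      # Equation (10)
    return 1.0 / (1.0 + np.exp(-z))

def tanh(x):                         # Equation (11)
    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))

def relu(x):                         # Equation (12)
    return np.maximum(0.0, x)

z = np.array([-2.0, 0.0, 2.0])
print(sigmoid(z))   # [0.119 0.5   0.881]  outputs are not zero-mean
print(tanh(z))      # [-0.964 0.   0.964]  zero-centred, but still saturates
print(relu(z))      # [0. 0. 2.]           no saturation for positive inputs
```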

Dropout layer setup

The Dropout strategy is introduced in the fully connected layer during training, with the Dropout rate set to 0.5 in this study. Experiments show that the Dropout layer alleviates overfitting by implicitly averaging multiple networks and reducing the complex co-adaptation between neurons.

Loss Function, Accuracy and Recall

The loss function evaluates the discrepancy between the model's predicted values and the true values; in general, the better the loss function, the better the model performs. This model uses the cross-entropy loss function.
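As a brief illustration of the cross-entropy loss on softmax outputs, here is a minimal NumPy sketch; the probabilities and labels are made up, and the three columns simply mirror the three rhetorical classes:

```python
import numpy as np

def cross_entropy(probs, labels, eps=1e-12):
    """Mean cross-entropy between predicted class probabilities and true labels.
    probs: (batch, num_classes) rows summing to 1; labels: integer class ids."""
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + eps)))

probs = np.array([[0.7, 0.2, 0.1],    # predicted distributions over 3 classes
                  [0.1, 0.8, 0.1]])
labels = np.array([0, 1])             # true classes
print(cross_entropy(probs, labels))   # ~0.290
```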

This study uses accuracy, recall and the F1 value as evaluation criteria for the model results.

Accuracy is the ratio of the number of samples correctly classified by the classifier to the total number of samples in a given test dataset, calculated by Equation (13):

$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}$$

Recall is the ratio of the number of relevant documents retrieved to the total number of relevant documents in the document repository, and measures the completeness of retrieval. It is calculated by Equation (14):

$$Recall = \frac{TP}{TP + FN} \tag{14}$$

The F1 value can be viewed as a harmonic mean of the model's accuracy and recall. It is calculated by Equation (15):

$$F1 = \frac{2 \cdot ACC \cdot Recall}{ACC + Recall} \tag{15}$$
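Equations (13)-(15) translate directly into code; a small sketch with an invented confusion matrix:

```python
def evaluate(tp, tn, fp, fn):
    acc = (tp + tn) / (tp + tn + fp + fn)      # Equation (13)
    recall = tp / (tp + fn)                    # Equation (14)
    f1 = 2 * acc * recall / (acc + recall)     # Equation (15), as defined in the paper
    return acc, recall, f1

# Illustrative counts only
print(evaluate(tp=80, tn=90, fp=10, fn=20))    # (0.85, 0.8, 0.824...)
```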

Experimental results and analysis of the model for recognizing rhetorical devices
Comparison of explicit metaphor recognition performance of different algorithms

The experimental results (%) of different models on CSR are compared in Table 2. As the table shows, the proposed model achieves the best precision, 1.95% higher than HGSR, the previous SOTA for explicit metaphor recognition, and is slightly lower than HGSR in F1 value. However, the proposed method uses no complex network structure and adds no additional input or output features, yet it captures the deep semantic information of sentences and thus obtains better recognition performance, which verifies the effectiveness of the algorithm. The results also show that using BERT as the feature extractor for explicit metaphor recognition yields a larger performance gain than obtaining word embeddings with word2vec. This is attributed to BERT's powerful dynamic text representation, which greatly mitigates the problem in Chinese explicit metaphor recognition where, due to polysemy, the semantics of the text cannot be expressed accurately and input sentences cannot be correctly classified as explicit metaphors.

Table 2. Comparison of results (%) of different models on CSR

Model | Precision | Recall | F1-score
SC | 77.61 | 88.77 | 82.87
MTL-SC | 80.77 | 92.09 | 86.30
Self_Attn+Pos | 80.35 | 91.94 | 85.41
Cyc-MTL-SC | 85.91 | 94.93 | 90.03
HGSR | 89.29 | 94.79 | 91.68
Ours | 91.24 | 91.22 | 90.73
Impact of Contrastive Learning Pretraining Data Size on Downstream Tasks

This experiment tests the effect of training the contrastive learning module with unsupervised datasets of different sizes on the downstream task. A size of 0 means no contrastive learning module is added and only the explicit metaphor recognition module is used; contrastive learning pre-training is then carried out with 10,000, 20,000, 50,000, 100,000 and 300,000 data items, respectively. The effect of data size on the downstream task is shown in Figure 2. The figure shows that recognition performance on explicit metaphor sentences improves in all cases after applying the proposed method, which verifies the effectiveness of the contrastive learning approach.

Figure 2.

The impact of data size on downstream tasks

Comparison of Recognition Performance of Models for Different Sentence Lengths

In order to verify the effectiveness of the proposed method in handling long-distance dependencies within sentences, the sentences in the dataset were divided into six groups by character length: 0-10, 10-20, 20-30, 30-40, 40-50, and more than 50 characters, and the recognition performance of the proposed model was compared with HGSR for each length group. The model's recognition performance for different sentence lengths is displayed in Fig. 3. As sentence length increases, the proposed model maintains efficient recognition performance and remains superior to HGSR.

Figure 3.

Recognition performance of the model for different sentence lengths

Comparison of model performance with few samples

This experiment tests the performance of the proposed model in recognizing explicit metaphor sentences under data scarcity, using 20%, 40%, 60%, 80% and 100% of the dataset for training. The comparison of model performance under few samples is shown in Fig. 4. Compared with HGSR, the proposed model retains an F1 value above 70% even when the training data is scarce, and it achieves high recognition performance with only 40% of the training set, indicating that the proposed model still works well when data are scarce.

Figure 4.

Model performance comparison with few samples

Deep Semantic Network Analysis Model for Rhetoric in English Literary Texts
Word combination semantic relations

Semantic relations need to take into account the combination and collocation relationships between words. The characteristics of different text types are considered comprehensively to establish collocation rules, and related words are combined into word groups so that a complete semantic relation is not lost when the words are separated. The proposed rules include both general combination rules and domain-specific combination rules. For example, the meaning of a word that depends on an evaluation object cannot be fully expressed independently of that object, as in "design", "camera function" and "China-US trade". Such pairs satisfy the dependency relation ATT, in which the lexical pattern is noun + noun (gerund): the front noun corresponds to the head word and the back noun (gerund) to the dependent word. The dependency parsing and part-of-speech tagging are shown in Fig. 5, where the structures include the head (HED), subject-verb relation (SBV), adverbial (ADV) and punctuation (WP), and the nouns "battery" and "range" form a combination relation.

Figure 5.

Dependency parsing and part-of-speech tagging
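As a rough code analogue of the noun + noun combination rule described above, the sketch below uses spaCy on an English sentence. The LTP-style ATT relation of the original roughly corresponds to spaCy's "compound" dependency; the model name and example sentence are illustrative assumptions.

```python
import spacy

nlp = spacy.load("en_core_web_sm")   # assumes the small English model is installed

def extract_noun_combinations(text):
    """Collect (front noun, back noun) pairs linked by a compound dependency,
    a rough English stand-in for the ATT-based combination rule."""
    doc = nlp(text)
    combos = []
    for token in doc:
        if token.dep_ == "compound" and token.pos_ in ("NOUN", "PROPN") \
                and token.head.pos_ in ("NOUN", "PROPN"):
            combos.append((token.text, token.head.text))
    return combos

print(extract_noun_combinations("The camera function and battery range impressed reviewers."))
# e.g. [('camera', 'function'), ('battery', 'range')]
```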

Deep Semantic Relationships

In order to improve the semantic understanding of the deep learning model and enhance the accuracy of sentiment analysis, a deep learning model with semantic attention is constructed. Deep semantic cues focus mainly on low-frequency word association relations that are hard to discover and that the deep learning model struggles to recognize. Therefore, the synonym associations of evaluation objects are considered primarily, and the hierarchical structure of the extended version of the synonym thesaurus can be used to obtain the synonym relations between candidate evaluation objects. After obtaining these relations, a synonym association network can be constructed, in which each connected subnetwork corresponds to a cluster of identical evaluation objects. Figure 6 depicts the synonym association network of evaluation objects.

Figure 6.

Synonymy network of evaluation objects
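A minimal sketch of the synonym association network follows, assuming the synonym pairs have already been extracted from the thesaurus (the pairs below are invented placeholders). Each connected component of the graph is read off as one cluster of equivalent evaluation objects.

```python
import networkx as nx

# Invented placeholder pairs; in the paper these would come from the
# hierarchical structure of the extended synonym thesaurus.
synonym_pairs = [
    ("screen", "display"),
    ("display", "monitor"),
    ("battery", "power cell"),
    ("range", "mileage"),
]

graph = nx.Graph()
graph.add_edges_from(synonym_pairs)

# Each connected subnetwork is one cluster of identical evaluation objects.
clusters = [sorted(component) for component in nx.connected_components(graph)]
print(clusters)
# e.g. [['display', 'monitor', 'screen'], ['battery', 'power cell'], ['mileage', 'range']]
```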

LSTM model design

In the LSTM model, deep semantic cues are used to generate a semantic attention mechanism. The word vectors extracted from the topic model are used as the LSTM input data, and these word vectors retain the semantic relations of the topic clustering.

LSTM model structure

The memory block structure of the LSTM is shown in Figure 7. On the basis of neural networks, the deep learning model LSTM adds an input gate, a forget gate and an output gate. During training, the weights of each node can be modified dynamically, flexibly changing the degree of integration at different time points, which to a large extent alleviates the gradient explosion and gradient vanishing problems of ordinary neural networks.

Figure 7.

LSTM storage block structure
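For reference, a conventional formulation of the gate updates sketched in Figure 7 (the paper does not spell these out, so this is the standard LSTM cell rather than the authors' exact notation):

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{(input gate)}\\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{(forget gate)}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{(output gate)}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c)\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t\\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$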

Semantic Embedding into LSTM Models

The following improvements are made on the basis of LSTM:

1) The topic words with high allocation probability after topic clustering are fed, in vector form, as the source data of the LSTM. Compared with feeding the raw text directly, this lets the LSTM associate with the semantic relations contained in the topic clusters and largely improves the quality of the data source as well as the accuracy of polarity classification.

2) Deep semantic cues are introduced into the LSTM and participate in the training of its weight parameters, increasing the LSTM's attention to the important semantic relations between words. The semantic attention is merged in the hidden layer to derive topic representations, and softmax is used in the output layer to predict the sentiment class of different topics.
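A minimal PyTorch sketch of this semantically embedded LSTM follows. The layer sizes, the injection of the semantic cue as an additive bias on the attention scores, and all names are assumptions, since the paper describes the design only at the level of the two points above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SemanticAttentionLSTM(nn.Module):
    """Sketch: topic-model word vectors feed an LSTM, an attention vector
    shaped by deep semantic cues re-weights the hidden states, and a softmax
    layer predicts the sentiment class. All sizes are illustrative."""

    def __init__(self, embed_dim=128, hidden_dim=64, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.attn = nn.Linear(hidden_dim, 1)        # scores each time step
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, topic_word_vectors, semantic_cue=None):
        # topic_word_vectors: (batch, seq_len, embed_dim), from the topic model
        hidden_states, _ = self.lstm(topic_word_vectors)
        scores = self.attn(hidden_states).squeeze(-1)           # (batch, seq_len)
        if semantic_cue is not None:
            # semantic_cue: (batch, seq_len) bias, e.g. boosting synonym-linked words
            scores = scores + semantic_cue
        weights = F.softmax(scores, dim=-1)                     # sums to 1 per sentence
        topic_repr = torch.bmm(weights.unsqueeze(1), hidden_states).squeeze(1)
        return F.log_softmax(self.classifier(topic_repr), dim=-1)

# Example usage with random placeholder inputs
model = SemanticAttentionLSTM()
x = torch.randn(4, 25, 128)          # batch of 4 sentences padded to length 25
print(model(x).shape)                # torch.Size([4, 2])
```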

Deep semantic network analysis of the rhetoric of English literary texts
Experimental and analytical results
Experimental results of metaphorical sentiment analysis

In this paper, the roles of the metaphorical target domain, sentiment polarity and HowNet information in metaphorical sentiment analysis were verified through comparative experiments. The experimental results for the different factors and their combinations are shown in Table 3, where F1(neg.), F1(pos.) and F1(avg.) denote the F1 values for negative sentiment, positive sentiment and their average, respectively. From the table it can be seen that:

1) A single attention mechanism provides limited improvement for metaphorical sentiment analysis. Compared to the LSTM model, the ATT-LSTM model improves only slightly in both Acc and F1 after introducing the attention mechanism, with F1(avg.) increasing by only 0.7%.

2) Metaphorical target-domain information can significantly improve the results of metaphorical sentiment analysis. Owing to the added metaphorical target-domain information, the TRAT-LSTM and MPA_MTC models improve on the ATT-LSTM model by 2.2% and 1.9% in Acc and gain 1.7% and 0.3% in F1(avg.), respectively.

3) The multipolar attention mechanism can significantly improve metaphorical sentiment analysis for strongly polarized sentiment. Relative to the TRAT-LSTM model, the MPA_MTC model adopts a multipolar attention mechanism, which improves F1(neg.) by 0.1%, while F1(pos.) does not improve. The reason is that in the dataset the positive emotions expressed are "good" and "happy", of which data labeled "good" account for 85% of all positive-emotion data; "good" is a weak positive emotion and is expressed indirectly through metaphor, so the improvement from the multipolar attention mechanism is limited. By contrast, the emotional polarity of "evil", "sorrow", "fear" and "anger" contained in the negative emotions is stronger, and the improvement after introducing the multipolar attention mechanism is significant.

4) HowNet sememe information can significantly improve the performance of metaphorical sentiment analysis. After using word embeddings that incorporate HowNet sememes in the MPA_MTC model, the Acc, F1(neg.), F1(pos.) and F1(avg.) metrics improve by 1.9%, 0.6%, 1.8% and 1.6%, respectively. Compared with the TRAT-LSTM model, the proposed model improves F1(avg.) by 1.9%, which demonstrates the effectiveness of the proposed method.

Table 3. Experimental results for different factors and their combinations

Model | Acc | F1(neg.) | F1(pos.) | F1(avg.)
LSTM | 0.745 | 0.739 | 0.775 | 0.749
ATT-LSTM | 0.754 | 0.737 | 0.760 | 0.756
TRAT-LSTM | 0.766 | 0.768 | 0.773 | 0.770
MPA_MTC | 0.776 | 0.778 | 0.776 | 0.773
Ours | 0.795 | 0.784 | 0.794 | 0.789
Attention weight analysis

In order to compare intuitively the effect of the multipolar attention mechanism and the single attention mechanism in metaphorical sentiment analysis, this section compares the attention weights obtained by the MPA_MTC and ATT-LSTM models during training; the comparative analysis is shown in Table 4. To present the differences visually, this paper uses specific examples and reflects the magnitude of the attention weights through colour shading: the darker the colour, the larger the weight. The "attention weight visualization" column contains three different visualizations, (a), (b) and (c), where (a) represents the attention weights of the ATT-LSTM model, and (b) and (c) represent the attention weight visualizations of the proposed method for the positive and negative sentiment polarity vectors, respectively. The results in the table show that, across the four examples, the visualizations in (a) indicate that the attention mechanism used by the ATT-LSTM model has difficulty identifying the different importance of the words in a sentence, which in turn illustrates the hidden nature of metaphorical sentiment. Compared with the ATT-LSTM model, the proposed model is able to focus on different aspects of the sentence under different sentiment polarity vectors, which reflects the effectiveness of the multipolar attention idea in metaphorical sentiment analysis.

Table 4. Comparison of MPA_MTC and ATT-LSTM attention weights

No. | Target word | Model | Attention weight visualization | Predicted | Actual
(1) | Gong yuan | ATT-LSTM | Gong yuan li de hua jing xiang kai fang. | Positive | Positive
(1) | Gong yuan | Ours | Gong yuan li de hua jing xiang kai fang. | Positive |
(2) | Gong si | ATT-LSTM | Gong si jue ding yin ru yi tao guan li xi tong. | Negative | Negative
(2) | Gong si | Ours | Gong si jue ding yin ru yi tao guan li xi tong. | Negative |
(3) | Tian qi | ATT-LSTM | Jin tian tian qi zhen hao. | Positive | Negative
(3) | Tian qi | Ours | Jin tian tian qi zhen hao. | Negative |
(4) | Wo | ATT-LSTM | Wo kan le yi bu dian ying. | Positive | Negative
(4) | Wo | Ours | Wo kan le yi bu dian ying. | Negative |

Examples of attention weights are shown in Table 5. Because the input sequence length must be standardized during training, the input length for metaphorical sentiment analysis in this paper is 25; the example sentence "The company decided to introduce a management system" ("Gong si jue ding yin ru yi tao guan li xi tong") contains five words, so it is padded with "PAD" tokens when fed to the model. In the table, row (a) shows that the attention weights of the words differ, but the differences are small and the sum of the word weights is far less than 1; that is, a single attention mechanism cannot effectively focus on the key information, and the words have similar importance to the padding symbol "PAD". In rows (b) and (c), the attention weights of the words differ substantially and their sum is close to 1. Therefore, the proposed method can effectively identify the differences between the words in a sentence.

Table 5. Examples of attention weight values

No. | Gong si | Jue ding | Yin ru | Yi tao | Guan li xi tong
(a) | 0.067 | 0.063 | 0.053 | 0.031 | 0.045
(b) | 0.361 | 0.049 | 0.495 | 0.035 | 0.052
(c) | 0.520 | 0.067 | 0.297 | 0.043 | 0.055
Rhetorical Relationship Vector Distribution

In order to analyse how the element-value distributions of the rhetorical relation vectors change during training and how they ultimately affect the model, this paper evaluates and analyses their value distributions. The vector distributions corresponding to the six rhetorical relations are shown in Fig. 8, which compares $U_{rel}[r,:]$ for the six different rhetorical relations r. The plots show the value distributions of the six coarse-grained rhetorical relation vectors obtained after 85 rounds of training the RST-Stack-LSTM model without data augmentation. The value distribution of the vector for each rhetorical relation is different, which effectively demonstrates the necessity of incorporating rhetorical relations into a sentiment analysis system. For example, the value distributions of the vectors corresponding to the ATTRIBUTION and ELABORATION relations are clearly different.

Figure 8.

The vector distribution of six rhetorical relationships

Conclusion

In this study, we designed a rhetorical device recognition model and a deep semantic analysis model for English literary texts and compared them experimentally. The conclusions include the following: in the comparison of model performance with few samples, the proposed rhetorical device recognition model still achieves an F1 value above 70% when the amount of training data is small, and it reaches high recognition performance with only 40% of the training set, which shows that the model constructed in this paper still performs well when text data is scarce.

By analysing the attention weights of the example sentence "The company decided to introduce a management system", it is found that a single attention mechanism yields only small differences between word weights, with the sum of the word weights far less than 1, whereas the proposed method produces clearly differentiated weights whose sum is close to 1. Therefore, the proposed method can effectively identify the differences between the words in a sentence.

In summary, this study has practical application value and significance for the recognition of rhetorical devices and the sentiment analysis of English literary texts.

Funding:

This paper is a phased research result funded by the 2024 Hunan University of Science and Technology Key Research Project on Teaching Reform (Project Number: XKYJ2024009) and the 2023 Hunan University of Science and Technology Degree and Graduate Education Reform Research Project (Project Number: XKYJGYB2304), and is also funded by the Hunan University of Science and Technology Applied Characteristic Discipline Construction Project.
