Pattern Recognition and Deep Semantic Network Analysis Techniques for Rhetorical Devices in English Literary Texts
Published online: 19 March 2025
Received: 16 Oct. 2024
Accepted: 09 Feb. 2025
DOI: https://doi.org/10.2478/amns-2025-0540
© 2025 Jianfu Tang, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Over time, English literature has gradually matured and developed distinctive styles and techniques. In summary, its main characteristics include the divergence between spoken and written forms, a broad range of emotional expression, and ambiguity in language [1-4]. Fuzzy semantics, one of these characteristic features, is widely applied in English literature; it gives readers greater room for imagination and a richer reading experience, enhancing the appeal of literary works [5-8].
Meanwhile, rhetorical devices are widely used in English literature, where they play an irreplaceable role and are highly valued by many authors [9-10]. An excellent literary work expresses the author's thoughts and feelings so that readers can understand and feel them and resonate with the author [11-12]. Generally speaking, symbolism, metaphor, personification, humor, and hyperbole are important rhetorical devices; each has its own artistic characteristics and style and can truly convey the inner world of the characters [13-14]. Authors should therefore strengthen their use of rhetorical devices to enhance the cultural and intellectual depth of their works, which helps raise the level of English literature, win broad public appreciation, and lift English literature to a new level of depth [15-17].
The article first selects three rhetorical devices, namely parallelism, metaphor, and personification, for the construction of a rhetorical-device recognition model, building a bidirectional multi-angle matching model on top of a convolutional neural network and then configuring and training the model. The article further proposes a deep learning sentiment analysis model based on long short-term memory (LSTM) that integrates semantic knowledge from the perspectives of semantic extraction and deep semantic embedding, so as to achieve effective sentiment classification of English literary texts. Finally, the proposed LSTM-based complex-network text sentiment analysis model is compared against baselines in experiments and the results are presented.
The input layer in this study is a fixed-length two-dimensional matrix. Convolution is then performed simultaneously with three convolution kernels of different sizes to extract local features of the input text, after which a pooling layer applies max pooling to the three resulting feature maps to further condense the local features [18-19]. The model's parameter settings are explained in detail below.
The convolutional neural network model requires the input layer to be a two-dimensional matrix, so the input layer in this experiment is a two-dimensional matrix formed from the word vectors of the metaphor and personification rhetorical data. To capture the meaning of the text as faithfully as possible during feature extraction, all word vectors are stacked vertically.
The convolutional layer is the core feature-extraction layer of a convolutional neural network, characterized by local receptive fields, parameter sharing, and multiple convolution kernels. When the vertically stacked two-dimensional matrix of text word vectors is fed into the input layer, the output feature is given by Equation (1):

$$c_i = f\left(W \cdot x_{i:i+h-1} + b\right) \tag{1}$$

where $x_{i:i+h-1}$ denotes the window of $h$ consecutive word vectors starting at position $i$, $W$ is the convolution kernel, $b$ is the bias term, and $f$ is the activation function.
The pooling layer compresses the output of the convolutional layer and extracts the dominant features. The common pooling methods are max pooling, which selects the maximum value of each block, and mean pooling, which selects the average value of each block. In this experiment, max pooling is used; the principle of the pooling calculation is shown in Fig. 1.

Schematic diagram of pooling calculation
The fully connected layer feeds the 128 local features extracted by the preceding layers into the Softmax classifier, which considers and classifies all local features jointly; the Dropout strategy is used during training, and the resulting classification model completes the recognition of rhetorical devices.
The dimension of the input layer in this experiment is 180*128, and the three convolution kernels in the convolutional layer are 3*128, 4*128, and 5*128, yielding three different types of feature maps. Max pooling is used, with pooling kernel sizes of 176*1, 177*1, and 178*1, respectively. The fully connected layer feeds the 128 local features extracted by the preceding layers into the Softmax classifier to classify the parallelism, personification, and metaphor data. The parameters of the convolutional neural network model are shown in Table 1.
Convolution neural network model parameters
Model layer name | Size | Algorithm/method |
---|---|---|
Input layer | 180*128 | Vertical stacking |
Convolutional layer | 3*128, 4*128, 5*128 | ReLU activation function |
Pooling layer | 176*1, 177*1, 178*1 | Max pooling |
Fully connected layer | 128 | ReLU + Dropout strategy |
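For concreteness, the following PyTorch sketch reproduces the architecture summarized in Table 1 (180*128 input, kernels of heights 3/4/5, max pooling over each full feature map, ReLU, Dropout, and a Softmax classifier over the three rhetorical classes). The class name, the number of filters per kernel size, and any detail not stated in the paper are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RhetoricTextCNN(nn.Module):
    """Sketch of the text CNN described above: 180x128 input matrix of
    vertically stacked word vectors, convolution kernels of heights 3/4/5,
    max pooling, dropout 0.5, and a classifier over three rhetorical classes."""

    def __init__(self, seq_len=180, embed_dim=128, num_classes=3,
                 kernel_heights=(3, 4, 5), num_filters=128, dropout=0.5):
        super().__init__()
        # One Conv2d per kernel height; each kernel spans the full embedding width.
        self.convs = nn.ModuleList([
            nn.Conv2d(1, num_filters, kernel_size=(h, embed_dim))
            for h in kernel_heights
        ])
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_heights), num_classes)

    def forward(self, x):
        # x: (batch, seq_len, embed_dim) -- the stacked word-vector matrix.
        x = x.unsqueeze(1)                               # (batch, 1, 180, 128)
        feature_maps = [F.relu(conv(x)).squeeze(3)       # (batch, filters, 178/177/176)
                        for conv in self.convs]
        # Max pooling over each whole feature map (pooling kernels 178/177/176 x 1).
        pooled = [F.max_pool1d(fm, fm.size(2)).squeeze(2) for fm in feature_maps]
        out = torch.cat(pooled, dim=1)                   # concatenated local features
        return self.fc(self.dropout(out))                # logits for softmax/cross-entropy
```

Passing the logits to a cross-entropy loss is equivalent to applying Softmax followed by negative log-likelihood, which matches the training setup described later.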
In this study, a bidirectional multi-angle matching model is used; its implementation process is described in detail below.
The first step is the sentence representation layer. Pre-trained word vectors are used to map the two input sentences S1 and S2 into their corresponding word-vector sequences.
The second step is the text representation layer. A bidirectional long short-term memory (BiLSTM) model is used to generate the hidden-state sequences of sentences S1 and S2.
Step 3 is the matching layer. The matching operation is the core of the model; its inputs cover two directions, S1→S2 and S1←S2, which is what "bidirectional" means in the model's name. Four matching interaction strategies are defined on the hidden-state sequences, as follows.
The first is full matching: the output of S2 at the last time step of the bidirectional LSTM is taken as the matching vector, and each time step of S1 is matched against it.
The second is max-pooling matching: each time step of S1 is matched against the output of every time step of S2, and the maximum matching value is retained.
The third is the attentive matching interaction. Attentive matching first computes the cosine similarity between each hidden state of S1 and each hidden state of S2. These similarities are then used as weights to compute a weighted sum of S2's hidden states, yielding an attention vector for each time step of S1. Finally, each hidden state of S1 is matched against its corresponding attention vector.
The fourth is max-attentive matching. It is similar to attentive matching, except that instead of a weighted sum, the hidden state of S2 with the highest cosine similarity is selected as the attention vector.
Step 4 is the aggregation layer. A long short-term memory network is applied to each of the two matching sequences, and the last time-step vectors of the LSTMs are concatenated to obtain the final matching vector.
Step 5 is the prediction layer. Here the four matching vectors obtained from the aggregation layer are transformed by a fully connected layer and a softmax function into a probability distribution Pr(y | S1, S2), and the cross entropy with the gold labels is computed [20-21].
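To make the four matching strategies concrete, here is a simplified PyTorch sketch for the S1→S2 direction. It uses plain cosine similarity; the original bilateral multi-perspective matching formulation applies learned per-perspective weights, which are omitted here, and all function names are illustrative.

```python
import torch
import torch.nn.functional as F

def cosine(a, b, eps=1e-8):
    """Cosine similarity along the last dimension."""
    return F.cosine_similarity(a, b, dim=-1, eps=eps)

def match_s1_to_s2(h1, h2):
    """Sketch of the four matching strategies (direction S1 -> S2).

    h1: (len1, d) hidden states of S1 from the BiLSTM
    h2: (len2, d) hidden states of S2 from the BiLSTM
    Returns one matching value per S1 time step and strategy.
    """
    # 1) Full matching: every h1[i] against the last hidden state of S2.
    full = cosine(h1, h2[-1].expand_as(h1))                      # (len1,)

    # Pairwise cosine similarities between all time steps.
    sim = cosine(h1.unsqueeze(1), h2.unsqueeze(0))               # (len1, len2)

    # 2) Max-pooling matching: keep the best-matching time step of S2.
    maxpool, _ = sim.max(dim=1)                                  # (len1,)

    # 3) Attentive matching: similarity-weighted mean of S2 as attention vector.
    attn = torch.softmax(sim, dim=1) @ h2                        # (len1, d)
    attentive = cosine(h1, attn)                                 # (len1,)

    # 4) Max-attentive matching: the single most similar S2 state, no summation.
    best = h2[sim.argmax(dim=1)]                                 # (len1, d)
    max_attentive = cosine(h1, best)                             # (len1,)

    return torch.stack([full, maxpool, attentive, max_attentive], dim=1)
```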
In this study, the model parameters are set and the model is trained to obtain the final model for rhetorical device classification and recognition.
In order to enhance the expressive ability of the neural network, a nonlinear activation function must be introduced. Commonly used activation functions include the Sigmoid, tanh, and ReLU functions.
The Sigmoid function is a commonly used nonlinear activation function; its mathematical form is given by Equation (10):

$$\sigma(x) = \frac{1}{1 + e^{-x}} \tag{10}$$
The mathematical form of the tanh function is given by Equation (11):

$$\tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{11}$$
Its advantage is that it resolves the non-zero-centered output of the Sigmoid function. Its disadvantages are that the vanishing-gradient problem and the cost of the exponential operations remain.
The mathematical form of the ReLU function is given by Equation (12):

$$\mathrm{ReLU}(x) = \max(0, x) \tag{12}$$
Its advantages are that it avoids the vanishing-gradient problem in the positive interval and that it computes and converges faster than the Sigmoid and tanh functions. In this study, the ReLU function is used in the convolutional and fully connected layers.
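The three activation functions compared above can be written directly; a minimal NumPy sketch:

```python
import numpy as np

def sigmoid(x):
    # Equation (10): squashes to (0, 1); saturates for large |x| (vanishing gradients).
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Equation (11): zero-centered output, but still saturates.
    return np.tanh(x)

def relu(x):
    # Equation (12): identity for x > 0, zero otherwise; no saturation for x > 0.
    return np.maximum(0.0, x)
```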
The Dropout strategy is introduced in the fully connected layer for training, with the Dropout value set to 0.5 in this study. Experiments show that the Dropout layer mitigates network overfitting by implicitly averaging over multiple sub-networks and reducing complex co-adaptation between neurons.
The loss function evaluates the discrepancy between the model's predicted values and the true values; generally, the better suited the loss function, the better the model performs. This model uses the cross-entropy loss function.
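For reference, the standard multi-class cross-entropy loss (the paper does not reproduce its exact notation) is

$$L = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{i,c}\,\log \hat{y}_{i,c},$$

where $y_{i,c}$ is the one-hot gold label and $\hat{y}_{i,c}$ is the predicted probability for class $c$ of sample $i$.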
This study selects the accuracy rate, recall rate, and F1 value as the evaluation criteria for the model results.
The accuracy rate is, for a given test dataset, the ratio of the number of samples correctly classified by the classifier to the total number of samples; it is calculated by Equation (13):

$$P = \frac{TP + TN}{TP + TN + FP + FN} \tag{13}$$

where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives, respectively.
Recall is the ratio of the number of relevant documents retrieved to the total number of relevant documents in the repository, and measures the completeness of retrieval. It is calculated by Equation (14):

$$R = \frac{TP}{TP + FN} \tag{14}$$
The F1 value can be viewed as the harmonic mean of the model's accuracy and recall. It is calculated by Equation (15):

$$F1 = \frac{2 \times P \times R}{P + R} \tag{15}$$
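A small Python sketch of Equations (13)-(15), assuming binary labels for illustration (the paper's own evaluation script is not available):

```python
def evaluation_metrics(y_true, y_pred, positive=1):
    """Illustrative computation of the metrics in Equations (13)-(15)."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    accuracy = correct / len(y_true)                                     # Eq. (13)
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p != positive and t == positive)
    recall = tp / (tp + fn) if tp + fn else 0.0                          # Eq. (14)
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)                                # Eq. (15)
    return accuracy, precision, recall, f1
```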
The comparison of experimental results (%) of different models on CSR is shown in Table 2. As can be seen from the table, the model proposed in this paper achieves the best precision, 1.95% higher than HGSR, the previous SOTA for the explicit metaphor recognition task, while its F1 value is slightly lower than HGSR's. However, the proposed method uses neither a complex network structure nor additional input/output features, and it captures the semantic information of sentences in depth, thus achieving strong recognition performance and verifying the effectiveness of the algorithm. Meanwhile, the results show that using BERT as the feature extractor for explicit metaphor recognition brings a larger performance improvement than obtaining word embeddings with word2vec. This is attributed to BERT's powerful dynamic text representation ability, which greatly mitigates the problem that, in the Chinese explicit metaphor recognition task, polysemy prevents the text semantics from being expressed accurately and the input sentences from being correctly classified as explicit metaphors.
Comparison of the results (%) of different models on CSR
Model | Precision | Recall | F1-score |
---|---|---|---|
SC | 77.61 | 88.77 | 82.87 |
MTL-SC | 80.77 | 92.09 | 86.3 |
Self_Attn+Pos | 80.35 | 91.94 | 85.41 |
Cyc-MTL-SC | 85.91 | 94.93 | 90.03 |
HGSR | 89.29 | 94.79 | 91.68 |
Ours | 91.24 | 91.22 | 90.73 |
This experiment tests the effect of training the contrastive learning module with unsupervised datasets of different sizes on the downstream task. A size of 0 means no contrastive learning module is added and only the explicit metaphor recognition module is used; contrastive learning pre-training is then carried out with 10,000, 20,000, 50,000, 100,000, and 300,000 samples, respectively. The effect of data size on the downstream task is shown in Figure 2. As the figure shows, recognition performance on explicit metaphor sentences improves in every setting after applying the proposed method, which verifies the effectiveness of the contrastive learning approach.

The impact of data size on downstream tasks
In order to verify the effectiveness of the proposed method in handling long-distance dependencies within sentences, the sentences in the dataset are divided into six groups by character length, namely 0-10, 10-20, 20-30, 30-40, 40-50, and more than 50 characters, and the recognition performance of the proposed model is compared with HGSR for each group. The model's recognition performance for different sentence lengths is displayed in Fig. 3. As the figure shows, the proposed model maintains efficient recognition performance as sentence length increases and outperforms HGSR.

Recognition performance of the model for different sentence lengths
This experiment tests the performance of the proposed model in recognizing explicit metaphor sentences under data scarcity, training on 20%, 40%, 60%, 80%, and 100% of the dataset. The comparison of model performance with few samples is shown in Fig. 4. As can be seen from the figure, compared with HGSR, the proposed model still achieves an F1 value above 70% when the training data volume is small, and reaches high recognition performance with only 40% of the training data, indicating that the proposed model still works well when data are scarce.

Model performance comparison with few samples
Semantic relationships need to take into account the combination and collocation of words. The characteristics of different text types are considered comprehensively to establish collocation rules, and related words are combined into word groups so that a complete semantic relationship is not lost when the words are separated. The proposed rules include both general combination rules and domain-specific combination rules. For example, the meaning of a word that depends on an evaluation object cannot be fully expressed independently of that object, as in "design", "camera function", and "China-US trade"; such pairs satisfy the dependency relation ATT, with the lexical pattern noun + noun (gerund), where the front noun corresponds to the head word and the back noun (gerund) corresponds to the dependent word. The dependency parsing and part-of-speech tagging are shown in Fig. 5, in which the structures include the head (HED), subject-verb (SBV), adverbial (ADV), and punctuation (WP) relations, and the nouns "battery" and "range" form a combination relation.

Dependency parsing and part-of-speech tagging
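The noun + noun (gerund) ATT combination rule described above can be sketched as a simple filter over dependency-parse output. The tuple format and POS tag set below are illustrative assumptions; in practice the parse would come from a dependency parser.

```python
# Minimal sketch of the noun + noun (ATT) combination rule described above.
# Each token is assumed to be (word, pos, head_index, relation), where
# head_index is 1-based and 0 marks the root; this format is illustrative.

def extract_combinations(tokens):
    """Merge a noun modifier and its noun/gerund head linked by an ATT relation,
    e.g. 'camera' + 'function' -> 'camera function'."""
    nouns = {"n", "v-n"}  # noun and gerund POS tags (illustrative tag set)
    combined = []
    for word, pos, head, rel in tokens:
        if rel == "ATT" and pos in nouns and head != 0:
            head_word, head_pos, _, _ = tokens[head - 1]
            if head_pos in nouns:
                combined.append(word + " " + head_word)  # modifier + head noun
    return combined
```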
In order to improve the semantic understanding ability of the deep learning model and enhance the accuracy of sentiment analysis, a deep learning model with semantic attention is constructed. Deep semantic cues focus mainly on low-frequency word association relationships that are hard to discover and difficult for the deep learning model to recognize on its own. Therefore, the synonym associations of evaluation objects are considered in particular, and the hierarchical structure of the Extended Version of the Synonym Thesaurus can be used to obtain synonym relationships between candidate evaluation objects. After obtaining these synonym relationships, a synonym association network can be constructed, in which each connected sub-network corresponds to one cluster of equivalent evaluation objects. Figure 6 depicts the synonym association network of evaluation objects.

Synonymy network of evaluation objects
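Building the synonym association network and reading off its connected sub-networks as clusters can be sketched with networkx; the synonym pairs are assumed to have been extracted from the thesaurus beforehand, and all names below are illustrative.

```python
import networkx as nx

def cluster_evaluation_objects(candidates, synonym_pairs):
    """Sketch of the synonym association network: candidate evaluation objects
    are nodes, thesaurus-derived synonym pairs are edges, and each connected
    component is one cluster of equivalent evaluation objects."""
    g = nx.Graph()
    g.add_nodes_from(candidates)
    g.add_edges_from((a, b) for a, b in synonym_pairs
                     if a in g and b in g)           # keep only known candidates
    return [set(component) for component in nx.connected_components(g)]
```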
In the LSTM model, deep semantic cues are used to generate a semantic attention mechanism; the word vectors extracted from the topic model serve as the LSTM input data, and these word vectors retain the semantic relationships from topic clustering.
The memory block structure of the LSTM is shown in Figure 7. On top of a basic neural network, the LSTM adds an input gate, a forget gate, and an output gate. During training, the weights of each node can be modified dynamically and the degree of integration can be adjusted flexibly at different time steps, which to a large extent alleviates the gradient explosion and gradient vanishing problems of ordinary neural networks.

LSTM memory block structure
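For reference, the input, forget, and output gates mentioned above follow the standard LSTM formulation (the paper's own equations are not reproduced here):

$$
\begin{aligned}
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned}
$$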
The following improvements are made on the basis of LSTM:
1) The topic words with high allocation probabilities after topic clustering are fed, in vector form, as the source data of the LSTM. Compared with feeding in raw text data directly, this allows the LSTM to associate the semantic relationships contained in the topic clusters, improving the quality of the data source and, to a large extent, the accuracy of polarity classification. 2) Deep semantic cues are introduced into the LSTM and participate in training its weight parameters, increasing the LSTM's attention to the important semantic relationships between words; the semantic attention is merged in the hidden layer to derive topic representations, and softmax is used in the output layer to predict the sentiment classification of different topics. A sketch of this structure is given below.
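This is a minimal PyTorch sketch of the two improvements, assuming a single semantic cue vector per sentence; the dimensions, names, and the exact way the cue is fused with the hidden states are illustrative assumptions rather than the paper's implementation.

```python
import torch
import torch.nn as nn

class SemanticAttentionLSTM(nn.Module):
    """Sketch: topic-clustered word vectors as input, a semantic-cue attention
    over the LSTM hidden states, and a softmax output layer."""

    def __init__(self, embed_dim=128, hidden_dim=128, num_classes=2):
        super().__init__()
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # The semantic cue (e.g. derived from synonym/topic relations) is
        # projected into the hidden space and used to score each time step.
        self.cue_proj = nn.Linear(embed_dim, hidden_dim)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, topic_word_vectors, semantic_cue):
        # topic_word_vectors: (batch, seq_len, embed_dim) from the topic model
        # semantic_cue:       (batch, embed_dim) deep semantic cue vector
        h, _ = self.lstm(topic_word_vectors)                  # (batch, seq, hidden)
        cue = self.cue_proj(semantic_cue).unsqueeze(2)        # (batch, hidden, 1)
        scores = torch.bmm(h, cue).squeeze(2)                 # (batch, seq)
        weights = torch.softmax(scores, dim=1).unsqueeze(1)   # (batch, 1, seq)
        topic_repr = torch.bmm(weights, h).squeeze(1)         # (batch, hidden)
        return self.fc(topic_repr)                            # logits -> softmax
```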
In this paper, the roles of the metaphor target domain, sentiment polarity, and HowNet sememe information in metaphorical sentiment analysis are verified through comparative experiments. The experimental results of the different factors and their combinations are shown in Table 3, where F1(neg.), F1(pos.), and F1(avg.) denote the F1 values for negative sentiment, positive sentiment, and their average, respectively. From the table it can be seen that:
1) A single attention mechanism brings limited improvement to metaphorical sentiment analysis. Compared with the LSTM model, the ATT-LSTM model improves only slightly on both the Acc and F1 metrics after introducing the attention mechanism, with F1(avg.) rising by only 0.7%.
2) Metaphor target-domain information can significantly improve metaphorical sentiment analysis. Owing to the addition of target-domain information, the TRAT-LSTM and MPA_MTC models improve on Acc by 2.2% and 1.9%, and on F1(avg.) by 1.7% and 0.3%, respectively, relative to the ATT-LSTM model.
3) The multi-polarity attention mechanism can significantly improve metaphorical sentiment analysis for strongly polarized sentiment. Relative to the TRAT-LSTM model, the MPA_MTC model, which adopts the multi-polarity attention mechanism, improves F1(neg.) by 0.1%, while F1(pos.) does not improve. The reason is that in the dataset the positive emotions expressed are "good" and "happy", with data labeled "good" accounting for 85% of all positive-sentiment data; "good" is a weak positive emotion expressed indirectly through metaphor, so the improvement from the multi-polarity attention mechanism is limited. In contrast, the negative emotions "disgust", "sorrow", "fear", and "anger" have stronger polarity, and the improvement after introducing the multi-polarity attention mechanism is significant.
4) HowNet sememe information can significantly improve the performance of metaphorical sentiment analysis. After using word embeddings that incorporate HowNet sememes in the MPA_MTC model, the Acc, F1(neg.), F1(pos.), and F1(avg.) metrics improve by 1.9%, 0.6%, 1.8%, and 1.6%, respectively. Compared with the TRAT-LSTM model, the model in this paper improves by 1.9% on the F1(avg.) metric, which proves the effectiveness of the proposed method.
Experimental results of different factors and their combinations
model | Acc | F1 (neg.) | F1 (pos.) | F1 (avg.) |
---|---|---|---|---|
LSTM | 0.745 | 0.739 | 0.775 | 0.749 |
ATT-LSTM | 0.754 | 0.737 | 0.76 | 0.756 |
TRAT-LSTM | 0.766 | 0.768 | 0.773 | 0.77 |
MPA_MTC | 0.776 | 0.778 | 0.776 | 0.773 |
Ours | 0.795 | 0.784 | 0.794 | 0.789 |
In order to compare intuitively the effect of the multi-polarity attention mechanism and the single attention mechanism in metaphorical sentiment analysis, this section compares the attention weights obtained by the MPA_MTC and ATT-LSTM models during training; the comparison is shown in Table 4. To present the differences between the attention weights visually, specific examples are used, with the magnitude of each weight reflected by colour shading: the darker the colour, the larger the attention weight. The "attention weight visualization" column contains three different visualization results, (a), (b), and (c), where (a) is the attention-weight result of the ATT-LSTM model, and (b) and (c) are the attention-weight visualizations of the proposed method for the positive and negative sentiment polarity vectors, respectively. As the results in the table show, across the four examples the weight visualizations in (a) indicate that the attention mechanism used by the ATT-LSTM model has difficulty distinguishing the importance of different words in a sentence, which from another angle demonstrates the hidden nature of metaphorical sentiment. Compared with the ATT-LSTM model, the proposed model is able to focus on different aspects of the sentence under different sentiment polarity vectors, which reflects the effectiveness of the multi-head attention idea in metaphorical sentiment analysis.
Comparison of MPA_MTC and ATT-LSTM attention weights
No. | Target word | Model | Attention weight visualization | Predicted | Actual |
---|---|---|---|---|---|
(1) | Gong yuan (park) | ATT-LSTM | Gong yuan li de hua jing xiang kai fang. (The flowers in the park are blooming in rivalry.) | Positive | Positive |
 | | Ours | Gong yuan li de hua jing xiang kai fang. | Positive | |
(2) | Gong si (company) | ATT-LSTM | Gong si jue ding yin ru yi tao guan li xi tong. (The company decided to introduce a management system.) | Negative | Negative |
 | | Ours | Gong si jue ding yin ru yi tao guan li xi tong. | Negative | |
(3) | Tian qi (weather) | ATT-LSTM | Jin tian tian qi zhen hao. (The weather is really nice today.) | Positive | Negative |
 | | Ours | Jin tian tian qi zhen hao. | Negative | |
(4) | Wo (I) | ATT-LSTM | Wo kan le yi bu dian ying. (I watched a movie.) | Positive | Negative |
 | | Ours | Wo kan le yi bu dian ying. | Negative | |
Examples of attention weights are shown in Table 5. Because the length of the input sequence must be standardized during training, the input sequence length for metaphorical sentiment analysis in this paper is 25; the example sentence "The company decided to introduce a management system" contains five words, so it is padded with "PAD" tokens when fed into the model. In the table, in (a) the attention weights of the words differ, but only slightly, and their sum is far less than 1; that is, a single attention mechanism alone cannot effectively focus on the key information, and the words and the padding symbol "PAD" have similar importance. In (b) and (c), the attention weights of the words differ considerably and their sum is close to 1. It can therefore be concluded that the proposed method can effectively identify the differences between different words in a sentence.
Example of attention weight values
No. | Gong si (company) | Jue ding (decided) | Yin ru (introduce) | Yi tao (a set of) | Guan li xi tong (management system) |
---|---|---|---|---|---|
(a) | 0.067 | 0.063 | 0.053 | 0.031 | 0.045 |
(b) | 0.361 | 0.049 | 0.495 | 0.035 | 0.052 |
(c) | 0.52 | 0.067 | 0.297 | 0.043 | 0.055 |
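Summing the word weights in Table 5 confirms the observation above: for (a), 0.067 + 0.063 + 0.053 + 0.031 + 0.045 = 0.259, far less than 1 (the remaining weight falls on the "PAD" positions), whereas for (b) and (c) the sums are 0.992 and 0.982, respectively, close to 1.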
In order to analyse how the value distributions of the rhetorical-relation vectors change during model training and how they ultimately affect the model, their value distributions are evaluated in this paper. The vector distributions corresponding to the six rhetorical relations are shown in Fig. 8, which compares Urel[r,:] for the six different rhetorical relations r. The plots show the distributions of values in the six coarse-grained rhetorical-relation vectors obtained by training the RST-Stack-LSTM model, without data augmentation techniques, for 85 epochs. The values of the vector corresponding to each rhetorical relation show a distinct distribution, which effectively demonstrates the necessity of incorporating rhetorical relations into a sentiment analysis system. For example, the value distributions of the vectors corresponding to the ATTRIBUTION and ELABORATION relations differ significantly.

The vector distribution of six rhetorical relationships
In this study, we designed a rhetorical-device recognition model and a deep semantic analysis model for English literary texts and evaluated them in comparative experiments. The conclusions of the article include the following: in the few-sample model performance comparison, the proposed rhetorical-device recognition model still achieves an F1 value above 70% when the training data volume is small, and reaches a high recognition rate with only 40% of the training set, showing that the rhetorical-device recognition model constructed in this paper still performs well when text data are scarce.
Analysing the attention weights for the example sentence "The company decided to introduce a management system" shows that, with a single attention mechanism, the differences between the words' weights are small and their sum is far less than 1, whereas with the method of this paper the weights differ markedly between words and sum to nearly 1. Therefore, the proposed method can effectively capture the differences between different words in a sentence.
In summary, this study has certain practical application value and significance for the identification of rhetorical devices and textual sentiment analysis of English literary texts.
This paper is a phased research result funded by the 2024 Hunan University of Science and Technology Key Research Project on Teaching Reform (Project Number: XKYJ2024009) and the 2023 Hunan University of Science and Technology Degree and Graduate Education Reform Research Project (Project Number: XKYJGYB2304), and is also funded by the Hunan University of Science and Technology Applied Characteristic Discipline Construction Project.