A Study of Imagery Translation Strategies in English and American Poetry Aided by Natural Language Processing Technology
Published online: 19 March 2025
Received: 27 October 2024
Accepted: 2 February 2025
DOI: https://doi.org/10.2478/amns-2025-0479
Keywords
© 2025 Yanyan Lei, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
China's translation of English and American poetry began in the middle of the 19th century, and in the two hundred years that followed, English and American poetry poured in like a tide. In the process of translation and dissemination, the old and the new cultural systems collided and rubbed against each other, mutually restraining, weighing, and impacting one another, and shaping the literary habits, values, and social trends of the time [1–2]. This was the case in the late Qing Dynasty and the early Republican period as well as in the new period, and the related translations, commentaries, and studies have been abundant. In contrast, the translation of English and American poetry in the seventeen years from 1949 to 1966, after the founding of the People's Republic of China, received a notably cold reception. The reason is that the ideological constraints of this period were strong and the political imprint sensitive and deep, underscoring the great specificity of the era [3–4]. The translation of British and American literature gave way to Russian and Soviet literature; the works of some important writers were rarely translated or avoided altogether, while some second-tier British and American writers, by contrast, attracted considerable attention and were emphasized in translation [5–6].
The historical and social context of the construction of New China's culture determined the unique nature and characteristics of the translation of English and American poetry during these seventeen years. This body of translation contains a wealth of phenomena, from the selection of the original poems to the strategies adopted by the translators. Viewed together with the specific historical and cultural context of the time and with the literary rules then vigorously promoted, such as "political standards first, artistic standards second", the seventeen years of poetry translation constitute a relatively singular and functional picture of poetry translation that eliminates differences [7–8].
With the trend toward world integration, communication between different languages, cultures, and international academic circles is becoming ever closer [9]. In this process, people inevitably come into contact with a large amount of information in non-native languages. When information seekers face such material, their limited command of non-native languages and the intricacy of online information cause great inconvenience: reading through the huge volume of pages would consume a great deal of time and energy, so it is difficult for them to obtain the useful information they need quickly [10–12].
The booming development of artificial intelligence has made natural language processing a promising answer to such problems. The field of Natural Language Processing (NLP) originated about 50 years ago with machine translation systems, and the technology is used for the automatic processing, analysis, and presentation of human natural language [13–14]. The field now encompasses various linguistic theories, cognitive models, and engineering approaches, and millions of web pages can be processed in less than one second [15–17]. Among the various natural language processing techniques, information extraction is an important branch: based on linguistic features, it extracts the information a user needs or a specified type of information (e.g., time, place, people and events, attribute relations, purposes, and conclusions) from unstructured or semi-structured text such as web pages, new media, forums, news, and academic literature, and converts the unstructured text into structured information through integration, merging and splicing, redundancy removal, noise processing, and other techniques [18].
In this paper, based on natural language processing technology, we design a strategy for translating English and American poetic imagery, with a machine translation model incorporating chapter context validity recognition at its core, quantitatively analyze English and American modern poetry, and test the translation performance of the model. The model extends the original Transformer: it uses the same encoder to encode the source sentence and its contextual sentences and adds a multi-head contextual attention layer, a residual network with a gating mechanism, and a loss function based on joint learning. Finally, the performance of the model is verified using the BLEU metric, English and American poetry is quantitatively analyzed, and a deep semantic statistical analysis of its imagery is performed.
In the process of building a corpus of English and American poetry imagery, the first step is to collect and preprocess the corpus of English and American poetry imagery.
The collection, organization, editing, and proofreading of the corpus should be guided by the intended application and should ensure the professionalism and accuracy of the material. Although a corpus needs a large amount of material to support it, quantity is not the only criterion; the material should be accurate, and redundant or unnecessary texts should be avoided. The imagery corpus of English and American poetry in this paper is mainly obtained from publicly released collections of English and American poetry and from translated versions of English and American poetry in different languages on relevant poetry websites.
The flow chart of the corpus collection in this paper is shown in Figure 1. In constructing the English and American poetry imagery resource base, the corresponding data and information are collected through corpus acquisition and then saved, processed, and managed, so that a database of English and American poetry imagery resources can be maintained in a computer system in accordance with standard database requirements.

Corpus collection process
In this paper, a web crawler is used to obtain English and American poetic imagery information from the Internet. Web crawlers, also called web spiders, automatically obtain the required resources from web pages according to certain acquisition rules. They fall into several types. A batch-type crawler, suited to crawling with a clear target and scope, stops immediately after the set target has been acquired. An incremental crawler, unlike the batch type, continuously acquires web content to ensure that the data obtained are up to date while shortening acquisition time and space. A general-purpose crawler, also known as a whole-web crawler, covers a huge range and data volume, places higher demands on storage space and crawling speed, works in parallel, and takes a long time to update pages. In this paper, manual collection and a web crawler are used together to obtain the required corpus of English and American poetry imagery.
The web crawler acquires an initial URL address, obtains new URL addresses from it, saves the data acquired from web pages in the database, and stops when the crawling goal is met. The crawler in this paper proceeds as follows: it sends a request to the crawler engine to get an initial URL address; receives the crawled URL address and collects web page data; refreshes the URL address, sends it to the acquisition module, and passes the collected data to the storage module for storage; sends the data to the Spider for processing and de-duplication; and repeats these steps until all the required data related to English and American poetry imagery have been acquired. A minimal sketch of such a batch crawler is given below.
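The following is a minimal sketch of the batch-type crawler described above, written in Python with requests and BeautifulSoup. The seed URL, page limit, and politeness delay are illustrative assumptions, not details given in the paper.

```python
# Minimal batch-crawler sketch: fetch a page, save its text, queue new links,
# stop once the target number of pages has been collected.
import time
from collections import deque

import requests
from bs4 import BeautifulSoup

START_URL = "https://example-poetry-site.org/poems"   # hypothetical seed URL
MAX_PAGES = 200                                       # stop condition of the batch crawler

def crawl(start_url: str, max_pages: int) -> dict[str, str]:
    seen, queue, corpus = {start_url}, deque([start_url]), {}
    while queue and len(corpus) < max_pages:
        url = queue.popleft()
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue                                  # skip unreachable pages
        soup = BeautifulSoup(html, "html.parser")
        corpus[url] = soup.get_text(separator="\n", strip=True)
        for link in soup.find_all("a", href=True):    # refresh the URL frontier
            href = link["href"]
            if href.startswith("http") and href not in seen:
                seen.add(href)                        # URL de-duplication
                queue.append(href)
        time.sleep(1)                                 # politeness delay
    return corpus
```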
In order to solve the problem of inconsistent or garbled text formats in the acquired corpus, corpus processing is needed to unify the collected texts into the same data format for analysis and collection. Corpus processing mainly includes denoising, encoding, and categorization. Denoising refers to the elimination of unwanted elements in the text, such as confusing formatting, illegible characters, broken lines, and unrecognizable graphical elements. Encoding refers to the conversion of various texts into a single encoded format. Categorization facilitates the collection and maintenance of the imagery corpus of English and American poetry, and is all the more necessary when constructing a large corpus. The denoised corpus is stored in the database, with each line numbered according to the line standard and sentence numbers added for easy modification.
The preprocessing of the original (English) corpus of English and American poetry imagery mainly involves the removal of web tags and the extraction of plain English text. In the first step, the corpus acquired from the web is checked for web tags and duplicated content. In the second step, the acquired corpus is batch-processed to clear redundant spaces, garbled characters, broken lines, unrecognizable graphics, and so on, and the text layout and formatting are further unified. This completes the preprocessing of the original text corpus and lays the foundation for corpus construction at a later stage; a small denoising sketch follows.
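A small sketch of the denoising step described above: strip leftover web tags, normalize odd code points, drop unprintable characters, and collapse redundant whitespace. The specific noise patterns are assumptions; real corpora may need additional rules.

```python
# Denoising sketch: remove residual web tags, normalise characters,
# drop unprintable symbols and collapse redundant whitespace.
import re
import unicodedata

def clean_poem_text(raw: str) -> str:
    text = re.sub(r"<[^>]+>", " ", raw)                 # remove residual web tags
    text = unicodedata.normalize("NFKC", text)          # normalise odd code points
    text = "".join(ch for ch in text if ch.isprintable() or ch == "\n")
    text = re.sub(r"[ \t]+", " ", text)                 # collapse redundant spaces
    text = re.sub(r"\n{3,}", "\n\n", text)              # merge runs of broken blank lines
    return text.strip()
```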
The preprocessing of a corpus of English and American poetry imagery translations is a key step in building a corpus of English and American poetry imagery. Before that, the first task is to convert the massive textual data of multilingual English-American poetry translations with different encoding inputs into unified Unicode-encoded textual data.
Following national and international character encoding standards, the English and American poetic imagery translations are re-encoded so as to unify the encoding method. This unification mainly consists of identifying the encoding of every text in the translation corpus, converting any non-Unicode encodings with existing encoding conversion software, and finally storing the texts in UTF-8 format; a sketch of this conversion follows.
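A sketch of the encoding-unification step: detect a file's original encoding and re-save it as UTF-8. The paper only states that existing conversion software is used, so the use of chardet here is merely one possible realization.

```python
# Encoding-unification sketch: detect the original encoding and re-save as UTF-8.
from pathlib import Path

import chardet

def to_utf8(path: str) -> None:
    raw = Path(path).read_bytes()
    guess = chardet.detect(raw)["encoding"] or "utf-8"  # fall back to UTF-8
    text = raw.decode(guess, errors="replace")          # decode with the detected encoding
    Path(path).write_text(text, encoding="utf-8")       # store the text in UTF-8 format
```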
Since the corpus of English and American poetry imagery is collected through different channels and exists in many forms, entering these corpora into the corpus requires a great deal of manpower and time to enter and save them in a uniform way. The corpus construction framework of this paper is shown in Figure 2, and the construction proceeds as follows. The original and translated texts of English and American poetry imagery are entered into the designated text box, and the original text is converted into the translated text using text conversion rules. Because current translation platforms can be inaccurate, human intervention is required to manually translate and correct certain words and phrases. The original texts and the translated sentences are then binned and saved into a database, and the collected TXT files are stored in the same database. The database is saved and the words and sentences are queried; when the query succeeds, the next corpus item can be processed. If a corpus error is reported or no corpus is found, the corpus must be proofread again or re-entered into the database. A minimal storage-and-query sketch is given after Figure 2.

Technical roadmap for corpus construction
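A minimal sketch of the bilingual entry-and-query step above: aligned original and translated lines are stored in a database and queried back. SQLite and the table layout here are assumptions made for illustration; the paper does not name a specific database system.

```python
# Store aligned original / translated lines and query them back (illustrative schema).
import sqlite3

con = sqlite3.connect("poetry_imagery.db")
con.execute(
    "CREATE TABLE IF NOT EXISTS corpus ("
    "id INTEGER PRIMARY KEY, source TEXT, target TEXT)"
)

def add_pair(source: str, target: str) -> None:
    con.execute("INSERT INTO corpus (source, target) VALUES (?, ?)", (source, target))
    con.commit()

def query(word: str) -> list[tuple[str, str]]:
    cur = con.execute(
        "SELECT source, target FROM corpus WHERE source LIKE ?", (f"%{word}%",)
    )
    return cur.fetchall()                               # an empty list means "no corpus found"

add_pair("The apparition of these faces in the crowd;", "<translated line>")
print(query("apparition"))
```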
In this paper, a certain size of multilingual word alignment corpus of English and American poetry imagery is constructed by using web crawlers and manual corpus collection. The features and characteristics of English and translated languages are analyzed. The corpus preprocessing of multilingual English and American poetry imagery will be carried out through the word alignment method, and a dictionary of English and American poetry imagery terminology will be constructed. A corpus management system for English and American poetry imagery will be designed to enter the collected multilingual English and American poetry imagery corpus, and the corresponding multilingual corpus will be obtained through querying.
With the rapid development of modern science and technology, there are more and more kinds of computer-aided translation software, and most of the translation plug-ins can improve the efficiency and quality of translation. But without translation plug-ins, even the best translation software cannot realize automatic translation. In this paper, real-time translation of English and American poetic imagery is realized through the use of translation plug-ins, which will facilitate free communication between researchers of English and American poetic imagery, improve the quality of translation of English and American poetic imagery, and lay a good foundation for the construction of a multilingual corpus of English and American poetic imagery. The translation plug-in and the computer program exist independently of each other, and the plug-in provides translation services for the program without affecting the normal operation of the program.
Unlike sentence-level neural machine translation, chapter-level translation requires translating on a chapter-by-chapter basis and making full use of the chapter context: given a source document consisting of multiple sentences, each sentence is translated conditioned not only on itself but also on its surrounding context.
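One common way to formalize this, given here only as an illustrative sketch (the notation is not taken from the paper), is to factor the document translation probability over sentences, each conditioned on its own source sentence and its chapter context:

```latex
\[
X = (x_1, \dots, x_K), \qquad
P(Y \mid X) = \prod_{k=1}^{K} P\!\left(y_k \mid x_k, C_k\right),
\]
\[
P\!\left(y_k \mid x_k, C_k\right) = \prod_{t=1}^{T_k} P\!\left(y_{k,t} \mid y_{k,<t},\, x_k,\, C_k\right),
\]
```
where $x_k$ and $y_k$ are the $k$-th source and target sentences and $C_k$ denotes the chapter context around $x_k$.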
The research in this paper addresses the automatic translation of English and American poetic imagery with a chapter-level neural machine translation model in a neural network framework, built on an encoder-decoder architecture with an attention mechanism.
One of the major drawbacks of using recurrent neural networks for neural machine translation is that the model must process the entire input sentence word by word, a structure that is both time-consuming and limits parallelism. Attention mechanisms allow the model to use the context around each word and can be highly parallelized.
Multihead self-attention mechanisms are able to capture dependencies between words within the same sentence or between different sentences within the same chapter through matrix operations and assign the required context for the word or sentence based on the dependency weights [19].
Currently, the Transformer model achieves optimal performance in sentence-level machine translation tasks [20]; a brief schematic of the Transformer model is shown in Fig. 3, and its main components are as follows. Word vector embedding layer: this layer converts each input word into a word vector by means of an embedding algorithm; word embedding occurs only in the bottom encoder of the model, and the other encoders in the stack receive the output of the previous encoder. Positional encoding: the Transformer determines the position of each word, and the distance between different words in a sequence, by adding a position encoding vector to each word vector. Self-attention layer: on the decoder side, the masked self-attention layer is allowed to attend only to positions before the word currently being decoded in the output sequence. In this paper, residual connections and layer normalization are still used to process the data passed between sub-layers, so the actual output of a sub-layer is $\mathrm{LayerNorm}(x + \mathrm{Sublayer}(x))$, where $x$ is the sub-layer input and $\mathrm{Sublayer}(\cdot)$ is the function implemented by the sub-layer. Encoder-decoder attention layer: this layer integrates the sentence representation output by the source-language encoder into the decoder to help translate the target sentence. Fully connected feed-forward neural network layer: finally, a linear layer followed by a normalization (softmax) layer converts the floating-point vectors output by the decoder into words. A compact sketch of the sub-layer wrapper is given after Fig. 3.

Transformer model structure
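The following PyTorch sketch shows the sub-layer wrapper just described (residual connection plus layer normalization); the module and argument names are illustrative rather than taken from the paper.

```python
# Sub-layer wrapper: output = LayerNorm(x + Sublayer(x)), as in the original Transformer.
import torch
import torch.nn as nn

class SublayerConnection(nn.Module):
    def __init__(self, d_model: int, dropout: float = 0.1):
        super().__init__()
        self.norm = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor, sublayer) -> torch.Tensor:
        # sublayer is any callable sub-layer, e.g. self-attention or the feed-forward network
        return self.norm(x + self.dropout(sublayer(x)))
```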
In this paper, the target-side representation is computed with a decoder that stacks multiple identical decoding layers of the form described above.
Moses is a statistical machine translation system that can automatically train translation models for any language pair, requiring only parallel corpora, and it provides a feature-rich script library. In this paper, the Moses scripts are used to perform tokenization and truecasing on the English, German, and Spanish training sets.
To improve the performance of the neural machine translation model, all the training corpora used in this paper are segmented into subword units, which effectively reduces the size of the vocabulary. The Byte Pair Encoding (BPE) algorithm encodes according to byte pairs; its main purpose is to shorten the translation model's vocabulary and thus reduce the number of model parameters, while also lowering the frequency of out-of-vocabulary words in the translation results. Uncommon strings such as unregistered words are iteratively split into common subword units. The number of merge operations determines the number of iterations in the splitting process, and maximum-length matching is performed at the granularity produced by these operations to re-split words and form the subword vocabulary. A toy illustration of the BPE merge loop is given below.
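The sketch below is a toy illustration of the BPE idea: repeatedly merge the most frequent adjacent symbol pair. Real toolkits (e.g., subword-nmt) add end-of-word markers, frequency thresholds, and an apply step; none of those details are claimed here.

```python
# Toy BPE: learn merge operations by repeatedly joining the most frequent adjacent pair.
from collections import Counter

def learn_bpe(words: dict[str, int], num_merges: int) -> list[tuple[str, str]]:
    vocab = {tuple(w): f for w, f in words.items()}      # each word as a tuple of symbols
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        best = max(pairs, key=pairs.get)                 # most frequent adjacent pair
        merges.append(best)
        merged = {}
        for symbols, freq in vocab.items():              # re-split every word with the new merge
            out, i = [], 0
            while i < len(symbols):
                if i < len(symbols) - 1 and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            merged[tuple(out)] = freq
        vocab = merged
    return merges

print(learn_bpe({"lower": 5, "newest": 6, "widest": 3}, num_merges=10))
```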
In this paper, all models are built with PyTorch, an open-source Python machine-learning library based on Torch that is widely used in natural language processing. PyTorch supports dynamic neural networks and, compared with static TensorFlow graphs, handles problems such as RNNs with variable-length outputs more effectively.
This paper reports the computation of sentence-level BLEU (denoted as s-BLEU) and chapter-level BLEU (denoted as d-BLEU) via scripts. In addition to BLEU scores, this paper also reports Meteor scores for translation results.
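The sketch below shows one way the two BLEU variants can be computed with sacrebleu: s-BLEU scores the sentence list directly, while d-BLEU first concatenates each document's sentences into one long segment. The toy data and the use of sacrebleu are illustrative assumptions; the paper only states that the scores are computed via scripts.

```python
# s-BLEU vs. d-BLEU sketch with sacrebleu (toy data).
import sacrebleu

hyps = ["the moon rises over the sea", "a lone boat drifts"]
refs = ["the moon rises above the sea", "a lonely boat drifts"]

s_bleu = sacrebleu.corpus_bleu(hyps, [refs]).score                            # sentence-level BLEU
d_bleu = sacrebleu.corpus_bleu([" ".join(hyps)], [[" ".join(refs)]]).score    # chapter-level BLEU
print(f"s-BLEU = {s_bleu:.2f}, d-BLEU = {d_bleu:.2f}")
```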
In addition, the quality of the translation results is assessed with a reference-based metric for the accuracy of pronoun translation (APT).
To test whether the method of this paper meets the research goal of using chapter context to resolve discourse inconsistencies in translation, a dedicated dataset is used to evaluate discourse phenomena in the translation of English and American poetic imagery. The dataset contains four test sets covering reference, lexical consistency, and ellipsis (inflectional changes and verb phrases). Each test set contains a group of contrastive samples, comprising correct (positive) translations of the discourse phenomenon and incorrect (negative) translations. The goal is to determine whether the improved translation model is more inclined to generate the correct translations.
The benchmark system model selected for this chapter is a sentence-level neural machine translation model based on the Transformer. The Transformer is a model based on the encoder-decoder structure, in which the encoder and decoder are each composed of multiple identical network layers. Each encoding layer consists of a multi-head self-attention sublayer and a fully connected feed-forward neural network, and each decoding layer consists of a multi-head self-attention sublayer, a multi-head encoder-decoder attention sublayer, and a fully connected feed-forward neural network. In addition, to address gradient explosion, gradient vanishing, and unstable training, the Transformer adds residual connections and layer normalization to each layer.
The architecture of the machine translation model proposed in this paper, which incorporates chapter context validity recognition, is shown in Figure 4. The basic idea is to encode the chapter-level context representation and incorporate the encoded context information into the decoder. To limit computational cost and attention burden, the multi-encoder structure of the model adopts a shared encoder: based on the original Transformer model, the same encoder is used to encode the source sentence and its contextual sentences. To utilize chapter context information effectively, the system adds a classification module. The classifier further processes the two encoder outputs with the aim of determining, from the encoded information, whether the context is relevant to the current sentence. On the one hand, the classification task improves the sentence representation; on the other hand, combined with the contextual attention mechanism and the gating mechanism, it guides the model to use the more valuable contextual information. The chapter contextual attention layer is located between the masked self-attention sublayer and the encoder-decoder attention sublayer of the decoder to guide the decoder in using contextual information, and a gating mechanism is introduced to further control the utilization of contextual information on the target side. A sketch of this gated fusion is given after Figure 4.

Transformer model incorporating text context validity recognition
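The following PyTorch sketch illustrates the gated residual fusion just described: a multi-head attention layer attends from the decoder state to the encoded context, and a sigmoid gate decides how much of that context is added back. The dimensions and module names are assumptions made for illustration, not the paper's exact implementation.

```python
# Gated fusion of chapter context into the decoder (illustrative sketch).
import torch
import torch.nn as nn

class GatedContextFusion(nn.Module):
    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.ctx_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, dec_state: torch.Tensor, ctx_enc: torch.Tensor) -> torch.Tensor:
        # dec_state: (batch, tgt_len, d_model); ctx_enc: (batch, ctx_len, d_model)
        ctx, _ = self.ctx_attn(dec_state, ctx_enc, ctx_enc)                # multi-head contextual attention
        g = torch.sigmoid(self.gate(torch.cat([dec_state, ctx], dim=-1)))  # gating mechanism
        return dec_state + g * ctx                                         # gated residual connection
```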
Transformer-based sentence-level models translate without considering context, so the sentence representations produced by their encoders are weaker at expressing relationships between sentences, whereas the encoded output of a chapter-level translation model should be better at determining such relationships. This leads to the hypothesis that the better the chapter translation encoder performs, the better its encoded output should be at judging inter-sentence relations. Accordingly, the translation system in this paper adds a joint classification task: using the encoder output to determine whether the contextual information of the current sentence is actually valid. Improved classification performance not only indicates an enhanced sentence representation in the system but also shows that the model can focus on contextual information more effectively during learning.
The core of classifier design is building the training corpus. To improve the classification effect, the system in this paper sets the positive example to the sentence in the chapter with the highest similarity to the current sentence and the negative example to the sentence with the lowest similarity. The TF-IDF strategy and the KL_div strategy are used to calculate inter-sentence similarity. In the TF-IDF strategy, the TfidfVectorizer function in the Sklearn library is used to transform the text sentences into TF-IDF sentence representations, from which inter-sentence similarity is computed; a sketch follows.
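A sketch of the TF-IDF strategy: represent every sentence of a chapter with TfidfVectorizer and pick, for the current sentence, the most and least similar other sentences as the positive and negative examples. Cosine similarity is assumed here as the comparison measure, which the paper does not specify.

```python
# Select positive / negative classification examples by TF-IDF similarity.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def pos_neg_examples(chapter: list[str], idx: int) -> tuple[str, str]:
    tfidf = TfidfVectorizer().fit_transform(chapter)     # TF-IDF sentence representations
    sims = cosine_similarity(tfidf[idx], tfidf).ravel()
    sims[idx] = -1.0                                      # exclude the current sentence from the maximum
    positive = chapter[int(np.argmax(sims))]              # highest similarity -> positive example
    sims[idx] = 2.0                                       # exclude it from the minimum as well
    negative = chapter[int(np.argmin(sims))]              # lowest similarity  -> negative example
    return positive, negative
```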
In the KL_div strategy, a pre-trained model based on bidirectional encoder representations from Transformers (BERT) is used to convert sentences into feature vectors, and the KL distance between the sentence representations is then calculated. For two distributions $P$ and $Q$, the KL distance is $D_{\mathrm{KL}}(P \,\|\, Q) = \sum_{i} P(i)\log\frac{P(i)}{Q(i)}$.
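The sketch below shows one way such a KL distance can be computed: encode two sentences with a pre-trained BERT, turn the pooled vectors into probability distributions with softmax, and compute the KL divergence between them. The pooling and normalization choices are assumptions; the paper only states that BERT features and a KL distance are used.

```python
# KL distance between BERT-based sentence representations (illustrative sketch).
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")

def kl_distance(sent_a: str, sent_b: str) -> float:
    with torch.no_grad():
        va = bert(**tok(sent_a, return_tensors="pt")).last_hidden_state.mean(dim=1)
        vb = bert(**tok(sent_b, return_tensors="pt")).last_hidden_state.mean(dim=1)
    p, q = F.softmax(va, dim=-1), F.softmax(vb, dim=-1)   # turn pooled vectors into distributions
    return F.kl_div(q.log(), p, reduction="sum").item()   # KL(p || q)
```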
The system constructs three computations for the classifier to capture the relational information between the representation of the current sentence and that of its candidate context. Finally, a fully connected layer and a Softmax layer are used to determine whether the context is valid for the current sentence; one plausible realization is sketched below.
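One plausible reading of the "three computations" (an assumption, following common practice in sentence-pair classification) is the element-wise difference, the element-wise product, and the concatenation of the two encoder outputs, fed into a fully connected layer and a Softmax:

```python
# Context-validity classifier sketch over sentence and context representations.
import torch
import torch.nn as nn

class ContextValidityClassifier(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        self.fc = nn.Linear(4 * d_model, 2)               # two classes: valid / invalid context

    def forward(self, sent_repr: torch.Tensor, ctx_repr: torch.Tensor) -> torch.Tensor:
        feats = torch.cat(
            [sent_repr, ctx_repr, sent_repr - ctx_repr, sent_repr * ctx_repr], dim=-1
        )
        return torch.softmax(self.fc(feats), dim=-1)      # probability that the context is valid
```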
In the decoder shown in Fig. 4, an additional multi-head contextual attention sublayer is added to fuse chapter context information; the output of the decoder's masked self-attention layer serves as its query. The residual network with a gating mechanism then fuses the acquired chapter context into the target-side representation, with the gate controlling how much of the contextual information is passed on to the following sublayers.
The loss function based on joint learning in this paper is divided into three parts: the prediction of the NMT target side, the identification of effective chapter context, and the importance of the chapter context in the residual network with a gating mechanism. The loss associated with predicting the NMT target side is the translation loss over the target sentence; the loss for effective chapter context recognition is the classification loss of the context-validity classifier; and the loss for the importance of the chapter context reflects how strongly the gating mechanism admits contextual information. Summing the three losses yields the training objective for joint learning; a sketch of one possible form is given below.
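The decomposition below is only an illustrative sketch of such a joint objective; the exact form and weighting used in the paper are not recoverable from the text, so the weighting factor $\gamma$ and the auxiliary terms are written generically.

```latex
\[
\mathcal{L}_{\mathrm{nmt}} = -\sum_{t=1}^{T} \log P\!\left(y_t \mid y_{<t},\, x,\, C\right),
\qquad
\mathcal{L} = \mathcal{L}_{\mathrm{nmt}} + \gamma\left(\mathcal{L}_{\mathrm{cls}} + \mathcal{L}_{\mathrm{gate}}\right),
\]
```
where $\mathcal{L}_{\mathrm{cls}}$ is the loss of the context-validity classifier and $\mathcal{L}_{\mathrm{gate}}$ is the term associated with the gate values of the residual network.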
In this paper, a two-step training approach is adopted, which follows the idea of pre-training and additionally uses sentence-level parallel corpus data. In the first step, the sentence-level parameters are pre-trained on the fused corpus. In the second step, the chapter-level parameters are trained on the chapter-level parallel corpus, as sketched below.
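The sketch below illustrates the two-step idea: after the first step, the sentence-level parameters are frozen and only the newly added chapter-level modules (contextual attention, gate, classifier) are updated. The module-name keywords are hypothetical and only serve the illustration.

```python
# Second training step: freeze sentence-level parameters, update chapter-level modules only.
import torch

def set_chapter_level_trainable(model: torch.nn.Module,
                                keywords=("ctx_attn", "gate", "classifier")) -> None:
    for name, param in model.named_parameters():
        param.requires_grad = any(k in name for k in keywords)

# optimizer for the second step, over the chapter-level parameters only:
# optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()))
```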
In order to study the imagery characteristics of Anglo-American modernist poetry and to support subsequent translation research, this paper compiles a corpus of Anglo-American modernist poetry totaling 42104 words. With the help of the corpus software AntConc, this study counts the high-frequency words of the English and American modernist poetry texts; the top 30 words by frequency are shown in Table 1.
High-frequency words of British and American modern poetry text (Top 30)
Rank | Word | Frequency | Rank | Word | Frequency |
---|---|---|---|---|---|
1 | the | 2477 | 16 | with | 289 |
2 | of | 1371 | 17 | are | 287 |
3 | and | 1355 | 18 | Not | 287 |
4 | a | 1084 | 19 | but | 277 |
5 | to | 1077 | 20 | on | 258 |
6 | in | 720 | 21 | one | 251 |
7 | I | 684 | 22 | They | 244 |
8 | it | 613 | 23 | at | 230 |
9 | is | 608 | 24 | you | 221 |
10 | that | 587 | 25 | we | 220 |
11 | was | 410 | 26 | his | 219 |
12 | for | 321 | 27 | have | 216 |
13 | as | 310 | 28 | all | 195 |
14 | he | 307 | 29 | an | 188 |
15 | be | 289 | 30 | or | 187 |
As can be seen from Table 1, the top five words by frequency in English and American modern poetry texts are the, of, and, a, and to, which is basically consistent with word-frequency statistics of general English corpora for original English texts. Among the top 30 high-frequency words, personal pronouns and possessive pronouns are numerous and used frequently; for example, I occurs 684 times, ranking 7th, and it occurs 613 times, ranking 8th. In addition, among the top 100 high-frequency words there are several content words worth noting besides function words. High-frequency nouns include time, day, and people, with time occurring 73 times and ranking 63rd; high-frequency adverbs include very and never. Little also has a high frequency, used 79 times and ranking 59th, and great appears 67 times, ranking 62nd.
In order to study the themes of the corpus of Anglo-American modernist poetry, this paper compiles a reference corpus consisting of the English text of a classic British poetry collection, totaling 41623 words. The top 30 theme words of the English and American modern poetry texts, computed against this reference corpus, are shown in Table 2.
The theme words of the British and American poetry text (Top 30)
Rank | Word | Frequency | Topicality | Rank | Word | Frequency | Topicality |
---|---|---|---|---|---|---|---|
1 | the | 2476.6 | 277.321 | 16 | school | 21 | 40.283 |
2 | is | 607.6 | 104.698 | 17 | or | 187.6 | 39.290 |
3 | we | 219.8 | 104.045 | 18 | grade | 18.9 | 37.907 |
4 | English | 51.1 | 98.177 | 19 | white | 25.9 | 32.884 |
5 | a | 1084.3 | 88.043 | 20 | out | 102.2 | 32.490 |
6 | one | 250.6 | 84.459 | 21 | through | 40.6 | 32.471 |
7 | are | 287 | 82.874 | 22 | social | 16.1 | 31.975 |
8 | water | 33.6 | 63.303 | 23 | don’t | 20 | 31.146 |
9 | our | 81.2 | 49.092 | 24 | its | 77 | 29.792 |
10 | students | 24.5 | 48.525 | 25 | snow | 14.7 | 29.457 |
11 | old | 38.5 | 45.868 | 26 | sun | 14.7 | 29.122 |
12 | like | 93.8 | 45.738 | 27 | black | 18.2 | 27.998 |
13 | sea | 23.8 | 44.718 | 28 | years | 37.8 | 26.154 |
14 | lake | 21.7 | 43.124 | 29 | down | 63 | 25.277 |
15 | up | 96.6 | 42.824 | 30 | Englishman | 14 | 24.883 |
By observing the top 30 words in Table 2, it can be found that when the British classic poetry collection is used as the reference corpus, the high-frequency theme words of the English and American modern poetry texts (the observation corpus) are the, is, we, English, a, and so on. Among the theme words, nouns indicating natural scenery, such as water, sea, lake, snow, and sun, and words related to country, society, and culture, such as English, Englishman, students, school, and social, reflect the breadth of themes in English and American modern poetry.
In addition, an examination of the top 100 theme words shows that color words appear frequently; those within the top 100 include white, black, green, and blue. Nouns indicating natural scenery also include trees, sky, earth, fog, wind, and sunlight, and topic-related words include political, world, war, work, national, modern, American, moral, language, politics, and so on. The statistical analysis of the theme words helps to objectively summarize the stylistic features of modern English and American poetry.
To validate the effectiveness of the machine translation model incorporating chapter context validity recognition in the translation of English and American poetic imagery, the English-German datasets TED, News, and Europarl are used in this experiment. The data processing is consistent with previous work in the field to allow fair comparison: Moses is used for tokenization, and OpenNMT is used to perform BPE segmentation. In addition, to verify the effectiveness of the proposed method on different language pairs, this chapter additionally tests it on the English-French dataset IWSLT.
Currently, BLEU is the metric mainly used in neural machine translation to evaluate translation quality; it was proposed by IBM for the evaluation of machine translation tasks. Its basic idea is precision: given a standard reference translation (reference) and a sentence generated by the neural network (candidate) of length n, if m words of the candidate appear in the reference, then m/n is the 1-gram precision of BLEU. BLEU has many variants, which differ according to the n-gram used; the common indicators are BLEU-1, BLEU-2, BLEU-3, and BLEU-4, where n refers to the number of consecutive words. BLEU-1 measures word-level accuracy, while higher-order BLEU measures sentence fluency. In accordance with the standards commonly used in the field, the evaluation in this work uses BLEU-4, which can be expressed as $\mathrm{BLEU} = \mathrm{BP} \cdot \exp\left(\sum_{n=1}^{4} w_n \log p_n\right)$, where $p_n$ is the modified n-gram precision, $w_n = 1/4$ are the weights, and $\mathrm{BP} = \min\left(1, e^{1 - r/c}\right)$ is the brevity penalty computed from the reference length $r$ and the candidate length $c$.
In this paper, the experimental results of the proposed model are compared with typical chapter-level translation models and a sentence-level baseline. The sentence-level model uses the Transformer-base structure, and the chapter-level models include IMPR, which uses an additional encoder; HAN, which uses a hierarchical attention mechanism; SAN, which uses a selective attention mechanism; the query-based QCN; and the Flat-Transformer, which uses a single encoder. The results of the model performance comparison are shown in Table 3, where Snmt and CAEnc are the sentence-level and chapter-level baselines implemented in this experiment, and the evaluation metric is BLEU.
Model comparison experiment results
Model | TED (En-De) | News (En-De) | Europarl (En-De) | IWSLT (En-Fr) |
---|---|---|---|---|
Snmt | 24.14 | 27.03 | 30.81 | 40.78 |
IMPR | 25.09 | 23.25 | 30.23 | -- |
HAN | 25.58 | 26.35 | 30.78 | -- |
SAN | 25.61 | 25.62 | 30.27 | -- |
QCN | 26.47 | 23.42 | 30.39 | -- |
Flat | 25.73 | 24.95 | 31.55 | -- |
CAEnc | 25.21 | 27.47 | 31.21 | 41.01 |
This method | 26.13 | 28.04 | 32.15 | 41.92 |
As can be seen from Table 3, the method in this paper achieves the expected performance, with a maximum improvement of +1.99 BLEU over the sentence-level baseline and a maximum improvement of +0.94 BLEU over the chapter-level baseline, and it obtains the best performance among the typical chapter-level translation models compared.
For generation tasks such as machine translation, when fine-tuning from a pre-trained model, the representation capability of the pre-trained model is more important than its generation capability. Meanwhile, the masked language model objective has been validated by pre-trained models such as BERT and CeMAT. To let the model acquire representation capability more effectively, the masked language model is applied as an additional training task, which allows the encoder to obtain representation capability faster and better; a sketch of the masking step follows.
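The sketch below illustrates the masking step of such an auxiliary task: a ratio θ of source tokens is randomly masked before encoding, and the encoder is additionally supervised to recover them. The token-id conventions (mask_id, the -100 ignore index) are assumptions for illustration.

```python
# Mask a ratio theta of source tokens and build the labels for the auxiliary MLM task.
import torch

def mask_tokens(src: torch.Tensor, mask_id: int, theta: float = 0.2):
    mask = torch.rand(src.shape) < theta                  # choose roughly theta of the positions
    masked_src = src.masked_fill(mask, mask_id)           # replace them with the mask token
    labels = src.masked_fill(~mask, -100)                 # -100 is ignored by cross-entropy
    return masked_src, labels
```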
To explore the impact of two key parameters of this method on model performance, namely the mask ratio θ and the loss function weighting factor γ, supplementary comparison experiments were conducted, all on the English-German dataset of English and American poetry imagery translations. The results are shown in Fig. 5, where (a) and (b) denote the comparison experiments for the parameters θ and γ, respectively, and (c) denotes the comparison of loss function values before and after the use of the encoder-side supervised signal.

Supplementary comparison experiment results
As can be seen in Fig. 5, the model performance is ultimately optimal when the mask ratio is set to 20%, and the loss function weighting factor is set to 6. In the experiments, it is observed that after incorporating the encoder-side supervised signal, the model can converge quickly in the early stages of training, and the additional training task allows the encoder to gain representation capability quickly.
Also, to test whether it is the contextual information that helps recover the masked portion in this process, supplementary experiments are conducted in which this method is applied to a sentence-level model; the results are shown in Table 4.
Experimental results on the sentence level model
Model | BLEU | Δ |
---|---|---|
Snmt | 24.14 | |
Snmt + this method | 20.27 | -3.87 |
CAEnc + this method | 26.13 | 1.99 |
As can be seen from Table 4, there is a significant decrease in performance (ΔBLEU of -3.87) when the method is used on the sentence model, which proves that contextual information plays a key role in recovering the mask portion, and at the same time proves the effectiveness of the method on chapter-level translation.
In this paper, in order to study the subjective emotions embedded behind the imagery of English and American poems and to help both humans and machines translate this imagery accurately, the deeper meanings are annotated on top of the labeled literal meanings with the help of reference books, and an attempt is made to analyze, from a cognitive point of view, how the literal meanings and the deeper meanings are connected.
In this paper, the deep meanings of English and American poetic imagery are manually annotated for statistical reference. Examples of the top ten images and their deep meanings, ranked by the number of annotated deep meanings, are shown in Table 5; the higher the ranking, the more often the image is used to convey a deep meaning.
Selected images and their deep meanings
Imagery | Manual interpretation | Frequency | Manual interpretation | Frequency |
---|---|---|---|---|
Moon | Think of | 16 | Think of one’s home | 4 |
Rain | Tears | 2 | Strong | 2 |
Floating clouds | The weather is unpredictable | 2 | Wanderer | 2 |
Tears | Sadness | 4 | Grief | 2 |
Green Mountain | Live in seclusion | 6 | Hometown | 2 |
Wind | Miserable | 2 | Fast | 2 |
Ape | Sorrowful | 5 | Sad | 2 |
Smoke | Decline | 2 | Dense | 2 |
Spring wind | Good time | 3 | Love | 2 |
Sunset | Farewell | 2 | Decline | 2 |
From these high-frequency images and their deep meanings, it can be seen that the emotional tone of each image's deep meanings is mostly stable and related to the image's basic characteristics. For example, the moon, shared by viewers everywhere, often triggers the poet's emotions and is therefore associated with homesickness and nostalgia. Tears are secreted when a person is in pain, so tears are mainly associated with pain. The sunset is a symbol of dusk, which gives people a feeling of loneliness and depression, so the deeper meaning of the sunset also reflects loneliness and depression; because of this connotation, sunsets carrying a more positive emotion appear less frequently, which to a certain extent reflects the emotional tendency of English and American poetry.
On the whole, the deeper meanings of the imagery mostly reflect negative and painful emotions, and many of them are extremely rich in deeper meanings, which reflects the polysemy of poetry.
In turn, based on the deep-meaning information, this paper seeks the things that carry these sentiments. Using manual interpretation for labeling, similar sentiments are merged and their corresponding imagery types counted; the top 15 high-frequency types are shown in Table 6.
Top 15 high-frequency deep meanings and their corresponding imagery
Deep label | Frequency | Image types | Image example | Frequency | Image example | Frequency | Image example | Frequency |
---|---|---|---|---|---|---|---|---|
Human | 102 | 91 | Pretty eyebrows | 4 | Fair complexion | 4 | Flower | 4 |
Desolate | 62 | 53 | Yellow cloud | 4 | Withered grass | 4 | Fallen leaves | 4 |
Think of | 81 | 48 | Moon | 15 | Bright moon | 11 | Wild goose | 9 |
Lonely | 42 | 36 | Single light | 5 | Lonesome boat | 5 | Shadow | 3 |
Place | 35 | 32 | Green Mountain | 4 | Checkpoint | 4 | Nine levels | 2 |
Miserable | 31 | 31 | Snot | 2 | Flute | 2 | Frost | 2 |
Sorrowful | 43 | 31 | Tears | 6 | Broken intestine | 5 | Heartbroken | 5 |
Broad | 28 | 25 | Blue sky | 3 | Hirano | 3 | Greet famine | 3 |
Cold | 23 | 23 | Chick | 2 | Ice | 2 | Cold day | 2 |
Fact | 27 | 23 | Dust | 5 | Soot | 3 | Weapons of war | 3 |
Sad | 26 | 22 | Pipa | 6 | Moon | 6 | Ape | 4 |
Part | 23 | 20 | Rain | 4 | Cow hair | 2 | Willow | 2 |
Time | 24 | 20 | Spring wind | 4 | Shakedown | 4 | Clock | 2 |
Quiet | 19 | 18 | Wave | 2 | Chirping | 2 | Moon | 2 |
Beautiful | 16 | 16 | Prosperous flower | 2 | Flower face | 2 | Mountain light | 2 |
It can be seen from Table 6 that the same deep meaning can be expressed by different things. The ratio of image types to frequency shows that poets choose a variety of images to express the same deep meaning; the exceptions are the image "Moon" expressing longing and the image "Tears" expressing sadness, which account for a high proportion, indicating that poets converge in their choice of images when expressing sadness and longing.
It can also be seen that English and American poems make extensive use of rhetorical devices such as metonymy and simile: a poet can refer to a person through many things, for instance through a part of the person, such as the eyebrows or the face, through things that share certain characteristics with the person, such as using a flower to stand for a person, or in other ways, such as through location or collateral reference. The imagery expressing various emotional atmospheres is usually related to the attributes and characteristics of the things themselves. For example, "Yellow cloud" is often found in the desert, where people are scarce and the land is barren, which often conveys a sense of decay.
This paper takes metonymy, one deep-meaning type of imagery, as an example to explore the distribution of deep-meaning types in English and American poetry. The reasons for using metonymy are roughly as follows: first, the target domain is abstract or difficult to describe, and metonymy allows readers to grasp the meaning of the poem intuitively; second, certain characteristics of the target domain need to be highlighted, and metonymy can highlight them without hindering the coherence of meaning; third, it brings the poem closer to the reader's life and makes it more lifelike. Statistics on the more concentrated types of metonymy and on the source and target domains of different metonymies are shown in Table 7.
Types of metonymy with their source and target domains
Metonymy type | Image types | Frequency | Source domain-Target domain | Frequency | Example (source domain) | Example (target domain) |
---|---|---|---|---|---|---|
Part-whole | 25 | 46 | Location-Person | 10 | Pretty eyebrows | Belle |
Part-Ship | 11 | Ship | Set sail | |||
Component-Instrument | 5 | String | Musical instrument | |||
Category metaphor | 38 | 42 | Scene-View | 7 | Spring scenery | Falling flower |
Place-Refers to a place in general | 5 | Place | Paradise | |||
Person-Person | 5 | Nobleman | Distant friend | |||
Characteristic metaphor | 26 | 29 | Color-Object | 8 | Painting | Jadeite |
Characteristic-Group | 5 | Guards of honor | Army | |||
Characteristics-People | 3 | Fragrant | Belle | |||
Causal metaphor | 2 | 28 | Component-Reason | 16 | White hair | Aged |
Production metaphor | 5 | 11 | Materials-Tools | 7 | Canvas | Weapons |
Birds-Sound | 2 | Crying bird | ||||
Possessive metaphor | 5 | 21 | Clothing-People | 9 | Dress | Government officials |
Tools-People | 8 | Chief minister’s seal | Bureaucrat | |||
Transportation-People | 6 | Enemy troops | Nobility | |||
Container metaphor | 4 | 9 | Utensils-Beverages | 7 | Cup | Liquor |
Tools-Things | 2 | New cooking | Food | |||
Place metaphor | 27 | 40 | Landscape-Nonspecific location | 14 | Hometown | Fields and gardens |
Landscape-Specific location | 9 | Capital | Area | |||
Location-People | 12 | High buildings | Concubines | |||
Time metaphor | 9 | 11 | Tools-Time | 5 | Clock | Dusk |
Poultry-Time | 5 | Crow | Spring |
As can be seen from Table 7, part-whole metonymies predominate, and the targets of metonymy are especially often persons of various kinds. Part-whole metonymies, category metonymies, and characteristic metonymies are all characterized by "replacing the big with the small and the small with the big", i.e., using a typical part, a typical feature, or a typical member to realize the metonymy.
In this paper, based on the original Transformer model, a machine translation model incorporating chapter context validity recognition is constructed, and the design of an English and American poetry imagery translation strategy is realized on the basis of the corpus. Through the experimental analysis of the model, the following conclusions are drawn. (1) The top five words by frequency in English and American modern poetry texts are the, of, and, a, and to, which is basically consistent with word-frequency statistics for original English texts; among the top 30 high-frequency words, personal pronouns and possessive pronouns are used frequently. (2) When the British classic poetry collection is used as the reference corpus, the high-frequency theme words include the, is, we, English, a, etc.; among the theme words, nouns indicating natural scenery, such as water, sea, lake, snow, and sun, and words related to country, society, and culture, such as English, Englishman, students, school, and social, reflect the breadth of themes in British and American modern poetry. (3) Compared with the sentence-level baseline model, the model in this paper achieves a maximum improvement of +1.99 BLEU; compared with the chapter-level baseline model, it achieves a maximum improvement of +0.94 BLEU and outperforms all other typical chapter-level translation models compared. Model performance is optimal when the mask ratio is set to 20% and the loss function weighting factor to 6; after incorporating the encoder-side supervised signal, the model converges quickly in the early stage of training, and the additional training task allows the encoder to gain representation capability quickly. (4) The emotional tone of the deep meanings of high-frequency imagery in modern British and American poetry is mostly stable and related to the basic characteristics of the imagery; on the whole, the deep meanings mostly reflect negative and painful emotions, and many images are extremely rich in deep meanings, reflecting the polysemy of poetry. Different things can express the same deep meaning, and the choice of imagery shows a concentrated convergence. In addition, English and American poetry makes extensive use of rhetorical devices such as metonymy and simile; among the imagery metonymies, whole-part metonymies predominate, and their targets are most often persons of various kinds.