Journal details
Format: Journal
eISSN: 2444-8656
First published: 01 Jan 2016
Publication frequency: 2 times per year
Language: English
Open access

Study of agricultural finance policy information extraction based on ELECTRA-BiLSTM-CRF

Published online: 12 Dec 2022
Volume & Issue: AHEAD OF PRINT
Pages: -
Received: 11 Jul 2022
Accepted: 19 Nov 2022
Introduction

As China’s agriculture continues its transition to modern agriculture, agricultural finance has played a crucial role in promoting its rapid development. Because agricultural finance is a specialised type of finance, very few commercial organisations invest in the agricultural sector for commercial purposes. Policy finance is therefore an important tool for supporting agricultural finance, a task further complicated by the regional imbalance in agricultural development between the central and western regions of China and the east coast region. To maximise returns and avoid agricultural business risks, the east coast region uses a wide range of financial instruments, including agricultural insurance and bank loans. The central and western regions, however, are held back by several factors, including a lack of knowledge of how to use financial instruments, so financial assistance is imperative to improve the energy efficiency of agricultural production, which is currently very limited.

Current research explores the development imbalance of agricultural finance in the central and western regions of China and methods for mitigating it, and analyses data from several counties and cities in China. He and Qu [1] investigated the specific effect of policy finance on financial disincentives. Their results indicated that the increase in social consumption levels supported by policy finance mitigated 23.88% of the financial inhibition effect. Considering the causes of financial inhibition, Ding et al. [2] carried out a quantitative analysis of the factors affecting financial restraint and concluded that financial demand and financial risk are the main problems for agricultural finance in less developed regions, whereas agricultural finance in developed regions is mainly defined by total financial support. Nevertheless, this raises the question of how to increase awareness of financial instruments in the less developed regions while adequately avoiding financial risks, promoting the adoption of financial instruments in the western regions and fully exploiting the potential of western agricultural development. Wang Yunsheng [3] analysed several recent studies of the effects of agricultural protection subsidy policy and found that the subsidy raised farmers’ incomes. Applying NLP tools to agriculture-related policy analysis, Li Qing and Qian Zaijian [4] used keyword and classification methods to analyse agricultural policies, demonstrating how the focus of agricultural policies has shifted over the years.

Nevertheless, current research lacks effective and rational methods for interpreting key information in agricultural policies, which leaves the positive effects of financial instruments in less developed agricultural areas of China only vaguely understood. The ineffective interpretation of agricultural finance policy support has prevented these regions from benefiting from current policies and has reduced the ability of farmers to use financial instruments. If policies can be read and understood from the national level down to the township level, the uses and risks of financial instruments publicised and popularised, and current policy support and insurance business promoted, this will greatly aid the development of the agricultural industry in Central and Western China. Farmers will then face less risk and earn greater profit, and they can stay informed of the latest financial support policies as policies change, which contributes to agricultural development at the grassroots level.

With the development of word vector techniques and deep learning, deep learning-based character recognition methods have achieved much better results [5–7]. Ren et al. [7] applied the idea of target detection to text detection and found that the network designed for target segmentation was better at detecting English. Ayishathahira et al. [8] used deep learning models, such as convolutional neural networks (CNNs), bidirectional long short-term memory networks (BiLSTMs) and conditional random fields (CRFs), to parse CVs. Liu et al. [9] proposed the RoIRotate method for oriented text extraction, which combines detection and recognition into an end-to-end approach. Zu and Wang [10] proposed an end-to-end pipeline based on neural network text classification and word vectors, which combines upstream text block segmentation with downstream recognition of specific information. Zhou Junxian and Zhu Ruwei [11] proposed a BERT-BiLSTM-CRF model for extracting structured information from business licences that is highly generalisable and accurate.

Structured extraction of key content from text is a named entity recognition problem in natural language processing, and neural networks and pre-trained models are widely used for it. The BERT model has been widely used in various fields of natural language processing, and the ELECTRA model performs better than BERT when the model is small. Based on the potential of Chinese pre-trained models for agricultural finance policy information extraction, this paper presents a model that combines the ELECTRA model, a bidirectional long short-term memory network (BiLSTM) and a CRF. The proposed ELECTRA-BiLSTM-CRF model extracts the deep features of textual information in various types of agricultural finance policies, so that policy key points can be clarified quickly and described briefly and accurately.

Extraction of agricultural finance policy information based on ELECTRA-BiLSTM-CRF

Figure 1 gives an overview of the general framework of information extraction for agricultural finance policy. Before extraction, a knowledge graph is constructed from triples of entities, relations and attributes related to agricultural finance, and the obtained agricultural finance policy documents are recognised as plain text. The plain text is then pre-processed and cleaned to remove special symbols. Finally, the ELECTRA-BiLSTM-CRF-based information extraction model is used to extract and label key information from the plain text data.
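The cleaning step is described only in general terms, so the following is a minimal sketch of what such a step might look like. The function name `clean_policy_text` and the particular set of symbols removed are assumptions made for illustration; the paper does not list which special symbols are stripped.

```python
import re
import unicodedata

def clean_policy_text(raw):
    """Hypothetical cleaning step: normalise and strip non-content symbols."""
    # NFKC folds full-width letters, digits, punctuation and spaces to ASCII forms.
    text = unicodedata.normalize("NFKC", raw)
    # Remove control characters (tab and newline are kept).
    text = re.sub(r"[\u0000-\u0008\u000b-\u001f\u007f]", "", text)
    # Remove decorative bullets; this symbol list is an assumption.
    text = re.sub(r"[■◆●★※□▲]", "", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()

cleaned = clean_policy_text("★ 贷款利率　４％\u0007 ")
```

Whether full-width punctuation should be folded depends on how the downstream tokenizer was trained, so the NFKC step in particular is a design choice to check against the pre-trained vocabulary.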

Fig. 1

General framework of information extraction programme for agricultural finance policy. CRF, conditional random field

Figure 2 depicts the ELECTRA-BiLSTM-CRF information extraction model, which is composed of three modules: the ELECTRA module, the BiLSTM module and the CRF module [13]. First, the text of the agricultural finance policy is fed into the pre-trained ELECTRA model, which generates word vectors for the text. These are then fed into the BiLSTM to learn the contextual features of the agricultural finance policy. Finally, the output sequence of the BiLSTM is fed into the CRF layer, which solves for the globally optimal label sequence based on the state transition matrix and the adjacent labels.

Fig. 2

Agricultural finance policy information extraction model architecture. CRF, conditional random field

ELECTRA

The ELECTRA model [12] is an improved version of the dynamic pre-trained language model BERT and is used here to capture semantic information in agricultural finance policies. ELECTRA learns stronger representations, and because its discriminator performs binary classification it does not need to model the entire data distribution, which gives it greater computational efficiency and faster convergence. It also uses information more effectively by learning from its own contextual predictions, which suits training an accurate model from fewer data samples. In addition, given that Chinese characters may be misrecognised during OCR, ELECTRA’s training task of detecting words replaced by the generator helps improve entity annotation accuracy when the input contains misrecognised words.

Figure 3 illustrates the model structure of ELECTRA, which consists of a Generator and a Discriminator. The Generator is used primarily to replace some words in a sentence, and the Discriminator is used primarily to determine whether each word in a sentence has been replaced. Because the training process makes a prediction for every word rather than only the masked ones, it is more efficient than BERT.

Fig. 3

ELECTRA model structure. MLM, Masked Language Model

If simple random substitution were used, the Discriminator could easily determine whether a word had been replaced. Therefore, ELECTRA trains the Generator with a Masked Language Model (MLM) objective and replaces the masked word with the Generator’s prediction. Eqs (1) and (2) show the prediction formulas for the Generator and the Discriminator, respectively.

$$p_G(x_t \mid x) = \frac{\exp\left(e(x_t)^T h_G(x)_t\right)}{\sum_{x'} \exp\left(e(x')^T h_G(x)_t\right)} \tag{1}$$

Where: $h_G(x)_t$ is the post-encoding vector; $t$ is the location; and $e(x)$ is the embedding of the word $x$.

$$D(x,t) = \mathrm{sigmoid}\left(w^T h_D(x)_t\right) \tag{2}$$
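To make Eqs (1) and (2) concrete, the sketch below evaluates both formulas on a toy example. It is an illustration only: the function names, the three-token vocabulary and all numeric values are invented here, and real encoder outputs would come from the Transformer layers rather than hand-written lists.

```python
import math

def generator_probs(h_t, embeddings):
    """Eq. (1): softmax over the vocabulary of the scores e(x')^T h_G(x)_t."""
    scores = [sum(e_i * h_i for e_i, h_i in zip(e, h_t)) for e in embeddings]
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [v / total for v in exps]

def discriminator_prob(h_t, w):
    """Eq. (2): sigmoid of the projection w^T h_D(x)_t."""
    z = sum(w_i * h_i for w_i, h_i in zip(w, h_t))
    return 1.0 / (1.0 + math.exp(-z))

# Toy values: vocabulary of 3 tokens, hidden size 2.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
h_t = [0.5, -0.5]
probs = generator_probs(h_t, embeddings)
d = discriminator_prob(h_t, [2.0, 2.0])
```

Here the scores are 0.5, -0.5 and 0.0, so the first token receives the largest generator probability, and the discriminator projection is exactly zero, which gives an output of 0.5.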

The loss function of ELECTRA combines the generator loss $L_{MLM}$ and the discriminator loss $L_{Disc}$, as shown in Eq. (3). The generator is still trained with the MLM loss because it replaces discrete words, which blocks gradients from flowing back from the discriminator to the generator.

$$Loss = \min_{\theta_G, \theta_D} \sum_{x \in X} L_{MLM}(x, \theta_G) + \lambda L_{Disc}(x, \theta_D) \tag{3}$$
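The text does not spell out the two component losses, so the sketch below assumes the usual choices: negative log-likelihood of the original tokens at masked positions for $L_{MLM}$, and binary cross-entropy over all positions for $L_{Disc}$. The function names and the value of `lam` are illustrative assumptions, not values reported in the paper.

```python
import math

def mlm_loss(probs_of_true_tokens):
    """Assumed L_MLM: negative log-likelihood at the masked positions."""
    return -sum(math.log(p) for p in probs_of_true_tokens)

def disc_loss(d_outputs, labels):
    """Assumed L_Disc: binary cross-entropy over every position."""
    eps = 1e-12  # guards against log(0)
    return -sum(
        y * math.log(d + eps) + (1 - y) * math.log(1.0 - d + eps)
        for d, y in zip(d_outputs, labels)
    )

def electra_loss(probs_of_true_tokens, d_outputs, labels, lam):
    """Eq. (3) for a single example: L_MLM + lambda * L_Disc."""
    return mlm_loss(probs_of_true_tokens) + lam * disc_loss(d_outputs, labels)

# Two masked positions, three discriminator positions, lambda chosen arbitrarily.
loss = electra_loss([0.5, 0.25], [0.9, 0.2, 0.8], [1, 0, 1], lam=1.0)
```

Because the two terms are simply added, no gradient needs to pass through the discrete sampling step: each network is updated from its own term.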

During training of the ELECTRA model, the Generator shares weights with the Discriminator and the two are trained jointly, which gradually increases the difficulty of the Discriminator’s learning and recognition task.

With ELECTRA’s pre-trained model, semantic information can be extracted from text by training with the MLM method. This semantic training alone, however, may not be sufficient to handle errors between visually similar (near-form) characters that arise when agricultural finance policy information is acquired at the source. To improve the information extraction ability of the Discriminator in this scenario, an image pre-training model is added to the text generation process of the Generator during model training. The specific steps are as follows.

After the words are masked, the replacement words are sampled according to Eq. (4).

$$\hat x_i^1 = p_G\left(x_i \mid x^{masked}\right) \quad \mathrm{for}\ i \in m \tag{4}$$

Where: $m$ represents the set of masked words. In parallel, the image sample corresponding to $x_i$ is vectorised using a neural network image pre-training model to obtain $v(x_i)$, and the cosine similarity of these embedding vectors is used to estimate how similar two characters are in terms of glyphs, as in Eq. (5).

$$\mathrm{sim}(x_i, x_j) = \frac{v(x_i) \cdot v(x_j)}{\lVert v(x_i) \rVert \, \lVert v(x_j) \rVert} \tag{5}$$

Based on the cosine similarity of the image embeddings, a glyph-similar (near-form) character for $x_i$ is sampled from the entire character set $w$, as in Eq. (6).

$$\hat x_i^2 = p_V(x_i \mid w) \quad \mathrm{for}\ i \in m \tag{6}$$

In the Generator, replacement samples are produced by substituting the generation model’s samples with glyph-similar characters with a certain probability, which simulates the glyph error problem in OCR recognition, as in Eq. (7).

$$\hat x_i = \delta \hat x_i^1 + (1 - \delta) \hat x_i^2 \tag{7}$$

Where: $\delta$ follows a 0-1 (Bernoulli) distribution. Through this random substitution, the Generator can simulate both semantic similarity and glyph similarity of the text, enhancing the information mining capability of the Discriminator.
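The replacement procedure in Eqs (4) to (7) can be sketched as follows. Several parts are assumptions made for illustration: the glyph vectors are made-up two-dimensional stand-ins for image embeddings, the glyph neighbour is drawn with probability proportional to its (non-negative) cosine similarity, and `p_semantic` is a hypothetical name for the probability that $\delta = 1$. The paper does not specify these details.

```python
import math
import random

def cosine(u, v):
    """Eq. (5): cosine similarity of two glyph embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def sample_glyph_neighbour(char, glyph_vecs, rng):
    """Eq. (6), assumed form: sample another character weighted by glyph similarity."""
    others = [c for c in glyph_vecs if c != char]
    weights = [max(cosine(glyph_vecs[char], glyph_vecs[c]), 0.0) for c in others]
    return rng.choices(others, weights=weights, k=1)[0]

def replace_token(char, generator_sample, glyph_vecs, p_semantic, rng):
    """Eq. (7): delta = 1 keeps the generator sample, delta = 0 uses a glyph neighbour."""
    delta = 1 if rng.random() < p_semantic else 0
    if delta == 1:
        return generator_sample
    return sample_glyph_neighbour(char, glyph_vecs, rng)

rng = random.Random(0)
# Made-up vectors: the first two characters look alike, the third does not.
glyph_vecs = {"未": [1.0, 0.1], "末": [0.9, 0.2], "农": [0.0, 1.0]}
replaced = replace_token("未", "来", glyph_vecs, p_semantic=0.5, rng=rng)
```

With these toy vectors, the visually close pair has a cosine similarity near 1 while the dissimilar character scores about 0.1, so a glyph-based replacement almost always picks the look-alike character, which is the OCR confusion the method is meant to simulate.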

BiLSTM

The BiLSTM combines a forward long short-term memory (LSTM) network with a backward LSTM. The LSTM is one of the most widely used recurrent neural networks and is capable of learning the long-term dependencies in agricultural finance policy text sequences. When training on agricultural finance policy text, exploding and vanishing gradient problems may occur, and LSTMs are designed to alleviate these issues. An LSTM cell consists of a forget gate, an input gate and an output gate.

As shown in Eq. (8), the forget gate selectively forgets the input information of the previous cell node.

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, X_t] + b_f\right) \tag{8}$$

Where: $f_t$ represents the output value of the forget gate, $W_f$ represents the weight matrix, $h_{t-1}$ represents the hidden layer state at the previous moment, $X_t$ represents the input at the current moment, and $b_f$ represents the bias.

The input gate selectively retains the input information of the current node, as shown in Eqs (9)–(11).

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, X_t] + b_i\right) \tag{9}$$

$$\tilde C_t = \tanh\left(W_C \cdot [h_{t-1}, X_t] + b_C\right) \tag{10}$$

$$C_t = f_t \cdot C_{t-1} + i_t \cdot \tilde C_t \tag{11}$$

Where: $i_t$ represents the output value of the input gate, $\tilde C_t$ represents the temporary state of the current cell node, and $C_t$ represents the state of the current cell node.

The output gate selectively outputs information from the current time node, as shown in Eqs (12) and (13).

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, X_t] + b_o\right) \tag{12}$$

$$h_t = o_t \cdot \tanh(C_t) \tag{13}$$

Where: $o_t$ represents the output value of the output gate and $h_t$ represents the state of the hidden layer at the current moment.
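As a worked illustration of Eqs (8) to (13), the following sketch computes one LSTM step with plain Python lists. The parameter dictionary layout and helper names are choices made for this example; a practical model would use a deep learning framework rather than hand-written loops.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, b, v):
    """Compute W . v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + b_j for row, b_j in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, p):
    concat = h_prev + x_t  # the concatenation [h_{t-1}, X_t]
    f = [sigmoid(z) for z in affine(p["W_f"], p["b_f"], concat)]          # Eq. (8)
    i = [sigmoid(z) for z in affine(p["W_i"], p["b_i"], concat)]          # Eq. (9)
    c_tilde = [math.tanh(z) for z in affine(p["W_C"], p["b_C"], concat)]  # Eq. (10)
    c = [f_j * c_j + i_j * g_j
         for f_j, c_j, i_j, g_j in zip(f, c_prev, i, c_tilde)]            # Eq. (11)
    o = [sigmoid(z) for z in affine(p["W_o"], p["b_o"], concat)]          # Eq. (12)
    h = [o_j * math.tanh(c_j) for o_j, c_j in zip(o, c)]                  # Eq. (13)
    return h, c

# Hidden size 1, input size 1, all parameters zero: every gate outputs 0.5.
zeros = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_C", "W_o")}
zeros.update({name: [0.0] for name in ("b_f", "b_i", "b_C", "b_o")})
h, c = lstm_step([1.0], [0.0], [1.0], zeros)
```

With all-zero parameters the gates sit at 0.5 and the candidate state is 0, so the cell state halves from 1.0 to 0.5 and the hidden state becomes 0.5 · tanh(0.5).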

LSTMs can effectively filter and store information in their memory units and capture dependencies over longer distances. However, a problem arises when a single LSTM is used to model agricultural finance policy text: it can only extract information from the preceding text and cannot encode information from back to front. When extracting agricultural finance policy information, both the preceding and the following text help to identify relevant information. In this paper, the word vectors output by ELECTRA are used as the input to the BiLSTM. The forward and backward LSTMs capture the hidden information of the preceding and following text, respectively, and the two outputs are concatenated and sent to the CRF for the determination of agricultural finance policy information.
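The concatenation of forward and backward states can be sketched independently of the cell type. In the example below, `step` is any recurrent update function; a simple running sum stands in for the LSTM cell so that the result is easy to check by hand. The function names are illustrative.

```python
def run_direction(inputs, step, h0):
    """Apply a recurrent step over the sequence and collect every hidden state."""
    h, outputs = h0, []
    for x in inputs:
        h = step(x, h)
        outputs.append(h)
    return outputs

def bidirectional(inputs, fwd_step, bwd_step, h0):
    forward = run_direction(inputs, fwd_step, h0)
    # Run on the reversed sequence, then restore the original order.
    backward = run_direction(inputs[::-1], bwd_step, h0)[::-1]
    # Concatenate the two hidden states at each position.
    return [f + b for f, b in zip(forward, backward)]

# Stand-in recurrence: a running sum instead of an LSTM cell.
running_sum = lambda x, h: [h[0] + x[0]]
states = bidirectional([[1.0], [2.0], [3.0]], running_sum, running_sum, [0.0])
```

Each position now carries a summary of everything to its left (first component) and everything to its right (second component), which is the property the CRF layer relies on.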

CRF

As discussed previously, the conditional random field (CRF) is primarily responsible for capturing the dependencies between the preceding and following labels of the agricultural finance policy text and for using global knowledge of the label sequence to better predict the labels of agricultural finance policy entities. The CRF is a discriminative, probabilistic, undirected graph model that is commonly used to analyse and label sequential data. It combines the characteristics of the maximum entropy model and the hidden Markov model and achieves better results in sequence labelling tasks such as word segmentation, part-of-speech tagging and named entity recognition.

In this paper, we apply the linear chain CRF, which is the conditional probability distribution model $P(Y \mid X)$ that satisfies the Markov property when both the observation sequence $X$ and the state sequence $Y$ are linear chains. Eqs (14) and (15) show the parametric form of the linear chain CRF.

$$P(Y \mid X) = \frac{1}{Z(X)} \exp\left(\sum_{k=1}^{K} \omega_k f_k(Y, X)\right) \tag{14}$$

$$Z(X) = \sum_{Y} \exp\left(\sum_{k=1}^{K} \omega_k f_k(Y, X)\right) \tag{15}$$

Where: the observation sequence $X$ corresponds to the agricultural finance policy text sequence in this paper; the output sequence $Y$ corresponds to the entity label categories of that text sequence; $Z(X)$ represents the normalisation factor; $f_k(\cdot)$ represents the feature function; and $\omega_k$ represents the weight of the corresponding feature function. The conditional probability model $P(Y \mid X)$ is obtained through maximum likelihood estimation on a training set of agricultural finance policy texts. During prediction, a dynamic programming algorithm is used to find the output sequence that maximises $P(Y \mid X)$ for a given agricultural finance policy text sequence, which yields the entity labels for the text.
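The dynamic programming step is commonly implemented with the Viterbi algorithm. The sketch below assumes the weighted feature sum in Eq. (14) decomposes into per-position emission scores (as produced by the BiLSTM) and label-to-label transition scores, which is a common factorisation but is not stated explicitly in the text. All scores are invented for the example.

```python
def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence.

    emissions[t][y]: score of label y at position t.
    transitions[y_prev][y]: score of moving from label y_prev to label y.
    """
    n = len(labels)
    score = list(emissions[0])
    backptr = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for y in range(n):
            best_prev = max(range(n), key=lambda yp: score[yp] + transitions[yp][y])
            new_score.append(score[best_prev] + transitions[best_prev][y] + emissions[t][y])
            ptr.append(best_prev)
        score = new_score
        backptr.append(ptr)
    # Trace back from the best final label.
    y = max(range(n), key=lambda k: score[k])
    path = [y]
    for ptr in reversed(backptr):
        y = ptr[y]
        path.append(y)
    path.reverse()
    return [labels[k] for k in path]

labels = ["O", "B", "I"]
# Strongly penalise the invalid transition O -> I; all others are neutral.
transitions = [[0.0, 0.0, -1e4], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
emissions = [[1.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
decoded = viterbi(emissions, transitions, labels)
```

Picking the best label at each position independently would give the invalid sequence O, I here. The transition penalty makes the decoder prefer B, I, which shows how the CRF layer uses adjacent labels to enforce a globally consistent sequence.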

Experimental results and their comparative analysis
Data set

Extracting information on agricultural finance policies requires training and evaluating the information extraction model on an annotated database of agricultural finance policies. By using structured information on agricultural finance policies that already exists on the Internet, we can reduce the workload of manual annotation when constructing the training set. Based on this structured data, we apply the BIO annotation method to automatically annotate the field information of each important aspect of the agricultural finance policy. The collected agricultural finance policy texts are also categorised and stored separately. The model is first fine-tuned on large batches of automatically generated data and then fine-tuned a second time on small batches of real agricultural finance policy data.
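One way to picture the automatic BIO annotation is as string matching of known field values against the policy text. The sketch below is an assumption about how such labelling could work, not the authors' procedure; the function name, the character-level tagging and the entity type `RATE` are all hypothetical.

```python
def bio_annotate(text, fields):
    """Tag each character of `text` with B-/I-/O labels.

    `fields` maps an entity type to the exact string it takes in `text`.
    """
    tags = ["O"] * len(text)
    for entity_type, value in fields.items():
        start = text.find(value)
        while value and start != -1:
            span = range(start, start + len(value))
            if all(tags[k] == "O" for k in span):  # do not overwrite earlier entities
                tags[start] = "B-" + entity_type
                for k in span[1:]:
                    tags[k] = "I-" + entity_type
            start = text.find(value, start + len(value))
    return list(zip(text, tags))

annotated = bio_annotate("rate 4%", {"RATE": "4%"})
```

The first character of each matched value receives a `B-` tag and the remaining characters receive `I-` tags, while everything else stays `O`. Exact matching is brittle for noisy text, which may be one reason a second fine-tuning stage on real data is used.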

Evaluation indicators

This experiment uses three evaluation metrics: precision rate (precision), recall rate (recall) and F1 value (F1-score). The F1 value is a combined evaluation metric for precision rate and recall rate. Eqs (16) and (17) provide the formulas for precision and recall rates.

$$precision = \frac{TP}{TP + FP} \tag{16}$$

$$recall = \frac{TP}{TP + FN} \tag{17}$$

Where: TP is the number of information entities that were correctly identified by the model, FP is the number of irrelevant information entities that were identified by the model, and FN is the number of relevant information entities that were not identified by the model.

Eq. (18) shows the formula for the F1 value (F1-score).

$$F1 = \frac{2 \cdot precision \cdot recall}{precision + recall} \tag{18}$$
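A minimal sketch of Eqs (16) to (18), assuming entities are compared as exact `(start, end, type)` matches. That matching criterion and the example spans are assumptions for illustration; the paper does not state how a correct identification is decided.

```python
def prf1(predicted, gold):
    """Entity-level precision, recall and F1 from sets of (start, end, type) tuples."""
    tp = len(predicted & gold)   # entities found and correct
    fp = len(predicted - gold)   # entities found but not in the gold set
    fn = len(gold - predicted)   # gold entities that were missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = {(0, 2, "ORG"), (5, 7, "RATE"), (9, 12, "CROP")}
pred = {(0, 2, "ORG"), (5, 7, "RATE"), (8, 12, "CROP"), (14, 15, "ORG")}
precision, recall, f1 = prf1(pred, gold)
```

In this example two of four predictions are correct and two of three gold entities are found, giving a precision of 1/2, a recall of 2/3 and an F1 of 4/7.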

Experimental results

The experimental platform uses an NVIDIA Tesla T4 graphics card with 32 GB of video memory and the Ubuntu 18.04 operating system. The ELECTRA model uses the BERT model structure as its backbone, with 12 layers, 768 hidden units and 12 attention heads; the BiLSTM input dimension is 768 and its hidden layer dimension is 128. Starting from the initialisation weights of the ELECTRA-based Chinese model, the model was fine-tuned in two stages, first on an automatically generated dataset and then on a real OCR dataset. Adam is the optimiser, the initial learning rate is 0.001 and the ratio of the training set to the test set is 7:3.

To compare and evaluate the performance of the agricultural finance policy information extraction model, we built three baseline models: BERT-BiLSTM-CRF, BiLSTM-CRF and BiLSTM. The BiLSTM model uses pre-trained Word2Vec word vectors as input. The BiLSTM-CRF model, also with Word2Vec input, adds a CRF layer on top of the BiLSTM. The BERT-BiLSTM-CRF model has the same neural network structure as the proposed model but is initialised with the BERT-base Chinese pre-trained model.

Figure 4 illustrates the training of the four models over 100 epochs. As shown in the figure, the F1 scores of the two models that use a pre-trained language model (ELECTRA or BERT) for word embedding are significantly higher than those of the two models that use Word2Vec. Because ELECTRA and BERT share the same neural network structure, the convergence speed of these two models is essentially the same, and ELECTRA showed slightly better accuracy than BERT.

Fig. 4

Training of different models. CRF, conditional random field

Figure 5 illustrates the F1 scores of the four agricultural finance policy information extraction models on the agricultural finance policy test set. Compared with BiLSTM, BiLSTM-CRF and BERT-BiLSTM-CRF, the ELECTRA-BiLSTM-CRF has the highest F1 score on the test set. The model proposed in this paper also outperforms BERT-BiLSTM-CRF, which is based on the BERT model, indicating that the pre-training task developed for the problem of agricultural finance policy labelling is more appropriate.

Fig. 5

Testing of different models. CRF, conditional random field

Figure 6 illustrates the evaluation results of the four models under the same experimental environment and training parameters. According to Figure 6, the model using ELECTRA for word embedding outperforms both the model using BERT and the models without a pre-trained language model. After 100 epochs of training, the precision of the agricultural finance policy information extraction model is 92.02%, the recall is 88.31% and the F1 value is 89.73%, all of which are high. These results show that the proposed model has a stronger ability to extract features and higher information extraction accuracy.

Fig. 6

Evaluation results. CRF, conditional random field

Conclusion

To address the problem of obtaining structured key information from agricultural finance policies, this paper proposes an agricultural finance policy information extraction model that combines the ELECTRA model, a bidirectional long short-term memory network (BiLSTM) and a CRF. The ELECTRA pre-trained language model converts the plain text of agricultural finance policy into word vectors, the BiLSTM deep neural network learns the contextual information of the text from these vectors, and the CRF computes the globally optimal annotation sequence to produce structured agricultural finance policy information. Compared with the other models, the proposed model extracts information about agricultural finance policies more effectively.

Overall, the ELECTRA-BiLSTM-CRF could effectively identify agricultural finance policy information. For future work, the performance of information extraction of ELECTRA-BiLSTM-CRF will be improved and optimised.


[1] He Zhixiong, Qu Ruxiao. Agricultural policy finance supply and rural financial disincentives - empirical evidence from 147 counties[J]. Financial Research, 2015(2):138-159.

[2] Ding Zhiguo, Zhang Yang, Gao Qiran. Identification of rural financial factors affecting rural economic development based on regional economic differences[J]. China Rural Economy, 2014(3):4-13+26.

[3] Wang Yunsheng. Research on agricultural support and protection policy system in Jilin Province under the strategy of rural revitalization [D]. Jilin University, 2021.

[4] Li Qing, Qian Zaijian. The distribution of attention and its logical interpretation of agricultural policy changes in China[J]. Journal of Huazhong Agricultural University: Social Science Edition, 2021(4):108-118+183.

[5] Gao Youwen, Zhou Benjun, Hu Xiaofei. Research on convolutional neural network image recognition based on data enhancement[J]. Computer Technology and Development, 2018, 28(8):62-65.

[6] Yadav V, Bethard S. A Survey on Recent Advances in Named Entity Recognition from Deep Learning models[J]. 2019.

[7] Ren S, He K, Girshick R, et al. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks[C]//NIPS. 2016. doi:10.1109/TPAMI.2016.2577031

[8] Ayishathahira C, Sreejith C, Raseek C. Combination of Neural Networks and Conditional Random Fields for Efficient Resume Parsing[C]//International CET Conference on Control, 2018: 388-393. doi:10.1109/CETIC4.2018.8530883

[9] Liu X, Ding L, Shi Y, et al. FOTS: Fast Oriented Text Spotting with a Unified Network [J]. IEEE, 2018. doi:10.1109/CVPR.2018.00595

[10] Zu S, Wang X. Resume Information Extraction with a Novel Text Block Segmentation Algorithm[J]. International Journal on Natural Language Computing, 2019. doi:10.5121/ijnlc.2019.8503

[11] Zhou Junxian, Zhu Ruwei. A method for extracting structured information of business license using named entity recognition: CN112668335A [P]. 2021.

[12] Ding Jiawei, Liu Xiaodong. ELECTRA-CRF-based text-named entity recognition model for telecommunication network fraud cases [J]. Information Network Security, 2021(6):63-69.

[13] Devlin J, Chang M, Lee K, et al. Bert: Pre-training of Deep Bidirectional Transformers for Language Understanding [J]. Arxiv Preprint Arxiv:1810.04805, 2018.
