Open Access

Prediction of the Natural Gas Compressibility Factor by using MLP and RBF Artificial Neural Networks

Feb 24, 2025


Introduction

The accurate custody transfer of natural gas is a complex metering task that has always been the subject of metrological control under an international standard or a local regulation [6]. The proper selection of equipment and the development of a methodology for billing the metered quantities help to maintain a good relationship between all parties involved in custody transfer.

Natural gas is a complex mixture of different components [10]. It consists essentially of hydrocarbons (mainly methane) and minimal amounts of non-hydrocarbon components such as nitrogen, hydrogen sulfide, and carbon dioxide. The physical properties of natural gas are a fundamental issue in the gas industry [18], [5]. Among the most important are the compressibility factor, the calorific value, and the energy parameters, as well as the composition of the gas.

The compressibility factor is denoted by the symbol Z and its value is determined by the equation of state (EOS) with the following expression:

$$Z = \frac{V_{real}}{V_{ideal}} = \frac{V \cdot P}{n \cdot R \cdot T},$$

where $V_{real}$ is the volume at real conditions, $V_{ideal}$ is the volume at ideal conditions, R is the universal gas constant, n is the number of moles of the gas, T is the absolute temperature, P is the absolute pressure, and V is the volume of the gas. The values of the compressibility factor of natural gas are required for various engineering tasks such as pipeline design, gas metering and others. There are three ways to determine the values of the compressibility factor: experimental data, EOS, and empirical correlations.
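The definition above can be checked numerically with a minimal sketch. The function name and the sample state (pressure, amount of gas, temperature) are illustrative assumptions and are not taken from the study.

```python
# Minimal sketch of Z = P*V / (n*R*T); all sample values are hypothetical.
R_UNIVERSAL = 8.314462618  # universal gas constant, J/(mol*K)


def compressibility_factor(pressure_pa, volume_m3, moles, temperature_k):
    """Return Z for a gas sample from its measured state."""
    return (pressure_pa * volume_m3) / (moles * R_UNIVERSAL * temperature_k)


# Assumed state: 5 MPa, 1000 mol, 288.15 K
p, n, t = 5.0e6, 1000.0, 288.15

# A volume computed from the ideal gas law must give Z = 1
v_ideal = n * R_UNIVERSAL * t / p
z_ideal = compressibility_factor(p, v_ideal, n, t)

# A real gas occupying 90 % of the ideal volume gives Z = 0.9
z_real = compressibility_factor(p, 0.9 * v_ideal, n, t)
```

A value of Z below 1 means the real gas occupies less volume than the ideal gas law predicts at the same pressure and temperature.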

The use of highly sensitive flow measuring devices is very important to ensure a high standard of natural gas distribution systems. The volume flow is measured using various instruments, such as ultrasonic meters [7], turbine meters [20], and rotary meters [16]. In addition, the calorific value of the gas is another important parameter, which is calculated from the mole fractions of the individual components of the gas [28], [27]. For this reason, the gas operator must develop a method for converting the measured quantity of natural gas from volume units to energy units using an energy conversion device. The metering task is carried out by several separate devices [13], [4]: a flow computer (a volume conversion device), a volume flow meter, a natural gas chromatograph (a calorific value determination device), a pressure transmitter, and a temperature transmitter.

In contrast to volume flow measurement, for which many different devices can be used, the calorific value of the gas is most commonly determined with a gas chromatograph [3], which is preferred over calorimeters [27], [14] even though the latter are much simpler in design.

The process gas chromatograph (PGC) determines the physical composition of the natural gas on a molar basis, but it requires a high level of maintenance, such as the supply of carrier and reference gases and scheduled servicing. All this makes the PGC an expensive asset, and in most cases the gas operator reduces the installation costs by reducing the number of installation locations.

The ultrasonic flow meter (USM) is the most widely used and reliable measuring device for industrial purposes. It can provide measurements of the speed of sound (SoS) parameter, which is also used for internal operational diagnostics. The SoS is a parameter that can be observed not only in the USM, but also in the other blocks of the system, such as the gas chromatograph, the pressure and temperature transmitters. It is influenced by the gas composition, the pressure and temperature, the geometry of the measurement section and the transit time measurement of the flow meter.

In recent years, data-driven approaches have been increasingly used to optimize and predict control systems and processes [25], [24]. These are alternatives to conventional techniques, based on artificial intelligence methods. Among them, artificial neural networks (ANN) are the most preferred models. ANNs have a number of advantages, such as approximation of dependencies and high accuracy in prediction [17], [26]. ANN models are often used to increase the accuracy of flow-meter measurements or to predict the calibration process.

Tianjiao Zhang developed a convolutional ANN to determine the flow rate by analyzing the arrival time of the signal [29]. The one-dimensional (1-D) network, based on deep learning, was verified with real data received from a USM in a pipeline. Santhosh and Roy created an optimized neural network that adapts to changes in pipe diameter, liquid density and temperature [22]. The output signal of the measuring device is a frequency that is converted into a voltage using a suitable converter. The implemented network avoids the need for re-calibration when these parameters change.

The majority of the developed neural networks are based on the multi-layer perceptron (MLP) architecture to evaluate the compressibility factor, calorific value and SoS of natural gas. Jingya Dong et al. implemented an ANN for the evaluation of the compressibility factor and the SoS [8]. An MLP-ANN was created and trained with four different algorithms: Gradient Descent, Levenberg-Marquardt, Conjugate Gradient Descent, and Bayesian Regularization. A multiple linear regression approach was used to create a model that includes three of the training methods.

An example of how the SoS value is dependent on the gas mixture is shown in study [15], in which measurements of natural gas with or without the addition of hydrogen (H2) were compared. The SoS is significantly higher when the hydrogen is part of the natural gas. This offers the possibility of accurately measuring the H2 content only by using ultrasonic gas flow meters.

The other type of neural network that is very often used successfully for predicting natural gas properties is the radial basis function (RBF) ANN. RBF-ANNs have characteristics and properties similar to those of MLP-ANNs. Mohammad Hadi Shateri et al. [23] developed the Wilcoxon Generalized RBF-ANN to predict the compressibility factor of natural gas. An average relative error of 2.3 % was determined and the results were compared with various empirical correlations and equations of state. Elsayed et al. [9] predicted the compressibility factor based on 5490 datasets containing pseudo-reduced values of pressure and temperature. The study was carried out using an RBF-ANN, a Support Vector Machine and a Functional Network. The best results were obtained for the RBF-ANN, with a correlation coefficient of 0.99 and an average absolute error of 0.14 %.

The aim of this article is to investigate two ANN models based on MLP and RBF to predict the compressibility factor Z of natural gas. The dataset includes measurement data from a USM for the SoS, the pressure P and the temperature T. The models are evaluated against key performance indicators, and the properties of the two models are compared.

Subject & Methods
Measurement principle

It is characteristic of gases that ideal conditions are reached when the pressure approaches zero. Under real conditions, gases are characterized by a compressibility factor due to various intermolecular interactions. This parameter indicates the extent to which the real gas differs from the ideal gas at a given value of temperature and pressure. Of the thermodynamic parameters, the compressibility factor is an extremely important and critical parameter. There are numerous scientific studies in the literature based on different methods and correlations to calculate the compressibility factor, such as the Standing and Katz chart, the Dranchuk and Abou-Kassem correlation, etc., each of which has different advantages and disadvantages [19], [1].

Some of the main problems are related to the need for more computation time due to the increased complexity of the method or the increased error values in the data range.

The measurement principle is shown in Fig. 1. The USM operates by measuring the propagation times of ultrasonic pulses that travel between two transducers at the SoS of the gas. The transducers are installed in a direct path, and both continuously alternate their roles as transmitter and receiver.

Fig. 1.

Ultrasonic flow measurement principle.

The volume flow rate of natural gas at standard conditions can be determined using the following equation [16]:

$$Q_S = K_1 \cdot u \cdot A,$$

where $Q_S$ is the volume flow rate of the gas at standard conditions, $K_1$ is the conversion coefficient, u is the flow velocity determined by the USM, and A is the cross-section of the pipeline. The value of $K_1$ can be determined by the following equation:

$$K_1 = \left( \frac{T_b}{T_f} \right) \cdot \left( \frac{P_f}{P_b} \right) \cdot \left( \frac{1}{Z_f} \right),$$

where T stands for the temperature, P for the pressure, Z for the compressibility factor, and the indices f and b denote the flow and the standard conditions, respectively.
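The conversion to standard conditions can be sketched in a few lines. The function names, the pipe diameter and the operating values below are hypothetical and serve only to show how $K_1$ enters the calculation.

```python
import math


def conversion_coefficient(t_base_k, t_flow_k, p_flow, p_base, z_flow):
    """K1 = (Tb/Tf) * (Pf/Pb) * (1/Zf); pressures must share one unit."""
    return (t_base_k / t_flow_k) * (p_flow / p_base) * (1.0 / z_flow)


def standard_volume_flow(velocity_m_s, area_m2, k1):
    """Q_S = K1 * u * A, in m^3/s at standard conditions."""
    return k1 * velocity_m_s * area_m2


# Assumed operating point: 40 bar(a), 293.15 K, Z_f = 0.92, DN300 pipe
diameter_m = 0.3
area = math.pi * diameter_m ** 2 / 4.0
k1 = conversion_coefficient(288.15, 293.15, 40.0, 1.01325, 0.92)
q_s = standard_volume_flow(10.0, area, k1)
```

Because $Z_f$ appears in the denominator, an error in the compressibility factor propagates directly, with the same relative size, into the converted volume flow.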

The actual SoS in the gas under operating conditions can be calculated for each path from the sum of the reciprocals of the two measured propagation times (in and against the direction of flow):

$$SoS_i = \frac{L_i}{2} \cdot \left( \frac{1}{t_{tr_i}} + \frac{1}{t_{rec_i}} \right),$$

where $SoS_i$ is the speed of sound on path i, $L_i$ is the path length, $t_{tr_i}$ is the propagation time in the direction of flow, $t_{rec_i}$ is the propagation time against the direction of flow, and i is the current path.

The following formula applies to multipath ultrasonic meters:

$$SoS = \frac{1}{n} \cdot \sum_{i=1}^{n} SoS_i,$$

where n is the number of paths.

The flow velocity can be determined by the following equation:

$$u = \frac{L}{2 \cdot \cos \alpha} \cdot \left( \frac{1}{t_{tr}} - \frac{1}{t_{rec}} \right),$$

where α is the angle between the acoustic path and the pipe axis.
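The transit-time relations above can be verified with a short round-trip sketch: synthetic propagation times are generated from an assumed speed of sound and flow velocity, and the formulas recover both. The path length, angle and gas values are illustrative assumptions, as are the function names.

```python
import math


def path_sos(length_m, t_tr, t_rec):
    """Speed of sound on one path from the two propagation times."""
    return 0.5 * length_m * (1.0 / t_tr + 1.0 / t_rec)


def path_velocity(length_m, alpha_rad, t_tr, t_rec):
    """Flow velocity from the difference of the reciprocal times."""
    return length_m / (2.0 * math.cos(alpha_rad)) * (1.0 / t_tr - 1.0 / t_rec)


def mean_sos(path_values):
    """Arithmetic mean over all paths of a multipath meter."""
    return sum(path_values) / len(path_values)


# Assumed values: c = 400 m/s, u = 12 m/s, 0.5 m path at 60 degrees
c_true, u_true = 400.0, 12.0
length, alpha = 0.5, math.radians(60.0)

# A pulse travelling with the flow is faster, against the flow slower
t_tr = length / (c_true + u_true * math.cos(alpha))
t_rec = length / (c_true - u_true * math.cos(alpha))

sos = path_sos(length, t_tr, t_rec)
u = path_velocity(length, alpha, t_tr, t_rec)
sos_mean = mean_sos([sos, sos, sos, sos])  # four identical paths
```

The sum of the reciprocal times cancels the flow term and the difference cancels the sound-speed term, which is why a single pair of measurements yields both quantities.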

The compressibility factor is defined in the ISO 20765-2 standard. To determine its value, information about the components of the gas is required. Despite the many studies on the estimation of the compressibility factor, there is no general equation that is valid under all conditions. For this reason, alternative methods such as ANN are increasingly used, which offer more possibilities to study the relationships between the variables.

The selection of input parameters for the evaluation of the compressibility factor is based on the well-known relationships and correlations used in traditional methods. The compressibility factor calculated from empirical correlations is characterized by relatively low accuracy and simplified dependencies, which usually include temperature, pressure, and gas composition. The EOS approach has high accuracy because two equations of state have been created specifically for industrial purposes – AGA8 and GERG 2008. In these equations, temperature, pressure, and gas composition data are required to determine the compressibility factor.

In general, both methods, empirical correlations and EOS, require the same input parameters. When a USM is used in the measurement system, it is known from the propagation law of SoS that the ultrasonic velocity is different for the various components of the natural gas. For this reason, the SoS can be used to evaluate individual components and does not require the use of complex and expensive equipment to assess the composition of the gas.

In the present study, an intelligent approach was developed to determine the compressibility factor without the need for information about the composition of the gas.

Multi-layer perceptron model

The ANN is a computer-based tool for parallel processing of information, further classification and forecasting without the knowledge of the functional relationship between the input and output parameters – typical examples are given in [12], [30]. The ANN is executed by interconnected processing units called neurons. They are organized in layers and form the structure of the network. Depending on the location of the neurons, the layers can be divided into three types: input, hidden, and output.

The action of a neuron with the number j can be represented by the following equations [11]:

$$y_j' = \sum_{i=1}^{n} w_{i,j} u_i + b_j,$$

where $y_j'$ is the weighted sum of the inputs of neuron j, $w_{i,j}$ is the weight of input $u_i$, and $b_j$ is the bias of the neuron with the number j. The output $y_j$ is obtained by applying the activation function f, which is generally non-linear:

$$y_j = f\left( \sum_{i=1}^{n} w_{i,j} u_i + b_j \right)$$
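As a minimal sketch of the two neuron equations, the weighted sum and the activation can be written with NumPy. The input values, weights and bias are arbitrary, and the hyperbolic tangent is only one possible choice of f.

```python
import numpy as np


def neuron_output(u, w, b, f=np.tanh):
    """Weighted sum of the inputs plus bias, passed through activation f."""
    net = np.dot(w, u) + b  # net input of the neuron
    return f(net)           # neuron output


# Arbitrary example: three inputs, three weights, one bias
u = np.array([0.5, -1.0, 2.0])
w = np.array([0.2, 0.4, 0.1])
b = 0.6

y = neuron_output(u, w, b)                        # tanh of the net input
y_linear = neuron_output(u, w, b, f=lambda x: x)  # linear activation
```

With these numbers the net input is 0.1 - 0.4 + 0.2 + 0.6 = 0.5, so the linear neuron returns 0.5 and the non-linear one returns tanh(0.5).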

The current study uses a feed-forward topology based on MLP to predict the compressibility factor Z using an ultrasonic flow meter. A Levenberg-Marquardt (LM) and a scaled conjugate gradient descent (SCGD) training algorithm were used to train the network. The structure of the MLP-ANN is shown in Fig. 2.

Fig. 2.

Schematic diagram of the MLP-ANN.

Radial basis function network design

RBF networks are widely used due to some important advantages, such as their relatively simple structure and the availability of faster training algorithms [22], [23]. The neural network architecture shown in Fig. 3 includes three layers in a feed-forward arrangement: an input layer, a hidden layer and an output layer. In the RBF-ANN model, the activation function in the hidden layer depends on the distance between the input signal and a given central point of the neuron. The neurons in this layer usually have Gaussian transfer functions of the following form [2]:

$$\varphi_i(x) = \exp\left( -\frac{\left\| x - \mu_i \right\|^2}{2 \sigma_i^2} \right),$$

where $\varphi_i$ is the nonlinear function of element i, x is the input vector, $\mu_i$ is the center of element i, and $\sigma_i$ is the spread of the Gaussian function of element i. The output signal of the RBF-ANN has the following form:

$$Y_k(x) = \sum_{i=1}^{m} w_{k,i} \cdot \varphi_i(x) + w_{k0},$$

where m is the number of basis functions, $w_{k,i}$ is the weight between basis function i and output k, $\varphi_i$ is the nonlinear function of element i, and $w_{k0}$ is the bias weight of the output layer. The RBF network was trained with different spread values in the interval from 0 to 1.
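The two RBF equations above translate into a short forward pass. The sketch below assumes a single output and uses made-up centers, spreads and weights; it shows the computation only, not the trained network from the study.

```python
import numpy as np


def gaussian_rbf(x, centers, sigma):
    """Hidden-layer activations for inputs x (N, d) and centers (m, d)."""
    diff = x[:, None, :] - centers[None, :, :]  # (N, m, d)
    sq_dist = np.sum(diff ** 2, axis=2)          # squared distances
    return np.exp(-sq_dist / (2.0 * sigma ** 2))  # (N, m)


def rbf_output(x, centers, sigma, weights, bias):
    """Weighted sum of the basis functions plus the output bias."""
    return gaussian_rbf(x, centers, sigma) @ weights + bias


# Hypothetical network: two hidden neurons, three inputs
centers = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
sigma = np.array([0.5, 0.5])
weights = np.array([2.0, -1.0])
bias = 0.5

phi = gaussian_rbf(centers, centers, sigma)  # evaluate at the centers
y = rbf_output(centers, centers, sigma, weights, bias)
```

An input that coincides with a center activates that neuron fully (value 1), while the activation of the other neuron decays with the squared distance, here to exp(-6).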

Fig. 3.

Schematic diagram of the RBF-ANN.

The MLP-ANN and RBF-ANN studies were performed using MathWorks Matlab software.

Data collection

For the purposes of this research, 151 days of daily average data were collected, processed, and used. The real data set includes 604 values recorded during continuous operation of the measuring devices at a gas transmission station on the territory of Bulgaria. The measuring equipment includes a USM, model SICK Flowsic600-XT Quatro (calibrated measuring range 1.00 ÷ 120.000 m³/h, relative error 0.2 %), a Rosemount 3051S pressure sensor system (relative error 0.035 %), and a Rosemount 3144P temperature sensor system (relative error 0.25 %). Based on these data and using an ANN, it was possible to predict the compressibility factor Z, which is a key parameter for calculating the volume flow at base conditions.

The experimental data for this research was divided into two parts – the first is the training set and comprises 70 % of the observations used to train the MLP and RBF networks. The second set consists of the remaining 30 % of the data for validation. The data forming the training set is necessary for modifying the weights of the connections between the neurons in the structure of the neural network. The validation set is used for the overfitting analysis to obtain an optimal model. This dataset provides an implementation of the “early stopping” technique, which stops the learning of the network, when the validation errors start to increase compared to the training error. This technique overcomes the problems of overfitting and underfitting of ANNs.
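The 70/30 split and the early-stopping rule can be sketched as follows. The random shuffling, the seed, the patience of three epochs and the validation-error curve are all assumptions made for illustration; the text does not state how the split was drawn or which stopping threshold was used.

```python
import numpy as np


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle the sample indices and split them into two disjoint sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]


def early_stopping_epoch(val_errors, patience=3):
    """Return the epoch with the lowest validation error seen before
    the error failed to improve for `patience` consecutive epochs."""
    best, best_epoch, waited = float("inf"), -1, 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break  # validation error keeps rising: stop training
    return best_epoch


# 151 daily records, as in the data set described above
train_idx, val_idx = split_indices(151)

# Hypothetical validation-error curve: improves, then starts to rise
val_curve = [0.50, 0.40, 0.30, 0.31, 0.32, 0.33, 0.20]
stop_at = early_stopping_epoch(val_curve)
```

In this example the weights from epoch 2 would be kept: the later value of 0.20 is never reached because training has already stopped after three epochs without improvement.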

Performance indicators

The evaluation of the best models is based on standard performance indicators: the correlation coefficient R², the root mean square error (RMSE), the mean absolute error (MAE), and the mean square normalized error (MSNE) [21]. The correlation coefficient R² takes values in the interval from 0 to 1. Values of R² that are closer to 1 indicate a better model. In contrast, the error values for the better models should be close to zero.

$$R^2 = \left( \frac{\sum_{t=1}^{n} \left( y_t - y_t^M \right) \cdot \left( \hat{y}_t - \hat{y}_t^M \right)}{n \cdot s_{forec} \cdot s_{obs}} \right)$$

$$RMSE = \sqrt{MSE} = \sqrt{\frac{\sum_{t=1}^{n} \left( y_t - \hat{y}_t \right)^2}{n}}$$

$$MAE = \frac{1}{n} \sum_{t=1}^{n} \left| y_t - \hat{y}_t \right|$$

$$MSNE = \frac{1}{n} \sum_{t=1}^{n} \frac{\left( y_t - \hat{y}_t \right)}{\left( \sum_{1}^{n} y_t \right) \cdot \left( \sum_{1}^{n} \hat{y}_t \right)}$$

where n is the total number of observations, $y_t$ and $\hat{y}_t$ are the predicted and observed values, $y_t^M$ and $\hat{y}_t^M$ are the means of the predicted and observed values, and $s_{forec}$ and $s_{obs}$ are the standard deviations of the predicted and observed values, respectively.
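Three of the indicators can be sketched directly from their definitions. The sketch covers the correlation expression, RMSE and MAE; MSNE is left out because its normalization depends on the software implementation. The sample values are invented, and the population standard deviation (divisor n) is an assumption chosen to match the factor n in the denominator of the correlation expression.

```python
import numpy as np


def correlation(y, y_hat):
    """Correlation expression with population standard deviations."""
    n = y.size
    cov_sum = np.sum((y - y.mean()) * (y_hat - y_hat.mean()))
    return float(cov_sum / (n * y.std() * y_hat.std()))  # ddof=0 by default


def rmse(y, y_hat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def mae(y, y_hat):
    """Mean absolute error."""
    return float(np.mean(np.abs(y - y_hat)))


# Invented sample: four values and their model estimates
y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.8])

r = correlation(y, y_hat)
e_rmse = rmse(y, y_hat)
e_mae = mae(y, y_hat)
```

For this sample the absolute deviations are 0.1, 0.1, 0.2 and 0.2, so MAE is 0.15 and RMSE is the square root of 0.025. All three functions are symmetric in their two arguments, so the order of predicted and observed values does not change the result.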

Results
Multi-layer perceptron ANN model

The main problem in the development of the MLP architecture is the allocation of the hidden layers and the number of neurons in these layers. Many attempts have been made to determine the structure of the neural network. The main criterion was to achieve the best value of the correlation coefficient R² and the lowest values of MSNE, RMSE, and MAE. Two training algorithms, LM and SCGD, were applied to investigate the number of hidden neurons.

The influence of the number of neurons in the hidden layer for the LM algorithm is shown in Fig. 4. The evaluation is compiled in the form of MSNE, RMSE, and MAE values. The best result of the MLP-ANN structure for the LM algorithm is obtained for 51 neurons in the hidden layer. The effect of the number of hidden neurons for the SCGD learning algorithm is shown in Fig. 5. The best values for MSNE, RMSE, and MAE are obtained for 56 neurons.

Fig. 4.

Effect of the number of hidden neurons of the MLP-ANN for the LM algorithm.

Fig. 5.

Effect of the number of hidden neurons of the MLP-ANN for the SCGD algorithm.

A comparative analysis of the LM and SCGD algorithms can be found in Table 1.

Table 1.

Comparative analysis of LM and SCGD algorithms.

| Algorithm | R² | MSNE | RMSE | MAE |
|---|---|---|---|---|
| LM | 0.99032 | 0.0581 | 0.1206 | 0.087 |
| SCGD | 0.94229 | 0.0953 | 0.1543 | 0.1144 |

The table shows that the MSNE and RMSE errors for the LM algorithm are slightly lower compared to the SCGD algorithm, with the correlation coefficient significantly in favor of the first algorithm. MLP-ANN shows the best fit for the optimal number of hidden neurons – 51 for the LM algorithm. The developed three-layer neural network with a 3-51-1 topology and LM algorithm was selected for the next analysis. The input layer includes three neurons (nodes) to which the database of temperature, pressure and SoS was applied. The hidden layer comprises 51 neurons and the output layer is represented by 1 node for the output parameter – the compressibility factor.
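The 3-51-1 topology can be illustrated with a forward pass through untrained weights. The random weights, the scaling of the inputs to the interval from -1 to 1 and the batch of four samples are assumptions; only the layer sizes and the tansig function come from the text. The sketch uses the identity tansig(n) = 2/(1 + exp(-2n)) - 1 = tanh(n).

```python
import numpy as np


def tansig(n):
    """Hyperbolic tangent sigmoid, equal to tanh(n)."""
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 3 inputs -> 51 hidden neurons -> 1 output."""
    hidden = tansig(x @ w_hidden + b_hidden)  # shape (N, 51)
    return tansig(hidden @ w_out + b_out)     # shape (N, 1)


rng = np.random.default_rng(0)

# Untrained, randomly initialized parameters (illustration only)
w_hidden = rng.normal(scale=0.5, size=(3, 51))
b_hidden = np.zeros(51)
w_out = rng.normal(scale=0.5, size=(51, 1))
b_out = np.zeros(1)

# Four samples of scaled temperature, pressure and SoS
x = rng.uniform(-1.0, 1.0, size=(4, 3))
z_scaled = mlp_forward(x, w_hidden, b_hidden, w_out, b_out)

# Number of adjustable parameters in the 3-51-1 network
n_params = w_hidden.size + b_hidden.size + w_out.size + b_out.size
```

The network has 3·51 + 51 + 51 + 1 = 256 adjustable parameters. With a tansig output neuron the prediction is bounded between -1 and 1, so the target Z has to be scaled into that range before training and scaled back afterwards.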

A linear transfer function of the output neuron of the ANN structure was chosen. The training process is performed with a learning rate of 0.05 and a number of epochs determined after examining the training and validation errors. Different types of activation functions were also tested. In general, the best overall performance (the lowest errors combined with a high R²) was obtained for ‘tansig’ in both the hidden layer and the output neuron. The results are shown in Table 2.

Table 2.

Tested combinations of activation functions of MLP-ANN.

| Activation function, hidden layer | Activation function, output layer | R² | MSNE | RMSE | MAE |
|---|---|---|---|---|---|
| tansig | tansig | 0.99032 | 0.0581 | 0.1206 | 0.087 |
| tansig | purelin | 0.99219 | 0.3866 | 0.3109 | 0.2363 |
| logsig | tansig | 0.94438 | 0.1034 | 0.1607 | 0.1072 |
| logsig | purelin | 0.98062 | 0.6353 | 0.3985 | 0.3117 |
| purelin | tansig | 0.82875 | 0.1136 | 0.1685 | 0.1184 |
| logsig | logsig | 0.83831 | 0.2505 | 0.2502 | 0.1884 |
| tansig | logsig | 0.85305 | 0.2536 | 0.2518 | 0.195 |
| purelin | logsig | 0.68672 | 0.2955 | 0.2718 | 0.2067 |

The result of the regression analysis for a simulation with the optimal MLP-ANN architecture is shown in Fig. 6.

Fig. 6.

Scatter plot of predicted values versus observed values for MLP-ANN.

After running the simulation process through the 3-51-1 MLP-ANN, the optimal results obtained are: R² = 0.99032, MSNE = 0.0581, RMSE = 0.1206, and MAE = 0.087.

The results of the predicted versus observed values after the performed MLP simulation are shown in Fig. 7.

Fig. 7.

Plot of predicted values versus observed values for MLP-ANN.

The most appropriate selection of activation functions in the MLP layers was also performed. The tested combinations for the selected variants of the tansig, logsig and purelin functions are listed in Table 2.

The tests of the different activation functions show that three combinations (logsig-purelin, tansig-tansig and logsig-tansig) have similar characteristics, which come close to those of tansig-purelin without reaching them.

The study of the prediction of the compressibility factor Z with MLP shows a very good accuracy of the results obtained, which is comparable to the values obtained by other researchers in this field.

Radial basis function ANN model

An RBF-ANN was developed with three input nodes for temperature, pressure and SoS and an output neuron in the third layer for the compressibility factor of the natural gas. The evaluation of the RBF-ANN was based on the same performance indicators: R², RMSE, MSNE, and MAE. To obtain an optimal architecture of the network, experiments were performed with different numbers of neurons in the hidden layer and different spread values. The analysis covers hidden-layer sizes from 10 to 140 neurons and spread values from 0.1 to 0.5. The experimental data used to train and test the RBF-ANN are the same as those used for the MLP-ANN. Table 3 shows the influence of the number of neurons in the hidden layer and of the different spread values. The best result is obtained for a spread value SV = 0.1 and HN = 140 hidden neurons, with the performance indicators R² = 0.99899, RMSE = 0.0135, and MAE = 0.0075. Fig. 8 and Fig. 9 show a regression plot and a plot of the predicted versus the observed values of the compressibility factor for RBF-ANN.
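A search over spread values and hidden-layer sizes can be sketched as a small grid loop. Everything in the sketch is a stand-in: the data are synthetic, the centers are drawn at random from the training set, and the output weights are found by linear least squares. This swapped-in training procedure is simpler than the Matlab routine used in the study and is not expected to reproduce the values in Table 3.

```python
import numpy as np


def design_matrix(x, centers, spread):
    """Gaussian activations plus a constant column for the output bias."""
    sq_dist = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=2)
    phi = np.exp(-sq_dist / (2.0 * spread ** 2))
    return np.hstack([phi, np.ones((x.shape[0], 1))])


def fit_rbf(x_train, y_train, n_hidden, spread, rng):
    """Pick random training points as centers, solve for output weights."""
    idx = rng.choice(x_train.shape[0], size=n_hidden, replace=False)
    centers = x_train[idx]
    a = design_matrix(x_train, centers, spread)
    weights, *_ = np.linalg.lstsq(a, y_train, rcond=None)
    return centers, weights


def rmse(y, y_hat):
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


rng = np.random.default_rng(1)

# Synthetic stand-in for scaled temperature, pressure and SoS
x = rng.uniform(0.0, 1.0, size=(151, 3))
y = 0.9 + 0.05 * np.sin(3.0 * x[:, 0]) - 0.03 * x[:, 1] * x[:, 2]
x_train, y_train = x[:106], y[:106]
x_val, y_val = x[106:], y[106:]

# Grid over spread values and hidden-layer sizes (hypothetical grid)
results = {}
for spread in (0.1, 0.3, 0.5):
    for n_hidden in (20, 40, 80):
        centers, weights = fit_rbf(x_train, y_train, n_hidden, spread, rng)
        y_hat = design_matrix(x_val, centers, spread) @ weights
        results[(spread, n_hidden)] = rmse(y_val, y_hat)

best = min(results, key=results.get)  # (spread, n_hidden) with lowest RMSE
```

Scoring each configuration on held-out data rather than on the training set matters here, because a network with many narrow basis functions can fit the training points almost exactly while generalizing poorly.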

Fig. 8.

Scatter plot of predicted values versus observed values for RBF-ANN.

Fig. 9.

Plot of predicted values versus observed values for RBF-ANN.

Table 3.

Influence of hidden neurons of RBF-ANN.

| Spread value | Neurons | R² | MSNE | RMSE | MAE |
|---|---|---|---|---|---|
| 0.1 | 140 | 0.99899 | 0.00073 | 0.0135 | 0.0075 |
| 0.3 | 140 | 0.99742 | 0.0019 | 0.0215 | 0.0108 |
| 0.5 | 140 | 0.99477 | 0.0038 | 0.0306 | 0.014 |
| 0.1 | 130 | 0.9973 | 0.0019 | 0.022 | 0.0141 |
| 0.3 | 130 | 0.99257 | 0.0053 | 0.0365 | 0.0181 |
| 0.5 | 130 | 0.99272 | 0.0052 | 0.0361 | 0.0177 |
| 0.1 | 120 | 0.9936 | 0.0046 | 0.0339 | 0.02 |
| 0.3 | 120 | 0.98875 | 0.0081 | 0.0449 | 0.0268 |
| 0.5 | 120 | 0.98833 | 0.0084 | 0.0457 | 0.0266 |

Discussion

The aim of this study is to apply a machine-learning approach to predict the compressibility factor of natural gas depending on the data of three input parameters: SoS, temperature, and pressure. The intelligent approach used is based on ANN with a hidden layer.

The developed neural networks based on the MLP architecture were evaluated for two learning algorithms: LM and SCGD. The results of the comparative analysis show the better performance of the LM algorithm. This is observed in the error values and the value of the coefficient of determination R². The obtained results are in line with those of other researchers [8], who reported similar values for the parameters R² (0.98 ÷ 0.99) and RMSE (0.1 ÷ 0.15).

The characteristics of MLP-ANN for different activation functions in the individual layers were investigated. Experiments were performed with different variants of the tansig, purelin and logsig functions. The results obtained show the best values for two of the test variants: tansig-tansig and tansig-purelin. For the second combination, R² has a higher value, but the error values are significantly higher. Despite the similar characteristics, it can be generally stated that the tansig-tansig combination shows the best behavior of MLP-ANN.

The effectiveness of the modeling process of the compressibility factor of natural gas by MLP-ANN and RBF-ANN was evaluated. The models were compared on the basis of the investigated performance indicators R², MSNE, RMSE and MAE. The results of the analysis are shown in Table 4.

Table 4.

Comparison between MLP and RBF models.

| Type ANN | R² | MSNE | RMSE | MAE |
|---|---|---|---|---|
| MLP-ANN | 0.99032 | 0.0581 | 0.1206 | 0.087 |
| RBF-ANN | 0.99899 | 0.000729 | 0.0135 | 0.0075 |

From the values obtained for the indicators, it can be summarized that the quality of both models is high and very close to the results obtained by other researchers using the same methods. The values of the correlation coefficient R² are very close to each other: the RBF-ANN model reaches R² = 0.99899 compared to R² = 0.99032 for the MLP-ANN, a minimal difference. The obtained values of MSNE, RMSE and MAE show that the RBF-ANN model has better properties. The comparative analysis and the values obtained for the coefficient of determination R² and the errors are consistent with the data presented by other researchers [8], [9].

Fig. 10 shows a comparison of the relative errors (RE) for the MLP-ANN and RBF-ANN models. It can be clearly seen that the relative errors for the two ANN models are quite different. Over the analyzed range, the RE values for RBF-ANN are several times lower than those for MLP-ANN. From the comparison between Table 4 and Fig. 7, Fig. 9 and Fig. 10, it can be concluded that the correlation coefficients R² are very close to each other, while the difference in the MSNE, RMSE, MAE and RE errors is significantly in favor of the RBF-ANN model.
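A per-sample relative error of the kind compared above can be computed as in the sketch below. The text does not give the RE formula, so the definition used here, the deviation of the predicted value from the observed value in percent of the observed value, is an assumption, as are the sample Z values.

```python
import numpy as np


def relative_error_percent(observed, predicted):
    """Assumed definition: RE = (predicted - observed) / observed * 100."""
    return (predicted - observed) / observed * 100.0


# Invented Z values for two samples and two hypothetical models
z_observed = np.array([0.90, 0.92])
z_model_a = np.array([0.909, 0.9108])    # deviates by +1 % and -1 %
z_model_b = np.array([0.9009, 0.91908])  # deviates by +0.1 % and -0.1 %

re_a = relative_error_percent(z_observed, z_model_a)
re_b = relative_error_percent(z_observed, z_model_b)

# Ratio of the largest absolute relative errors of the two models
ratio = np.max(np.abs(re_a)) / np.max(np.abs(re_b))
```

Unlike RMSE or MAE, the relative error is evaluated sample by sample, which makes it suitable for a plot over the whole data range.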

Fig. 10.

Comparison of relative errors of the MLP-ANN and the RBF-ANN model.

Conclusion

The current study presents a comparative analysis of two intelligent approaches based on ANN for modeling the compressibility factor of natural gas. Real data from sensors and devices in a gas distribution station on the territory of the Republic of Bulgaria were used for the study. The capabilities of MLP-ANN and RBF-ANN for predicting the Z-value are presented so that the results can be used for the further calculation of the volume flow at base conditions.

The ANN approach shows very good abilities and characteristics of the developed models, which can be successfully used for the prediction of the compressibility factor of natural gas. From the results of the comparison of the two methods, it can be concluded that the RBF-ANN has better characteristics. The best values of R² = 0.99899, MSNE = 0.000729, RMSE = 0.0135 and MAE = 0.0075 for RBF-ANN are similar to the values obtained by other researchers. From the analyses performed, it can be concluded that the better model was obtained by the RBF method.

The graphical interpretation of the comparison between the relative errors for the two models shows that the RBF-ANN model has a clear advantage. Its RE values are many times smaller than those of MLP-ANN. The experiments carried out indicate that the RBF-ANN model describes the experimental dataset better, as shown by the better values of all indicators, that is, the correlation coefficient and the errors.

In this study, the compressibility Z-factor of natural gas can be calculated from the values of three input variables: temperature, pressure and SoS, without the need for chromatographic analysis. The developed ANN is able to realize a high-quality prediction of the Z-factor of natural gas with sufficiently high accuracy by using only an ultrasonic flow meter.

Language:
English
Publication timeframe:
6 times per year
Journal Subjects:
Engineering, Electrical Engineering, Control Engineering, Metrology and Testing