Prediction of the Natural Gas Compressibility Factor by using MLP and RBF Artificial Neural Networks
Published Online: Feb 24, 2025
Page range: 1 - 9
Received: Jul 18, 2024
Accepted: Jan 08, 2025
DOI: https://doi.org/10.2478/msr-2025-0001
© 2025 Neven Kanchev et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
The accurate custody transfer of natural gas is a complex metering task that has always been the subject of metrological control under an international standard or a local regulation [6]. The proper selection of equipment and the development of a methodology for billing the metered quantities help to maintain a good relationship between all parties involved in custody transfer.
Natural gas is a complex mixture of different components [10]. It consists mainly of hydrocarbons (predominantly methane) together with small amounts of non-hydrocarbon components such as nitrogen, hydrogen sulfide, and carbon dioxide. The physical properties of natural gas are of fundamental importance in the gas industry [18], [5]. Among the most important are the compressibility factor, the calorific value, and the energy parameters, as well as the gas composition.
The compressibility factor is denoted by the symbol Z.
The use of highly sensitive flow measuring devices is very important to ensure a high standard of natural gas distribution systems. The measurement of volume flow is carried out using various measuring instruments, such as ultrasonic meters [7], turbine meters [20], and rotary meters [16]. In addition, the calorific value of the gas is another important parameter, which is calculated from the mole fractions of the individual components of the gas [28], [27]. For this reason, the gas operator must develop a method for converting the measured quantity of natural gas from volume units to energy units using an energy conversion device. The metering task is performed by several separate devices [13], [4]: a flow computer (a volume conversion device), a volume flow meter, a natural gas chromatograph (a calorific value determination device), a pressure transmitter, and a temperature transmitter.
In contrast to volume flow measurement, where many different devices can be used, the calorific value of the gas is most commonly determined with a gas chromatograph [3], which is preferred over calorimeters [27], [14] even though the latter are much simpler in design.
The process gas chromatograph (PGC) determines the composition of the natural gas on a molar basis, but it requires a high level of maintenance, including the supply of carrier and reference gases and scheduled servicing. All this makes the PGC an expensive asset, and in most cases the gas operator reduces costs by reducing the number of installation sites.
The ultrasonic flow meter (USM) is the most widely used and reliable measuring device for industrial purposes. It can provide measurements of the speed of sound (
In recent years, data-driven approaches have been increasingly used to optimize and predict control systems and processes [25], [24]. These are alternatives to conventional techniques, based on artificial intelligence methods. Among them, artificial neural networks (ANN) are the most preferred models. ANNs have a number of advantages, such as approximation of dependencies and high accuracy in prediction [17], [26]. ANN models are often used to increase the accuracy of flow-meter measurements or to predict the calibration process.
Tianjiao Zhang developed a convolutional ANN to determine the flow rate by analyzing the arrival time of the signal [29]. Based on deep learning, the constructed one-dimensional (1-D) network was verified with real data received from a USM in a pipeline. Santhosh and Roy created an optimized neural network that adapts to changes in pipe diameter, liquid density, and temperature [22]. The output signal of the measuring device is a frequency that is converted into a voltage using a suitable converter. The implemented network avoids the need for re-calibration when these parameters change.
The majority of the developed neural networks are based on the multi-layer perceptron (MLP) architecture to evaluate the compressibility factor, calorific value and
An example of how the
The other type of neural network that is often used successfully for predicting natural gas properties is the radial basis function (RBF) ANN, which has characteristics and properties similar to those of the MLP-ANN. Mohammad Hadi Shateri et al. [23] developed the Wilcoxon Generalized RBF-ANN to predict the compressibility factor of natural gas. An average relative error of 2.3 % was determined, and the results were compared with various empirical correlations and equations of state. Elsayed et al. [9] predicted the compressibility factor based on 5490 data sets containing pseudo-reduced values of pressure and temperature. The study was carried out using an RBF-ANN, a Support Vector Machine, and a Functional Network. The best results were obtained for the RBF-ANN, with a correlation coefficient of 0.99 and an average absolute error of 0.14 %.
The aim of this article is to investigate two ANN models based on MLP and RBF to predict the compressibility factor
Gases approach ideal behavior as the pressure approaches zero. Under real conditions, intermolecular interactions cause deviations from this behavior, which are described by the compressibility factor. This parameter indicates the extent to which the real gas differs from the ideal gas at a given temperature and pressure, and it is one of the most critical thermodynamic parameters. There are numerous scientific studies in the literature based on different methods and correlations to calculate the compressibility factor, such as the Standing and Katz chart, the Dranchuk and Abou-Kassem correlation, etc., each of which has different advantages and disadvantages [19], [1].
Some of the main problems are longer computation times caused by the increased complexity of the method, or increased error values in parts of the data range.
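As a point of reference, the compressibility factor is conventionally defined as the ratio of the real gas volume to the volume predicted by the ideal gas law at the same temperature and pressure. The symbols below (p for absolute pressure, V for volume, n for amount of substance, R for the universal gas constant, T for absolute temperature) follow the usual textbook notation and are assumed here rather than taken from the article:

```latex
Z = \frac{pV}{nRT}, \qquad Z \to 1 \quad \text{as} \quad p \to 0
```

A value of Z below 1 means that the gas occupies less volume than the ideal gas law predicts, and a value above 1 means that it occupies more.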
The measurement principle is shown in Fig. 1. The USM operates by measuring the propagation times of ultrasonic pulses emitted with a specific

Ultrasonic flow measurement principle.
The volume flow rate at standard natural gas conditions can be determined using the following equation [16]:
The actual
The following formula applies to multipath ultrasonic meters:
The flow velocity can be determined by the following equation:
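For orientation, the textbook transit-time relations for a single acoustic path can be sketched in Python as follows. This is a minimal sketch and not necessarily the exact form used in the article: the function names, the argument names, and the path weights in `multipath_velocity` are illustrative assumptions, and real multipath meters use manufacturer-specific weights and profile corrections.

```python
import math


def path_velocity(path_length_m, angle_rad, t_down_s, t_up_s):
    """Axial gas velocity from the transit times along one acoustic path.

    t_down_s: transit time with the flow; t_up_s: transit time against it.
    angle_rad: angle between the acoustic path and the pipe axis.
    """
    return (path_length_m / (2.0 * math.cos(angle_rad))) * (
        (t_up_s - t_down_s) / (t_up_s * t_down_s)
    )


def path_speed_of_sound(path_length_m, t_down_s, t_up_s):
    """Speed of sound in the gas from the same pair of transit times."""
    return (path_length_m / 2.0) * ((t_up_s + t_down_s) / (t_up_s * t_down_s))


def multipath_velocity(path_velocities, path_weights):
    """Weighted mean of the path velocities (the weights are hypothetical)."""
    weighted = sum(w * v for w, v in zip(path_weights, path_velocities))
    return weighted / sum(path_weights)


def actual_volume_flow_m3_h(mean_velocity_m_s, pipe_diameter_m):
    """Volume flow at line conditions: mean velocity times pipe cross-section."""
    area_m2 = math.pi * pipe_diameter_m ** 2 / 4.0
    return mean_velocity_m_s * area_m2 * 3600.0
```

Both relations follow from the two transit times t_down = L / (c + v cos φ) and t_up = L / (c − v cos φ): subtracting the reciprocals isolates the velocity v, and adding them isolates the speed of sound c.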
The compressibility factor is defined in the ISO 20765-2 standard. To determine its value, information about the components of the gas is required. Despite the many studies on the estimation of the compressibility factor, there is no general equation that is valid under all conditions. For this reason, alternative methods such as ANN are increasingly used, which offer more possibilities to study the relationships between the variables.
The selection of input parameters for the evaluation of the compressibility factor is based on the well-known relationships and correlations used in traditional methods. The compressibility factor calculated from empirical correlations is characterized by relatively low accuracy and simplified dependencies, which usually include temperature, pressure, and gas composition. The equation of state (EOS) approach achieves high accuracy because two equations of state, AGA8 and GERG-2008, were created specifically for industrial purposes. These equations require temperature, pressure, and gas composition data to determine the compressibility factor.
In general, both methods, empirical correlations and EOS, require the same input parameters. When a USM is used in the measurement system, it is known from the propagation law of
In the present study, an intelligent approach was developed to determine the compressibility factor without the need for information about the composition of the gas.
The ANN is a computer-based tool for parallel processing of information, classification, and forecasting that does not require knowledge of the functional relationship between the input and output parameters; typical examples are given in [12], [30]. An ANN consists of interconnected processing units called neurons. They are organized in layers and form the structure of the network. Depending on the location of the neurons, the layers can be divided into three types: input, hidden, and output.
The action of a neuron with the number
The current study uses a feed-forward topology based on MLP to predict the compressibility factor

Schematic diagram of the MLP-ANN.
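The forward pass of such a feed-forward network can be sketched in plain Python. This is an illustrative sketch only: it assumes one hidden layer with hyperbolic tangent activation and a linear output neuron, and the helper `random_network` with its random, untrained weights is hypothetical. The study itself was carried out in Matlab, so none of these names come from the article.

```python
import math
import random


def neuron_output(inputs, weights, bias, activation):
    """One neuron: weighted sum of the inputs plus a bias, then an activation."""
    return activation(sum(w * x for w, x in zip(weights, inputs)) + bias)


def mlp_forward(inputs, hidden_layer, output_neuron):
    """Forward pass through one tanh hidden layer and one linear output neuron.

    hidden_layer: list of (weights, bias) pairs, one per hidden neuron.
    output_neuron: a single (weights, bias) pair.
    """
    hidden = [neuron_output(inputs, w, b, math.tanh) for w, b in hidden_layer]
    out_weights, out_bias = output_neuron
    return neuron_output(hidden, out_weights, out_bias, lambda s: s)


def random_network(n_inputs, n_hidden, seed=0):
    """Untrained network with small random weights, for illustration only."""
    rng = random.Random(seed)
    hidden_layer = [
        ([rng.uniform(-1, 1) for _ in range(n_inputs)], rng.uniform(-1, 1))
        for _ in range(n_hidden)
    ]
    output_neuron = ([rng.uniform(-1, 1) for _ in range(n_hidden)], rng.uniform(-1, 1))
    return hidden_layer, output_neuron


# Three normalized inputs and 51 hidden neurons (a 3-51-1 layout).
hidden_layer, output_neuron = random_network(n_inputs=3, n_hidden=51)
prediction = mlp_forward([0.2, -0.4, 0.7], hidden_layer, output_neuron)
```

Training then consists of adjusting the weights and biases so that `prediction` approaches the observed value for each sample. That step is left out here.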
RBF networks are widely used due to some important advantages, such as their relatively simple structure and the availability of faster training algorithms [22], [23]. The neural network architecture shown in Fig. 3 is a feed-forward network with three layers: an input layer, a hidden layer, and an output layer. In the RBF-ANN model, the activation function in the hidden layer depends on the distance between the input signal and the center point of the neuron. The neurons in this layer usually have Gaussian transfer functions of the following form [2]:

Schematic diagram of the RBF-ANN.
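A distance-based hidden layer of this kind can be sketched as follows. The Gaussian is written in its common textbook form, exp(−‖x − c‖² / (2σ²)). The width parameter `sigma` and all function names are assumptions made for illustration, and the parameterization used in the study (for example, how the spread value maps to the width) may differ.

```python
import math


def gaussian_rbf(x, center, sigma):
    """Gaussian response that depends only on the distance between x and the center."""
    dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-dist_sq / (2.0 * sigma ** 2))


def rbf_forward(x, centers, sigmas, weights, bias):
    """RBF network output: linear combination of the hidden Gaussian responses."""
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, sigmas)]
    return bias + sum(w * h for w, h in zip(weights, hidden))
```

Each hidden neuron responds most strongly (with a value of 1) when the input coincides with its center, and the response decays as the input moves away. A smaller width makes each neuron more local, which is why the width and the number of neurons are usually tuned together.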
The MLP-ANN and RBF-ANN studies were performed using MATLAB (MathWorks).
For the purposes of this research, daily average data covering 151 days were collected, processed, and used. The real data set includes 604 values from continuous operation of the measuring devices and was collected at a gas transmission station in Bulgaria. The measuring equipment includes a USM, model SICK Flowsic600-XT Quatro (calibrated measuring range 1.00 ÷ 120.000 m³/h, relative error 0.2 %), a Rosemount 3051S pressure sensor system (relative error 0.035 %), and a Rosemount 3144P temperature sensor system (relative error 0.25 %). Based on these data and using an ANN, it was possible to predict the compressibility factor Z, which is a key parameter for calculating the volume flow at base conditions.
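The role of Z in the conversion to base conditions can be illustrated with the standard gas-law conversion used by volume conversion devices. This is a sketch under stated assumptions: the function and argument names are hypothetical, and the default base conditions (101.325 kPa and 288.15 K) are one common convention and not necessarily the one applied at the station in the study.

```python
def volume_flow_at_base_conditions(
    q_actual,          # volume flow at line conditions, e.g. m3/h
    p_abs_kpa,         # absolute line pressure, kPa
    t_kelvin,          # line temperature, K
    z,                 # compressibility factor at line conditions
    z_base,            # compressibility factor at base conditions
    p_base_kpa=101.325,  # assumed base pressure, kPa
    t_base_kelvin=288.15,  # assumed base temperature, K
):
    """Convert a line-condition volume flow to base conditions."""
    return (
        q_actual
        * (p_abs_kpa / p_base_kpa)
        * (t_base_kelvin / t_kelvin)
        * (z_base / z)
    )
```

Because Z appears in the denominator, an error in the predicted compressibility factor carries over almost one-to-one into the converted volume, which is why its accuracy matters for custody transfer.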
The experimental data for this research were divided into two parts. The first is the training set, which comprises 70 % of the observations and is used to train the MLP and RBF networks. The second set consists of the remaining 30 % of the data and is used for validation. The training set is needed to adjust the weights of the connections between the neurons in the network. The validation set is used to detect overfitting and thus to obtain an optimal model. It enables the “early stopping” technique, which stops the training of the network when the validation error starts to increase while the training error continues to fall. This technique limits overfitting without stopping training so early that the network underfits.
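The split and the stopping rule described above can be sketched as follows. This is an illustration rather than the procedure used in the study: the function names, the random shuffling, the fixed seed, and the `patience` parameter (the number of epochs without improvement that are tolerated before stopping) are all assumptions.

```python
import random


def split_train_validation(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and split them into training and validation sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


def early_stopping_epoch(validation_errors, patience=6):
    """Return the epoch with the lowest validation error seen before stopping.

    Training is considered stopped once the validation error has failed to
    improve for `patience` consecutive epochs.
    """
    best_epoch, best_error = 0, float("inf")
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error = epoch, error
        elif epoch - best_epoch >= patience:
            break
    return best_epoch
```

In practice the weights saved at the returned epoch are the ones kept, since later epochs fit the training data more closely but generalize worse.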
The evaluation of the best models is based on standard performance indicators – correlation coefficient
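Indicators of this kind can be computed from the observed and predicted values as sketched below. The particular set shown here (Pearson correlation coefficient R, mean squared error, root mean squared error, and mean absolute error) is an assumption based on common practice, and the function name is hypothetical.

```python
import math


def regression_metrics(observed, predicted):
    """Return R, MSE, RMSE and MAE for paired observed and predicted values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    # Pearson correlation between observed and predicted values.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_o * var_p)

    return {"R": r, "MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae}
```

R measures how well the predictions follow the trend of the observations, while the three error measures quantify the size of the deviations. A model can therefore have a high R and still show a systematic offset, so the indicators are best read together.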
The main problem in the development of the MLP architecture is the allocation of the hidden layers and the number of neurons in these layers. Many attempts have been made to determine the structure of the neural network. The main criterion was to achieve the best value of the correlation coefficient
The influence of the number of neurons in the hidden layer for the LM algorithm is shown in Fig. 4. The evaluation is compiled in the form of

Effect of the number of hidden neurons of the MLP-ANN for the LM algorithm.

Effect of the number of hidden neurons of the MLP-ANN for the SCGD algorithm.
A comparative analysis of the LM and SCGD algorithms can be found in Table 1.
Comparative analysis of LM and SCGD algorithms.
Algorithm | ||||
---|---|---|---|---|
LM | 0.99032 | 0.0581 | 0.1206 | 0.087 |
SCGD | 0.94229 | 0.0953 | 0.1543 | 0.1144 |
The table shows that the
A linear transfer function of the output neuron of the ANN structure was chosen. The training process is performed with a learning rate of 0.05 and a number of epochs determined after examining the training and validation errors. Different types of activation functions were also tested. In general, the best performance – highest
Tested combination of activation functions of MLP-ANN.
Activation function hidden layer | Activation function output layer | ||||
---|---|---|---|---|---|
tansig | tansig | 0.99032 | 0.0581 | 0.1206 | 0.087 |
tansig | purelin | 0.99219 | 0.3866 | 0.3109 | 0.2363 |
logsig | tansig | 0.94438 | 0.1034 | 0.1607 | 0.1072 |
logsig | purelin | 0.98062 | 0.6353 | 0.3985 | 0.3117 |
purelin | tansig | 0.82875 | 0.1136 | 0.1685 | 0.1184 |
logsig | logsig | 0.83831 | 0.2505 | 0.2502 | 0.1884 |
tansig | logsig | 0.85305 | 0.2536 | 0.2518 | 0.195 |
purelin | logsig | 0.68672 | 0.2955 | 0.2718 | 0.2067 |
Fig. 6 shows the result of the regression analysis obtained from a simulation with the optimal MLP-ANN architecture.

Scatter plot of predicted values versus observed values for MLP-ANN.
After running the simulation process through the 3-51-1 MLP-ANN, the optimal results obtained are:
The results of the predicted versus observed values after the performed MLP simulation are shown in Fig. 7.

Plot of predicted values versus observed values for MLP-ANN.
The most appropriate combination of activation functions in the MLP layers was also determined. The tested combinations of the tansig, logsig and purelin functions are listed in Table 2.
The results show that three combinations (logsig-purelin, tansig-tansig and logsig-tansig) have similar characteristics, which approach but do not reach those of tansig-purelin.
The study of the prediction of the compressibility factor
An RBF-ANN was developed with three input nodes for temperature, pressure and

Scatter plot of predicted values versus observed values for RBF-ANN.

Plot of predicted values versus observed values for RBF-ANN.
Influence of hidden neurons of RBF-ANN.
Spread value | Neurons | ||||
---|---|---|---|---|---|
0.1 | 140 | 0.99899 | 0.00073 | 0.0135 | 0.0075 |
0.3 | 140 | 0.99742 | 0.0019 | 0.0215 | 0.0108 |
0.5 | 140 | 0.99477 | 0.0038 | 0.0306 | 0.014 |
0.1 | 130 | 0.9973 | 0.0019 | 0.022 | 0.0141 |
0.3 | 130 | 0.99257 | 0.0053 | 0.0365 | 0.0181 |
0.5 | 130 | 0.99272 | 0.0052 | 0.0361 | 0.0177 |
0.1 | 120 | 0.9936 | 0.0046 | 0.0339 | 0.02 |
0.3 | 120 | 0.98875 | 0.0081 | 0.0449 | 0.0268 |
0.5 | 120 | 0.98833 | 0.0084 | 0.0457 | 0.0266 |
The aim of this study is to apply a machine-learning approach to predict the compressibility factor of natural gas depending on the data of three input parameters:
The developed neural networks based on the MLP architecture were evaluated for two learning algorithms: LM and SCGD. The results of the comparative analysis show the better performance of the LM algorithm. This is observed in the error values and the value of the coefficient of determination
The characteristics of the MLP-ANN for different activation functions in the individual layers were investigated. Experiments were performed with different variants of the tansig, purelin and logsig functions. The results obtained show the best values for two of the test variants: tansig-tansig and tansig-purelin. For the second combination,
The effectiveness of the modeling process of the compressibility factor of natural gas by MLP-ANN and RBF-ANN was evaluated. The comparison of the models based on the investigated performance indicators
Comparison between MLP and RBF models.
Type ANN | ||||
---|---|---|---|---|
MLP-ANN | 0.99032 | 0.0581 | 0.1206 | 0.087 |
RBF-ANN | 0.99899 | 0.000729 | 0.0135 | 0.0075 |
From the values obtained for the indicators, it can be summarized that they are quite high and very close to similar results obtained by other researchers using the same methods. The correlation coefficient
Fig. 10 shows a comparison of the relative errors (

Comparison of relative errors of the MLP-ANN and the RBF-ANN model.
The current study presents a comparative analysis of two intelligent approaches based on ANN for modeling the compressibility factor of natural gas. Real data from sensors and devices in a gas distribution station on the territory of the Republic of Bulgaria were used for the study. The capabilities of MLP-ANN and RBF-ANN for predicting the
The ANN approach shows very good abilities and characteristics of the developed models, which can be successfully used for the prediction of the compressibility factor of natural gas. From the results of the comparison of the two methods, it can be concluded that the RBF-ANN has better characteristics. The best values of
The graphical interpretation of the comparison between the relative errors for the two models shows that the RBF-ANN model has a clear advantage. The error values
In this study, the compressibility