Feeder line loss estimation with a long short-term memory network Transformer based on FCM clustering

The distribution network feeder line loss rate, as a key indicator for measuring the economic efficiency and technical management level of power supply enterprises, is not only directly related to the power management efficiency of enterprises, but also closely related to factors such as the scientific design of regional distribution networks, technical application effects, equipment operation status, personnel professional level, and overall management efficiency [1-2]. Optimizing the composition of distribution network line losses is a direct way for power supply enterprises to improve economic efficiency, and it is also an important means to achieve increased income and efficiency. With the continuous improvement of the operation mechanism of power supply enterprises, the control standards for line loss rate are becoming stricter. In this context, the importance of deepening the analysis of distribution line loss is becoming increasingly significant [3].

The core value of line loss analysis lies in evaluating the rationality of the operation of the distribution system, identifying problems in operation strategies, distribution architecture, equipment efficiency, metering equipment accuracy, and electricity management, and providing a scientific basis for formulating accurate loss reduction strategies [4]. Setting a reasonable benchmark value for the line loss rate provides key decision support for power supply enterprises in planning line loss management and formulating effective loss reduction measures, and it brings significant economic and social benefits. However, current research in this field is still lacking. Reference [5] proposes a calculation strategy for the line loss rate under the three-phase equilibrium state. This method improves on the previous electricity loss standard by using a line loss rate range standard to represent the severity of electricity theft. However, its limitation is that it only considers two factors: electricity loss and three-phase imbalance. Reference [6] considered load characteristics in the classification analysis of distribution networks and set the benchmark value of distribution network feeders to the median of their typical distribution, with most values located within ±0.5% of the median. However, this method is highly subjective, and the universality of its benchmark value determination is limited. In addition, given the large number of feeders in the distribution network, manual setting methods cannot meet application needs, and their practical value is low.

With the development of artificial intelligence, data-driven line loss rate prediction models have become a research focus, and they fall mainly into two types. (1) Single estimation models: reference [7] uses the fast independent component analysis method for feature selection on feeder line loss data and then uses support vector regression to predict the feeder line loss of the distribution network; reference [8] improved the support vector regression model using particle swarm optimization, significantly enhancing the accuracy of feeder line loss prediction; reference [9] screens the data features of feeder line losses with the traditional grey relational analysis method, and this preprocessing improves the prediction accuracy of the subsequent neural network model; reference [10] first reconstructs the features of feeder line loss data using a denoising auto-encoder and then uses a long short-term memory network to predict the feeder line loss rate; reference [11] adopts a deep transfer learning strategy to predict and analyze the network loss of distribution lines containing distributed new energy sources. (2) Multi-model fusion models: reference [12] uses Bootstrap Aggregating base learners integrated into the random forest algorithm to predict the line loss rate of distribution network feeders; reference [13] uses a Boosting strong classifier to improve the performance of Extreme Gradient Boosting Tree (XGBoost) and applies it to line loss rate prediction; reference [14] applies stacked generalization to feeder line loss prediction, where the meta model uses gradient boosting trees and the base models are traditional machine learning models. These studies have improved the accuracy of line loss rate prediction to a certain extent, but the models used are all machine learning models, which suffer from insufficient feature mining and generalization ability when dealing with complex scenarios.

This article proposes a method for estimating feeder line loss, which integrates fuzzy C-means clustering and long short-term memory network Transformer model. Firstly, the fuzzy C-means clustering technique is used to group the data related to feeder line loss estimation. Then, independent line loss estimation is performed for each group, and the data is preprocessed to achieve more refined control of feeder line loss in the distribution network. In addition, this article also introduces an improved long short-term memory network Transformer model with a double-layer structure. The biggest advantage of this model is that it uses a multi attention head mechanism to fuse and generate features of distribution network feeder line loss data, and improves prediction efficiency and accuracy through parallel computing, thereby achieving efficient and accurate prediction of short-term distribution network feeder line loss.

2 Algorithm metrics and framework

2.1 Three-dimensional indicator system for line loss rate

Given the diversity and complexity of the index system for feeder line loss rate in distribution networks, as well as the shortcomings in information integrity and real-time performance of current distribution networks, in order to improve the universality and operability of the method, while balancing the sensitivity of the index to line loss and the convenience of data acquisition, this research work constructs a three-dimensional evaluation framework for feeder line loss rate based on three core perspectives: line characteristics, operating parameters, and management level, as detailed in Table 1.

Table 1. Evaluation System of Line Loss Rate Index

Dimension	Index
Line characteristics	Line current carrying capacity
	Cable conversion rate
	Capacity of distribution transformer
	Power supply radius
Operation parameter	Average electricity consumption
	Maximum load rate
	Annual maximum current
Management level	Meter reading accuracy
	Equipment aging rate

When constructing the indicator system for distribution line loss rate, the two core dimensions - line attributes and operating parameters - have a direct impact on the theoretical line loss rate, while the management factor dimension mainly affects the management of line loss rate. The specific application strategy of the indicators is explained as follows: Firstly, based on the screening principle of “high representation within the class and low correlation between classes”, two indicators, power supply radius and average power consumption, are selected from the line attributes and operating parameters, and the fuzzy C-means clustering algorithm [15] is used to finely classify the feeder lines. Secondly, to enhance the practicality of the theoretical line loss rate correction value, both the line properties and operating parameters are included in the ground state correction category of statistical calculation methods. In addition, management factor indicators are used to solve the management line loss coefficient in the optimization model of the benchmark value of the theoretical line loss rate of the feeder line.

1) Power supply radius. This is defined as the physical length of the line from the power source point of the distribution network to its power supply end. The equivalent resistance of the feeder line is calculated using the equivalent resistance method [16].

(1) $R_{dz} = \frac{\sum_{i = 1}^{m} S_{N i}^{2} r_{0} L_{i}}{S_{N Σ}^{2}}$

Where $r_{0}$ is the AC resistance per unit length on the i-th feeder of the distribution network, Ω/km; $S_{N Σ}$ is the total rated capacity of the transformers, kVA; $R_{dz}$ is the equivalent resistance, Ω; $L_{i}$ is the power supply radius parameter, km; $S_{N i}$ is the rated capacity of transformers in the distribution network, kVA. The magnitude of the equivalent resistance is directly related to the power supply radius parameter and has a significant impact on the feeder line loss of the distribution network.
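As an illustration of equation (1), the following minimal Python sketch evaluates the equivalent resistance for a hypothetical feeder. The function name, the segment tuples, and all numerical values are assumptions made for this example, and $S_{N i}$ is treated here as the transformer capacity associated with segment i.

```python
def equivalent_resistance(segments, total_rated_capacity_kva):
    """Equation (1): R_dz = sum(S_Ni^2 * r0 * L_i) / S_NSigma^2.

    segments: iterable of (S_Ni in kVA, r0 in ohm/km, L_i in km).
    Returns the equivalent resistance in ohms.
    """
    if total_rated_capacity_kva <= 0:
        raise ValueError("total rated capacity must be positive")
    numerator = sum(s_ni ** 2 * r0 * length for s_ni, r0, length in segments)
    return numerator / total_rated_capacity_kva ** 2


# Hypothetical three-segment feeder; the values are illustrative only.
segments = [(400.0, 0.27, 1.2), (315.0, 0.27, 0.8), (200.0, 0.33, 0.5)]
r_dz = equivalent_resistance(segments, total_rated_capacity_kva=915.0)
print(f"R_dz = {r_dz:.4f} ohm")
```

Because each term is proportional to $L_{i}$, scaling every segment length by a common factor scales $R_{dz}$ by the same factor, which is the dependence on power supply radius described above.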

2) Line current carrying capacity. This indicator measures the maximum current that a distribution network feeder can safely carry, and it is directly related to the current carrying potential of the line and the potential increase in line loss rate.

3) Cable laying ratio. This is defined as the proportion of cable length to the overall length of the feeder line. Unlike overhead lines, cable lines are prone to eddy current losses due to bundling effects, resulting in a higher overall line loss rate. Therefore, an increase in the cable laying ratio is often accompanied by an increase in line losses.

4) Rated capacity of distribution transformer. This is the design capacity of the distribution transformers installed on the feeder line; the economic operating range depends on the transformer model. By keeping the load rate within the economic range, the line loss can be minimized and the efficiency maximized.

5) Annual average power supply. As the average value of annual active power supply, this indicator reflects the load level of the distribution network and has a certain correlation with the line loss rate.

6) Annual peak withstand current. This is the maximum current that the feeder can temporarily withstand without endangering the safe operation of the equipment; exceeding this limit will cause equipment failure. Based on the traditional equivalent resistance method, the feeder line losses of the distribution network can be calculated [17].

(2) $Δ P = 3 F^{2} I_{\max}^{2} R_{dz} t$

In the equation, t is the running time, h; F is the loss factor; $Δ P$ is the line loss, kW·h; $I_{\max}$ is the annual maximum current, A. It can be seen that, like the loss factor, $I_{\max}$ appears in quadratic form in the line loss formula, indicating that it also has a significant impact on line losses.
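The quadratic dependence on $I_{\max}$ can be checked numerically with a short Python sketch of equation (2). The function name and input values are hypothetical, and the division by 1000 is an assumption added here to convert the product of A², Ω and h (which is W·h) into kW·h.

```python
def line_loss_kwh(loss_factor, i_max_a, r_dz_ohm, hours):
    """Equation (2): delta_P = 3 * F^2 * I_max^2 * R_dz * t.

    The product has units of W*h; dividing by 1000 (an assumption of
    this sketch) expresses the result in kW*h.
    """
    watt_hours = 3.0 * loss_factor ** 2 * i_max_a ** 2 * r_dz_ohm * hours
    return watt_hours / 1000.0


# Hypothetical feeder: F = 0.5, R_dz = 0.1 ohm, one year of operation.
base = line_loss_kwh(0.5, 100.0, 0.1, 8760.0)
doubled_current = line_loss_kwh(0.5, 200.0, 0.1, 8760.0)
print(f"loss at 100 A: {base:.1f} kWh")
print(f"loss at 200 A: {doubled_current:.1f} kWh")
print(f"ratio: {doubled_current / base:.1f}")
```

Doubling the annual maximum current multiplies the estimated loss by four, while doubling the running time only doubles it.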

7) Annual peak load rate. This indicator measures the maximum operating load of a transformer during normal operation throughout the year. In this study, it specifically refers to the highest load rate value recorded over an annual time span, and its calculation formula is defined as the ratio of the actual maximum load power of the transformer to its rated capacity [18].

(3) $η = \frac{S_{a}}{S_{r}} \times 100\%$

Where $S_{a}$ is the apparent power, kVA; $η$ is the load rate; $S_{r}$ is the rated capacity, kVA. Transformer losses mainly comprise two types: copper losses and iron losses. For transformers with the same capacity, the iron loss remains constant as the load rate changes, whereas the copper loss increases with the load rate.
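The relationship between load rate and transformer loss can be sketched as follows. Equation (3) is taken from the text; the loss model (constant iron loss plus copper loss proportional to the square of the load rate) is the commonly used approximation and is an assumption of this sketch, as are the function names and the nameplate values `p0_kw` and `pk_kw`.

```python
def load_rate_percent(s_actual_kva, s_rated_kva):
    """Equation (3): eta = S_a / S_r * 100%."""
    return s_actual_kva / s_rated_kva * 100.0


def transformer_loss_kw(load_rate_pct, p0_kw, pk_kw):
    """Iron loss p0_kw is constant; copper loss scales with the squared load rate.

    p0_kw and pk_kw are hypothetical no-load and rated-load losses.
    """
    beta = load_rate_pct / 100.0
    return p0_kw + beta ** 2 * pk_kw


# Hypothetical 400 kVA transformer with 0.5 kW iron loss and 4.0 kW rated copper loss.
for s_actual in (100.0, 200.0, 400.0):
    eta = load_rate_percent(s_actual, 400.0)
    print(f"load rate {eta:5.1f}% -> loss {transformer_loss_kw(eta, 0.5, 4.0):.2f} kW")
```

Under this assumed model the iron loss term is the same in every row, and only the copper loss term grows as the load rate rises.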

2.2 Algorithm framework

Given the complexity of the distribution network structure, the non-uniformity of equipment operating parameters, the diversity of operating modes, and the limitations of big data analysis methods, the current efficiency of data application is low, and the setting of benchmark values for line loss rates mainly relies on historical data or management experience, lacking sufficient theoretical support. In response to this issue, this article proposes an optimization design framework that integrates fuzzy clustering and long short-term memory network Transformer, aiming to improve the accuracy of setting benchmark values for line loss rate. Figure 1 provides a detailed implementation process of the framework.

First, feeder samples are extracted from the distribution network, and the parameters in the three-dimensional line loss rate index system and the feeder network topology information are collected. Subsequently, the fuzzy C-means long short-term memory network Transformer model is used to perform clustering analysis on the samples, and the optimal clustering results are determined through validity index testing, thus dividing the feeders into various ground state classes. On this basis, combined with the equivalent resistance method and the statistical calculation method, the ground state correction of the theoretical line loss rate is calculated based on the network topology of the ground state feeder. Finally, based on the classification principle, the optimized benchmark values of the theoretical line loss rate are solved for the various types of feeders. This article mainly focuses on the design and implementation of the fuzzy C-means long short-term memory network Transformer model mentioned above. The last two processes are described in references [19-20].

3 Predictive analysis algorithm

3.1 Data preprocessing

Given the fuzziness of category boundaries in actual classification, this study adopts the fuzzy C-means clustering algorithm to preprocess feeder data. The specific operation process is as follows:

Step 1: Data selection. Carefully select two key indicators, power supply radius and average electricity consumption, from operating parameters and line attributes, which have high intra class representativeness and the lowest inter class correlation. Subsequently, these indicators are subjected to dimensional normalization to ensure consistency and comparability of the data.

Step 2: Fuzzy clustering preprocessing. In order to achieve the goals of detailed classification, visualization of data, compactness within classes, and clear separation between classes, multiple clustering attempts were made on samples with different numbers of classifications. Through repeated iterations and comparisons, an effective clustering result is ultimately determined that best meets the requirements of classification detail and data visualization. Assuming that in the given sample set $X = \{x_{1}, x_{2}, \dots, x_{n}\} \subset R^{m}$ , the number of indicators is m and the number of clustering objects is n. The FCM algorithm is [21]:

(4) $\min J_{fcm} (U, V) = \sum_{k = 1}^{s} \sum_{i = 1}^{n} u_{k i}^{e} d_{k i}^{2}$

(5) $\text{s.t.} \left\{ \begin{array}{l} \sum_{k = 1}^{s} u_{k i} = 1, \quad 1 \leq i \leq n \\ \sum_{i = 1}^{n} u_{k i} > 0, \quad 1 \leq k \leq s \\ u_{k i} \geq 0, \quad 1 \leq k \leq s, \ 1 \leq i \leq n \end{array} \right.$

Where $U$ is the membership matrix; $u_{k i}$ is the membership value of sample $x_{i}$ belonging to category k; $V = [v_{1}, v_{2}, \dots, v_{s}]$ is the class center matrix with a size of m×s; e is the fuzzy factor; s is the clustering dimension (the number of categories); $d_{k i} = ‖x_{i} - v_{k}‖$ is the distance from $x_{i}$ to the class center $v_{k}$ .

Step 3: Classification of ground state feeder lines. Based on the clustering membership matrix, each ground state feeder line is identified according to the principle of the highest membership degree. The ground state feeders are labeled as $I_{k c}$ , $k \in \{1, 2, \dots, s\}$ , where s represents the clustering dimension and k represents the category.
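Steps 1 to 3 can be sketched with the standard alternating update for the objective in equation (4). The NumPy code below is a minimal illustration and not the authors' implementation: the function name, the default fuzzy factor e = 2, the stopping rule, and the synthetic two-indicator data (standing in for normalized power supply radius and average electricity consumption) are all assumptions. For convenience the class centers are stored as an s×m array, the transpose of the m×s matrix $V$ in the text.

```python
import numpy as np


def fuzzy_c_means(X, s, e=2.0, max_iter=200, tol=1e-6, seed=0):
    """Minimal FCM. X: (n, m) samples, s: number of classes, e: fuzzy factor (> 1).

    Returns U with shape (s, n) and the class centers with shape (s, m).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((s, n))
    U /= U.sum(axis=0, keepdims=True)  # memberships of each sample sum to 1
    for _ in range(max_iter):
        W = U ** e
        V = (W @ X) / W.sum(axis=1, keepdims=True)  # weighted class centers
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)  # d_ki, (s, n)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (e - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, V


# Synthetic, already normalized indicators for 40 hypothetical feeders.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.2, 0.02, (20, 2)), rng.normal(0.8, 0.02, (20, 2))])
U, V = fuzzy_c_means(X, s=2)
labels = U.argmax(axis=0)  # Step 3: highest membership degree
print(labels)
```

In practice the number of classes s would be varied and compared with a validity index, as described in Step 2; the sketch fixes s = 2 only to keep the example short.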

3.2 Long Short Term Memory Network

Long Short Term Memory Network (LSTM) [22], as a variant of Recurrent Neural Network, effectively maintains and regulates information flow by introducing three special cell state regulation mechanisms: input gate, forget gate, and output gate, while capturing long-term dependencies between data. This design alleviates the difficulties of gradient vanishing and exploding encountered by recurrent neural networks when processing training data. Specifically, long short-term memory networks can dynamically adjust their cellular state content based on input sequence information, and generate new memory states by integrating the current time step input with previous memory states, thereby ensuring the model’s ability in memory and reasoning.

Figure 2 shows the recurrent neural network architecture of the long short-term memory network, where the long short-term memory network units consist of two core states: hidden state $h_{t}$ and unit state $C_{t}$ . $σ$ stands for sigmoid activation function, whose function is to limit the gate output value between 0 and 1, achieving fine control over the flow of information in the neural network. In addition, the hyperbolic tangent function tanh reflects the characteristics of internal state changes in long short-term memory networks; Parameters are input data for sequence r.
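The gate mechanism described above can be illustrated with a single LSTM time step in NumPy. This sketch uses the standard LSTM formulation, which the text describes only verbally; the stacked weight layout, the gate ordering, the dimensions, and the random weights are assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Squashes gate pre-activations into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step. W: (4*hidden, hidden+input), b: (4*hidden,).

    Assumed gate order in W and b: forget, input, candidate, output.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x_t]) + b
    f = sigmoid(z[:hidden])                 # forget gate
    i = sigmoid(z[hidden:2 * hidden])       # input gate
    g = np.tanh(z[2 * hidden:3 * hidden])   # candidate cell content
    o = sigmoid(z[3 * hidden:])             # output gate
    c_t = f * c_prev + i * g                # unit (cell) state C_t
    h_t = o * np.tanh(c_t)                  # hidden state h_t
    return h_t, c_t


# Hypothetical sizes: 3 input features, 4 hidden units, 5 time steps.
rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
for x_t in rng.normal(size=(5, n_in)):
    h, c = lstm_step(x_t, h, c, W, b)
print(h.shape, c.shape)
```

Each gate output lies strictly between 0 and 1, so the forget and input gates act as soft switches on how much of the previous cell state is kept and how much new content is written.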

3.3 Transformer neural network

The Transformer architecture [23] is an innovative sequence to sequence prediction model that consists of two major components: an encoder and a decoder. The encoder is responsible for mapping the input historical feeder line loss data and meteorological information to a high-dimensional feature space, forming feature vectors that provide the decoder with accurate predictions based on contextual dependencies when generating output sequences. The encoder is composed of multiple layers stacked repeatedly, each layer containing a multi head self attention module and a feedforward neural network, and incorporating layer normalization and residual connection techniques to enhance model stability.

In the Transformer model, by performing a weighted sum operation on the historical input data sequence, the self attention mechanism can dynamically adjust the weights of various influencing factors, allowing the model to focus on any key position of the input line loss data sequence, rather than being limited to a fixed time window. As shown in Figure 3, the application of multi head attention mechanism enables the Transformer model to pay parallel attention to multiple different regions of the input sequence matrix X, endowing the model with global insight into time series and thereby improving its representation performance, making it perform better in processing global and local features of feeder line loss data sequences.

The attention mechanism consists mainly of three parts: the query matrix $Q$ , the key matrix $K$ , and the value matrix $V$ . In Figure 3, N is the number of attention heads; $Q_{i}$ , $K_{i}$ , and $V_{i}$ respectively represent the query, key, and value matrices in attention head i; $W_{i}^{Q}$ , $W_{i}^{K}$ , and $W_{i}^{V}$ respectively represent the weight coefficient matrices of $Q$ , $K$ , and $V$ in attention head i; $Z$ is the output matrix; $Z_{N}$ represents the output matrix of attention head N.
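A minimal NumPy sketch of multi-head attention follows, using scaled dot-product attention as in the original Transformer. The output projection `Wo`, the dimensions, and the random weights are assumptions of this example and are not specified in the text.

```python
import numpy as np


def softmax(a, axis=-1):
    """Numerically stable softmax along the given axis."""
    a = a - a.max(axis=axis, keepdims=True)
    exp_a = np.exp(a)
    return exp_a / exp_a.sum(axis=axis, keepdims=True)


def multi_head_attention(X, Wq, Wk, Wv, Wo):
    """X: (T, d_model); Wq, Wk, Wv: (N, d_model, d_k); Wo: (N*d_k, d_model)."""
    heads = []
    for Wq_i, Wk_i, Wv_i in zip(Wq, Wk, Wv):
        Q_i, K_i, V_i = X @ Wq_i, X @ Wk_i, X @ Wv_i
        scores = Q_i @ K_i.T / np.sqrt(K_i.shape[-1])  # scaled dot product
        heads.append(softmax(scores) @ V_i)            # Z_i, shape (T, d_k)
    return np.concatenate(heads, axis=-1) @ Wo         # output matrix Z


# Hypothetical sizes: 6 time steps, model width 8, N = 2 heads of width 4.
rng = np.random.default_rng(0)
T, d_model, N, d_k = 6, 8, 2, 4
X = rng.normal(size=(T, d_model))
Wq = rng.normal(size=(N, d_model, d_k))
Wk = rng.normal(size=(N, d_model, d_k))
Wv = rng.normal(size=(N, d_model, d_k))
Wo = rng.normal(size=(N * d_k, d_model))
Z = multi_head_attention(X, Wq, Wk, Wv, Wo)
print(Z.shape)  # (6, 8)
```

Every row of the attention weights is a distribution over all T positions, which is why each output position can draw on any part of the input sequence rather than a fixed window.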

3.4 Long Short Term Memory Network Transformer Hybrid Neural Network

The prediction of feeder line loss is a task involving multivariate temporal analysis. Although long short-term memory networks have shown reliability in capturing multivariate temporal features, their multi gate computing mechanism increases computational complexity and is prone to problems such as long training time, high risk of overfitting, and difficulty in parallelization when processing long time series. In addition, when long short-term memory networks process long-term information, some time periods of data may be ignored or overwhelmed due to temporal noise, affecting their predictive performance. In contrast, the Transformer model can comprehensively consider all positions of the input sequence in a single processing, without being limited by the distance between features, and its parallel computing characteristics significantly improve the training efficiency of the model [24].

In view of the above issues, this article proposes a long short-term memory network Transformer fusion model, which combines the self attention mechanism of the Transformer with the sequence modeling advantages of the long short-term memory network to improve the accuracy of feeder line loss prediction. Specifically, a long short-term memory network encoder is used to perform preliminary encoding of the input data, followed by a Transformer decoder to obtain the desired output data sequence. As shown in Figure 4, by adding a long short-term memory network module before the multi head attention module of the Transformer, the algorithm performance can be improved through fusion. Among them, $X_{l}^{τ}$ represents the time series set processed by the long short-term memory network module; $h_{l}^{τ}$ is the output data sequence of the memory network; $X^{τ}$ covers the preprocessed equipment operating status factors, historical feeder line losses, and other time-series data sequences.

The long short-term memory network layer utilizes its unique gating mechanism to finely regulate the abandonment ratio, addition amount, and output amount of multidimensional device operating factors and feeder line loss information within the cellular state, effectively extracting key temporal features from multivariate data and forming valuable hidden state vectors. Furthermore, the self attention mechanism of the Transformer model is introduced to focus on a subset of features that are highly relevant to the current prediction task, reducing the allocation of attention to non core environmental factors. At the same time, historical state information is learned and fused with current observation data to jointly predict the degree of line loss. The specific calculation process is as follows:

(6) $\begin{cases} X_{l}^{τ}, h_{l}^{τ} = L (X^{τ}, g (h_{l}^{τ - 1})) \\ {\tilde{X}}_{l}^{τ} = F (X_{l}^{τ}, X^{τ}) \\ G^{τ} = T (X_{l}^{τ}) \end{cases}$

Where, L represents the results $X_{l}^{τ}$ and $h_{l}^{τ}$ output from the long short-term memory network module, which are input to the multi head attention layer loop without gradient update, aiming to integrate the hidden state of the previous long short-term memory network module into the current input. g is a decoding function for hidden states, used to generate the output ${\tilde{X}}_{l}^{τ}$ of the fusion layer. F represents the combination operation of numerical values $X_{l}^{τ}$ and $X^{τ}$ within the fusion layer, aimed at integrating information from different sources. Finally, the predicted value of feeder line loss output by the model is represented by the symbol $G^{τ}$ . T indicates that the vectors introduced in the multi head attention layer have gone through the corresponding computational process. During the training phase, the multi head attention layer module adopts a segmented recursive strategy. Each external input is extracted from the output of the last segmented long short-term memory network layer, but gradient updates are not performed during this process. By utilizing the hidden states in the long short-term memory network module, the long short-term memory network Transformer model exhibits stronger robustness during training and prediction processes.

In the fusion layer architecture, the output vectors of the long short-term memory network layer are concatenated with the corresponding input vectors at different positions, and the fused vectors are then passed to the activation function layer. The Gaussian Error Linear Unit (GeLU) is selected as the activation function, and the layer weights and biases are adjusted dynamically to adapt to diverse input distributions and task requirements. This mechanism enhances the ability of the long short-term memory network Transformer model to learn and represent complex functions and features in feeder line loss prediction tasks, thereby improving the accuracy and robustness of prediction. Given that each sub layer adopts residual connections and normalization processing, the output expression of each sub layer is:

(7) $O_{u} = N (j + b (j))$

Where, j is a specific output quantity; $O_{u}$ represents the fusion result between the outputs of the two sub layers before and after; $b (j)$ represents the output of the preceding sub layer; N represents the output of the current sublayer. The original Transformer structure performs normalization on the inner product of $Q$ and $K$ through the activation layer of the Transformer structure. However, the problem is that most of the labeled values in $Q$ lack clear predictive features. To solve this problem, the Long Short Term Memory Network Transformer model first calculates the uniform distribution difference, selects the features $Q$ with higher activity, and then performs inner product with another vector $K$ , and applies the Softmax function for probabilistic approximation. Compared to the original Transformer model, the long short-term memory network Transformer model can effectively reduce computational complexity while still capturing input sequence information, demonstrating superior performance.
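A minimal NumPy sketch of the sub layer output in equation (7) is shown below. It assumes the conventional Transformer reading, in which N acts as layer normalization applied to the residual sum and b is the sub layer transformation; that reading, the tanh approximation of GeLU, the omission of learnable scale and shift parameters, and the random weights are all assumptions of this example.

```python
import numpy as np


def gelu(x):
    """Tanh approximation of the Gaussian Error Linear Unit."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))


def layer_norm(x, eps=1e-5):
    """Normalizes each row to zero mean and (approximately) unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def sublayer_output(j, b):
    """Equation (7): O_u = N(j + b(j)), with N taken as layer normalization."""
    return layer_norm(j + b(j))


# Hypothetical sub layer: a linear map followed by GeLU on 4 positions of width 8.
rng = np.random.default_rng(0)
j = rng.normal(size=(4, 8))
W = rng.normal(scale=0.5, size=(8, 8))
O_u = sublayer_output(j, lambda v: gelu(v @ W))
print(O_u.shape)
```

The residual term j lets the input bypass the sub layer unchanged, and the normalization keeps the scale of the sum stable from layer to layer.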

In the Long Short Term Memory Network Transformer architecture, the first layer of the network plays the role of an encoder, responsible for transforming time series data into vector representations; Subsequently, the second layer Transformer serves as a decoder for further processing. This model first uses long short-term memory networks to extract local features of sequence data, and then captures global feature relationships through Transformers, aiming to improve the accuracy of feeder line loss prediction.

4 Experimental analysis

4.1 Experimental setup

This article uses data from 1121 feeder lines in a city in Guangdong Province to verify the effectiveness of the proposed method. The data is sourced from the line loss management system to ensure accurate quality, covering theoretical line loss rate, statistical line loss rate, power supply, line length, rated capacity of distribution transformers, line operation time, and distribution transformer operation time. To enhance the reliability of the validation, the data was divided into 10 equal parts based on the cross validation principle, and divided into training set, validation set, and test set in a ratio of 7:2:1. The final model estimation result was taken as the average of 10 cross validations.

To evaluate the accuracy of the model’s estimation of line loss rate, this paper introduces Mean Absolute Error (MAE) and Root Mean Square Error (RMSE) as error measurement indicators. RMSE has a high sensitivity to outliers and can effectively highlight large deviation data; MAE avoids the cancellation of positive and negative errors, ensuring the comprehensiveness of error assessment. The calculation formulas for the two indicators are expressed as follows [25]:

(8) $\begin{cases} RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {(y_{pred} - y_{true})}^{2}} \\ MAE = \frac{1}{n} \sum_{i = 1}^{n} |y_{pred} - y_{true}| \end{cases}$

Where, n is the number of feeder lines; $y_{true}$ is the true value of line loss; $y_{pred}$ is the estimated value of line loss. According to formula (8), it can be seen that the larger the values of the MAE and RMSE evaluation indicators of the estimation model, the more unreasonable the line loss rate estimation results and the lower the accuracy of the estimation.
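The two indicators in formula (8) can be computed as in the following Python sketch. The function names and the four sample values are hypothetical and chosen only to show how a single large deviation affects RMSE more than MAE.

```python
import math


def rmse(y_pred, y_true):
    """Root mean square error over n feeders, as in formula (8)."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / n)


def mae(y_pred, y_true):
    """Mean absolute error over n feeders, as in formula (8)."""
    n = len(y_true)
    return sum(abs(p - t) for p, t in zip(y_pred, y_true)) / n


# Hypothetical line loss rates (%); the last estimate is a deliberate outlier.
y_true = [2.0, 3.0, 4.0, 5.0]
y_pred = [2.5, 2.5, 4.0, 7.0]
print(f"MAE  = {mae(y_pred, y_true):.4f}")   # 0.7500
print(f"RMSE = {rmse(y_pred, y_true):.4f}")  # 1.0607
```

The single error of 2.0 contributes 4.0 of the 4.5 total squared error, so RMSE is noticeably larger than MAE for this sample.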

4.2 Comparison of Estimation Results among Various Models

To verify the superiority of the method proposed in this article, a linear weighted model was also introduced as a reference, whose expression is:

(9) $y_{pred} = \sum_{m = 1}^{b} w_{m} y_{pred.m}$

Where b is the number of base estimation models; $w_{m}$ is the weight assigned to the m-th base estimation model; $y_{pred.m}$ is the estimated result of the m-th base estimation model.

This study adopts two strategies for weight allocation. The first strategy is Weighted Sum (WS), which sorts the estimation errors of each basic estimation model, selects the four models with the smallest errors, and assigns weights of 0.45, 0.35, 0.25, and 0.15 in descending order. The second strategy utilizes the differential evolution optimization weighting (DEOW) algorithm, which constructs an optimization function aimed at minimizing estimation error and uses differential evolution optimization techniques to search for the optimal weight combination. The specific expression of the optimization function constructed is:

(10) $\begin{cases} f = {\left‖ y_{true} - \sum_{m = 1}^{6} w_{m} y_{pred.m} \right‖}_{2}^{2} \\ \text{s.t.} \ 0 ⩽ w_{m} ⩽ 1, \ \sum_{m = 1}^{b} w_{m} = 1 \end{cases}$
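The linear weighted model of equation (9) and the objective of equation (10) can be sketched in Python as follows. The function names and sample estimates are hypothetical. Because the listed WS weights add up to 1.20, the sketch rescales them so that they satisfy the sum-to-one constraint of equation (10); this rescaling is an assumption of the example. The differential evolution search itself is not shown.

```python
def normalize(weights):
    """Rescales non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def weighted_estimate(weights, base_estimates):
    """Equation (9): base_estimates[m][i] is model m's estimate for feeder i."""
    n = len(base_estimates[0])
    return [sum(w * est[i] for w, est in zip(weights, base_estimates)) for i in range(n)]


def objective(weights, base_estimates, y_true):
    """Squared L2 error f from equation (10) for a given weight vector."""
    y_pred = weighted_estimate(weights, base_estimates)
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Hypothetical estimates from four base models for three feeders.
base_estimates = [
    [3.1, 4.0, 5.2],
    [3.0, 4.2, 5.0],
    [2.8, 4.1, 5.3],
    [3.3, 3.9, 4.9],
]
y_true = [3.0, 4.1, 5.1]
ws_weights = normalize([0.45, 0.35, 0.25, 0.15])
print([round(w, 4) for w in ws_weights])
print(round(objective(ws_weights, base_estimates, y_true), 6))
```

A DEOW-style search would minimize `objective` over the weight vector, for example by normalizing each candidate before evaluation so that the constraint always holds.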

The following is a comprehensive evaluation of the performance of each model on the test sets of the three types of feeder line. Table 2 provides a detailed comparison of the root mean square error (RMSE) and mean absolute error (MAE) of each model in line loss rate estimation. Furthermore, Figure 5, Figure 6, and Figure 7 respectively present the distribution of line loss estimation errors for each type of feeder line in different models.

Table 2. Comparison of Line Loss Rate Estimation Errors

Model	First category		Second category		Third category	
	MAE	RMSE	MAE	RMSE	MAE	RMSE
GBDT	0.358	0.519	0.297	0.435	0.319	0.428
AdaBoost	0.358	0.506	0.346	0.437	0.348	0.432
XGBoost	0.361	0.487	0.314	0.469	0.305	0.371
WS	0.347	0.472	0.281	0.396	0.324	0.409
DEOW	0.334	0.493	0.276	0.408	0.351	0.478
Proposed algorithm	0.309	0.426	0.268	0.375	0.283	0.365

Firstly, the predictive performance of five basic estimation models for various types of feeders was explored. Apart from the algorithm proposed in this article, the WS model has the lowest root mean square error (RMSE) when predicting the first and second types of feeders. However, when predicting the third type of feeder line, the performance of the WS model is not outstanding. Instead, the RMSE of the XGBoost model reaches its minimum, with a specific value of 0.371, and its mean absolute error (MAE) is also the smallest for this type of feeder line. When predicting the third type of feeder line, the XGBoost model’s error is mainly concentrated in the range of 0.21-0.32, and there are few outliers. However, when predicting the first and second types of feeders, XGBoost’s performance is not optimal, indicating that a single basic estimation model has limitations in predicting specific types of feeders. Next, we compared the linear weighted model with the baseline estimation models. When predicting the first and second types of feeders, the RMSE and MAE of the linear weighted model decrease compared to the optimal baseline estimation model for each type of feeder. This is due to the secondary integration of the prediction results by the linear weighted model, which improves the prediction accuracy. However, when predicting the third type of feeder, the overfitting problem caused by the small sample size results in lower predictive performance of the linear weighted model compared to the optimal baseline estimation model. In summary, neither a single estimation model nor a linear weighted model can accurately predict line loss rates for multiple types of feeders: the prediction ability of a single model is limited to specific types of feeder, and although the linear weighted model can further improve prediction accuracy, it is prone to overfitting when dealing with small sample data. Therefore, new prediction methods are needed to achieve accurate prediction of line loss rates for various types of feeders.

The model proposed in this article effectively alleviates the overfitting problem and improves the estimation accuracy of various types of feeder line loss rates. Specifically, in the estimation of line loss rates for the first and second types of feeders, the root mean square error (RMSE) of this model was reduced by 5.3% and 8.2% compared to the optimal base model for each type of feeder, respectively. This significant advantage is mainly attributed to the application of expert neural networks in the model, which can deeply explore the nonlinear relationship of line loss data, thereby achieving higher estimation accuracy than the base model and linear weighted model for feeder categories with sufficient training samples. Although the RMSE of this model is slightly higher than that of the XGBoost model by 1.7% on the third type of feeder with less training data, compared to the linear weighted model, its RMSE is reduced by 6.81%, demonstrating a certain degree of resistance to overfitting. Furthermore, from the perspective of Mean Absolute Error (MAE), compared with the linear weighted model, our method reduces MAE by 6.9%, 5.1%, and 11.5% respectively when estimating the three types of feeders (as shown in Table 2). Meanwhile, compared with the optimal basic models of various types of feeders, the MAE of this method also decreased by 8.7%, 7.6%, and 5.2%, respectively. In addition, it can be seen from the error distribution diagrams (Figures 5-7) that the error distribution of the proposed method is more concentrated and skewed towards low error areas when estimating the line loss rate of each type of feeder line, and the median of the error distribution is close to the average, which further verifies the superiority of the proposed model in terms of accuracy and stability.

In order to comprehensively evaluate the feasibility of our method, we also compared it with the methods in references [26] and [27]. The specific comparison results are shown in Table 3.

Table 3. Comparison of Estimation Errors of Different Algorithms

Method	First category MAE	First category RMSE	Second category MAE	Second category RMSE	Third category MAE	Third category RMSE
Reference [26]	0.365	0.478	0.296	0.396	0.317	0.396
Reference [27]	0.359	0.468	0.287	0.387	0.328	0.413
Proposed algorithm	0.317	0.446	0.273	0.371	0.287	0.375

According to the data in Table 3, the model proposed in this paper achieves lower root mean square error (RMSE) and mean absolute error (MAE) than the other two algorithms for every feeder category. For example, in the estimation of the first type of feeder, the MAE of our method is 13.15% lower than that of the method in reference [26]. In the estimation of the third type of feeder, the RMSE of our method is 5.30% lower than that of reference [26] and 9.20% lower than that of reference [27]. This advantage can be attributed to the fact that the meta-estimation models used in references [26] and [27] are traditional machine learning models, which have weaker data mining capabilities than the mixture of experts (MoE) system used in this paper. In summary, compared with existing line loss estimation models, the two-layer estimation model based on ensemble tree models and a mixture of experts system proposed in this paper performs better in estimating statistical line loss rates for different types of feeders.
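The relative reductions quoted from Table 3 can be checked with a few lines of Python. The sketch below transcribes the table values and assumes that each percentage is the relative reduction (baseline minus proposed, divided by baseline); the dictionary layout and function names are illustrative choices.

```python
# (MAE, RMSE) pairs transcribed from Table 3, keyed by method and feeder category.
TABLE_3 = {
    "Reference [26]": {"first": (0.365, 0.478), "second": (0.296, 0.396), "third": (0.317, 0.396)},
    "Reference [27]": {"first": (0.359, 0.468), "second": (0.287, 0.387), "third": (0.328, 0.413)},
    "Proposed algorithm": {"first": (0.317, 0.446), "second": (0.273, 0.371), "third": (0.287, 0.375)},
}

def reduction_pct(baseline, proposed):
    """Relative error reduction of `proposed` versus `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline

def compare_with(baseline_name, proposed_name="Proposed algorithm"):
    """Return {category: (MAE reduction %, RMSE reduction %)} against one baseline."""
    result = {}
    for category, (base_mae, base_rmse) in TABLE_3[baseline_name].items():
        prop_mae, prop_rmse = TABLE_3[proposed_name][category]
        result[category] = (
            reduction_pct(base_mae, prop_mae),
            reduction_pct(base_rmse, prop_rmse),
        )
    return result

if __name__ == "__main__":
    for name in ("Reference [26]", "Reference [27]"):
        for category, (d_mae, d_rmse) in compare_with(name).items():
            print(f"{name}, {category} category: MAE -{d_mae:.2f}%, RMSE -{d_rmse:.2f}%")
```

Under this definition, the first-category MAE reduction against reference [26] comes out at about 13.15%, which matches the figure quoted in the text.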

5

Conclusion

This article proposes a method for estimating feeder line loss that combines fuzzy C-means clustering with a long short-term memory network Transformer model. The specific contributions are summarized as follows: 1) A three-dimensional feeder line loss rate indicator system is constructed, covering line attributes, operating parameters, and management factors. The system takes into account both the contribution of each indicator to the feeder line loss rate and the ease of data acquisition, which keeps the indicator system practical and operable. 2) To handle the large number of feeders in the distribution network, this paper applies the fuzzy C-means clustering algorithm to the massive multi-source data of the power grid monitoring system and divides the feeders into a finite number of typical types. Based on the principle that similar feeders have similar line loss characteristics, theoretical benchmark values of the line loss rate are set for each type of feeder, which significantly reduces the number of objects to be studied and greatly improves work efficiency. 3) A dual-layer long short-term memory network Transformer model is introduced, which generates prediction vectors by combining multiple attention heads with the feature quantities of the feeder line loss data. Because the attention heads can be computed in parallel, the model can predict short-term feeder line losses efficiently and accurately.
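To make the clustering step concrete, the following is a minimal sketch of the standard fuzzy C-means iteration in plain Python. It is not the authors' implementation: the fuzzifier m = 2, the random initialisation, the stopping tolerance, and the two normalised feeder features in the example are all assumptions made for illustration.

```python
import random

def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means on a list of equal-length feature lists.

    Returns (centers, memberships); memberships[i][k] is the degree to which
    point i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by membership**m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            w_sum = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / w_sum for d in range(dim)]
            )
        # Membership update from Euclidean distances to the new centers.
        new_u = []
        for x in points:
            dist = [
                max(sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, 1e-12)
                for c in centers
            ]
            new_u.append([
                1.0 / sum((dist[k] / dist[j]) ** (2.0 / (m - 1.0)) for j in range(n_clusters))
                for k in range(n_clusters)
            ])
        shift = max(abs(new_u[i][k] - u[i][k]) for i in range(n) for k in range(n_clusters))
        u = new_u
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u

# Hypothetical feeders described by two normalised features
# (for example, relative line length and load rate); values are made up.
feeders = [[0.10, 0.20], [0.12, 0.25], [0.09, 0.22],
           [0.80, 0.80], [0.83, 0.75], [0.79, 0.82]]
centers, memberships = fuzzy_c_means(feeders, n_clusters=2)
# Hard label per feeder: the cluster with the highest membership degree.
labels = [max(range(2), key=lambda k: row[k]) for row in memberships]
```

Unlike hard clustering, each feeder keeps a membership degree for every cluster, so borderline feeders can be identified before a benchmark line loss rate is assigned to each type.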

Future research can focus on the following key areas: 1) Data preprocessing and quality control: New data preprocessing strategies should be developed to improve data integrity and quality. Specific measures include better data cleaning techniques, optimized interpolation methods, and refined outlier detection algorithms, with the aim of ensuring the accuracy and reliability of the clustering and prediction models. 2) Clustering algorithms and adaptive parameter selection: More advanced clustering techniques, such as adaptive fuzzy C-means clustering and density-based clustering, can be explored to improve clustering accuracy and stability. Adaptive parameter selection mechanisms should also be studied in depth to minimize the impact of human intervention on clustering results. 3) Model generalization: The generalization ability of the model can be enhanced by integrating more feature variables, constructing more complex network architectures, or adopting ensemble learning strategies. In addition, transfer learning with feeder data from different regions and operating conditions can improve the adaptability and prediction accuracy of the model in new environments. 4) Computational efficiency: Efficient algorithms can be combined with hardware acceleration technologies, such as distributed computing frameworks and GPU acceleration, to reduce the computational burden of Transformer models. Model compression and pruning techniques can also be explored to reduce model size, further accelerate computation, and improve overall efficiency.

Language: English

Publication frequency: 1 issue per year
Journal subjects: Life Sciences, Life Sciences, other, Mathematics, Applied Mathematics, General Mathematics, Physics, Physics, other


Chang Liu

Lin Xu

Qian Xie

Hua Zhang

Hua Yang

Shu Fang

Wei Wang

Shixuan Lv

Yinzhang Cheng

Guanliang Li

Published: 24 Sep 2025

Received: 10 Jan 2025

Accepted: 05 May 2025

DOI: https://doi.org/10.2478/amns-2025-0995

Keywords: Fuzzy mean clustering, Long short-term memory network, Transformer, Feeder line loss, Machine learning, Line maintenance, Distribution network

© 2025 Songyu Wu et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.
