High-Dimensional Feature Optimization and Real-Time Prediction Model with Support Vector Machines for Fault Diagnosis of Electrical Equipment
Published online: 19 Mar 2025
Received: 25 Oct 2024
Accepted: 03 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0480
© 2025 Lei Li et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
As electrification in China continues to develop, a failure of major equipment such as generators, transformers, and circuit breakers, or of auxiliary equipment such as capacitors and insulators, can cause a partial or even total blackout in the affected area. Such outages disrupt daily life and industrial production, can endanger lives, and can cause huge losses to the national economy [1–3].
Social demand for power equipment is increasing, and so are equipment maintenance costs. According to statistics, maintenance management accounts for 65% to 85% of the total cost of construction equipment management, and a failure may lead to huge expenditures and other significant adverse effects, so the maintenance and management of power equipment is exceptionally critical [4–6]. However, the traditional management mode is relatively limited: it generally deals with equipment only after a failure, cannot monitor the operating status of the equipment in a timely manner, and often starts maintenance work only once a fault has already stopped the equipment from running. The key to guaranteeing normal operation is therefore to track the operating status of electric power equipment so that faults can be detected in advance and dealt with promptly. Condition-based maintenance uses real-time monitoring data to assess the health status of the equipment and to decide whether it needs maintenance [7–10]. Detecting power equipment faults early and preventing them from occurring ensures the normal operation of power equipment and improves the reliability of grid power supply, which is of great significance [11–12].
As electrical equipment is used more frequently, the requirements for its reliability keep rising, and fault diagnosis of electrical equipment has become a major research topic in the electrical field [13–14]. Electrical equipment faults are random, ambiguous, and uncertain. When a fault occurs, a given fault state is usually reflected in several kinds of information data, a single kind of information data may reflect several faults, and different fault states may coexist [15–17]. Given these characteristics, equipment state identification needs a suitable fault diagnosis model based on multi-source information [18]. Information fusion diagnosis and condition assessment of power equipment faults therefore has both academic importance and good engineering application value [19–21]. First, observing and analyzing the multiple phenomena that accompany a fault through a variety of sensors helps to explore the mechanism and underlying laws of the occurrence and development of insulation faults, supports a more comprehensive and in-depth explanation of the fault evolution process, and enables more accurate fault state identification and early warning methods. Second, the differences and complementarities of multi-source information make up for the defects of detection based on a single piece of information. Making full use of the various types of information and relevant expert experience, and comprehensively analyzing the key factors in the causation and development of power equipment faults, yields stable and reliable information characterizing the fault state. This facilitates the development of a more fault-tolerant and reliable fault diagnosis and evaluation system, improves the speed of information processing and the correctness of decision-making, and moves away from the traditional research approach in which each sensor is diagnosed independently [22–25].
This article combines principal component analysis (PCA) with the study of fault diagnosis of electrical equipment and proposes a fault diagnosis model based on an SVM with optimized parameters. PCA extracts the key fault features of the electrical equipment, which preprocesses the equipment fault information and reduces the data dimensionality. A fault classification method based on hierarchical binary trees is then given, and its parameters are optimized using the imperialist competitive algorithm (ICA). The optimized parameters are used to construct the MC-HBT classifier. The resulting ISFD-POPS model is tested experimentally on the diagnosis of five types of electrical equipment faults. On this basis, an electrical equipment fault prediction model is established, in which an extreme learning machine (ELM) is improved by an improved sparrow search algorithm. Experimental analysis of actual data from 22 sets of known faults demonstrates the good prediction performance of the model.
Principal component analysis (PCA), also known as principal element analysis, is a commonly used statistical technique. The sample matrix is first centered, and its covariance matrix and the corresponding eigenvalues and eigenvectors are then derived to form the pattern vector of components, with eigenvalues arranged from largest to smallest. According to the cumulative variance contribution rate, the dimensions with smaller influence are discarded to achieve data dimensionality reduction. PCA does not require a mathematical model of the fault, only a statistical model of the data [26]. It reduces the dimensionality of the original measurement data, extracts the features and information with the largest influence, and eliminates the correlation between the data and inter-system noise. PCA is widely used in data compression, fault diagnosis and prediction, signal processing, fuzzy identification, and other important fields.
PCA is a multivariate statistical method that reduces data dimensionality while retaining the main feature information of the data. Its mathematical principles are as follows:
Assume that the measurement data matrix is
The model satisfies the conditions:
Where
The principal element analysis is geometrically interpreted to mean that the sample has an elliptical distribution in the two-dimensional plane, assuming that the number of variables in the sample is 2 and the number of samples is

Geometric diagram of principal component analysis
In the new coordinate
The original data, i.e., the covariance matrix of the input samples is set to
The orthogonal matrix is
The correlation coefficient matrix has eigenvalue
Then we have
This yields uncorrelated composite variables of the random variables, ordered by decreasing variance. The new composite variables are as follows:
The number of principal elements affects the completeness of the retained information, and the cumulative variance contribution method is commonly used to determine it. This paper determines the number of principal elements jointly from the mean value of the complex correlation coefficient (MCC) and the cumulative variance contribution rate (CPV) [27]. The cumulative variance contribution rate generated from the DS table is used for the initial determination of the number of principal elements. The mean value of the complex correlation coefficient is then examined, and iterative calculations and corrections are carried out to finalize the number.
Cumulative contribution to variance (CPV) method:
The cumulative variance contribution rate is generally required to be 85% or more. To reduce subjectivity in determining the number of principal elements, the mean value of the complex correlation coefficient is introduced to refine the choice obtained from the cumulative variance contribution rate.
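The CPV criterion can be illustrated with a minimal sketch. The CPV formula itself is not reproduced in this text, so the code assumes the usual definition: the sum of the leading eigenvalues divided by the sum of all eigenvalues. The function names and the example eigenvalues are hypothetical and serve only as illustration; the MCC refinement described above is not included.

```python
def cumulative_contribution(eigenvalues):
    """Cumulative variance contribution rate for 1, 2, ... leading components."""
    ordered = sorted(eigenvalues, reverse=True)  # largest eigenvalue first
    total = sum(ordered)
    rates, running = [], 0.0
    for value in ordered:
        running += value
        rates.append(running / total)
    return rates


def components_for_threshold(eigenvalues, threshold=0.85):
    """Smallest number of leading components whose CPV reaches the threshold."""
    rates = cumulative_contribution(eigenvalues)
    for count, rate in enumerate(rates, start=1):
        if rate >= threshold:
            return count
    return len(rates)


# Hypothetical eigenvalues, chosen only to illustrate the rule.
example_eigenvalues = [4.0, 3.0, 2.0, 1.0]
cpv = cumulative_contribution(example_eigenvalues)
kept = components_for_threshold(example_eigenvalues, threshold=0.85)
print(cpv, kept)
```

With these example values, two components reach only 70% of the variance, so three components are kept to satisfy the 85% threshold.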
The specific procedure for PCA dimensionality reduction of the 80 sets of 19-dimensional data is as follows. A sample library Mysample is built from the 80 groups of sample data, and the dimensions dim1 to dim19 are defined as dim1 = Mysample(:,1), dim2 = Mysample(:,2), and so on. The diagonal of the covariance matrix holds the variance of each dimension, and the full matrix is obtained by calling the cov function, cov(Mysample). The result can be verified by centering the sample matrix, x = Mysample - repmat(mean(Mysample), 80, 1), where the row count 80 matches the number of samples. The eigenvalues and eigenvectors of the covariance matrix C are then found with [eigenvectors, eigenvalues] = eig(C), and components are selected to compose the pattern vector. The data after dimensionality reduction is FinalData = rowFeatureVector * rowDataAdjust, where the first term is the transpose of the pattern vector, with the eigenvectors ordered from the largest eigenvalue to the smallest, and the second term is the transpose of the mean-centered data. The cumulative contribution is calculated and the less influential dimensions are discarded while keeping the cumulative contribution to variance greater than 0.85. Finally, the mean value of the complex correlation coefficient is calculated and combined with the CPV algorithm to jointly determine the number of principal elements.
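The procedure above can be sketched in plain Python. This is an illustrative sketch and not the authors' code: it replaces the eig call with power iteration and deflation so that it needs only the standard library, and the function names and the tiny 2-dimensional data set are hypothetical.

```python
import math
import random


def center(samples):
    """Subtract the per-dimension mean from every sample (rows are samples)."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in samples]


def covariance(centered):
    """Sample covariance matrix with the usual 1 / (n - 1) normalization."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenpairs(matrix, k, iterations=500, seed=0):
    """Leading k eigenpairs of a symmetric PSD matrix (power iteration + deflation)."""
    rng = random.Random(seed)
    d = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-15:  # nothing left to extract
                break
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(work[i][j] * v[j] for j in range(d)) for i in range(d))
        pairs.append((lam, v))
        # Deflation: remove the component that was just found.
        for i in range(d):
            for j in range(d):
                work[i][j] -= lam * v[i] * v[j]
    return pairs


def project(centered, vectors):
    """Scores of the centered samples on the selected eigenvectors."""
    return [[sum(x * e for x, e in zip(row, vec)) for vec in vectors]
            for row in centered]


# Hypothetical data lying exactly on the line y = 2x.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
centered = center(data)
eigenpairs = top_eigenpairs(covariance(centered), k=1)
scores = project(centered, [vec for _, vec in eigenpairs])
print(eigenpairs[0], scores)
```

Because the example data lie on a single line, one component captures all of the variance and the second eigenvalue is zero. On real data, the number of retained components would come from the CPV and MCC criteria described in the text.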
To diagnose electrical equipment faults accurately, this chapter proposes an electrical equipment fault diagnosis model (ISFD-POPS) based on PCA and an SVM with optimized parameters. First, typical fault samples are selected by screening the fault information of the electrical equipment and the equipment state discriminative decision library. These samples are divided into a training set and a test set, and PCA dimensionality reduction is applied so that as many fault features as possible are retained without affecting the overall diagnosis of the model. The MC-HBT classifier is then constructed according to the specific structure of the electrical equipment, and the Gaussian radial basis function (RBF) is selected as the kernel function of the ISFD-POPS model. Finally, the parameters of the kernel function are optimized using ICA, and the MC-HBT classifier is trained on the training set to produce a classifier with optimized parameters.
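For orientation, a Gaussian RBF kernel can be sketched as follows. The kernel equation used in the paper is not reproduced in this text, so the code assumes the common parameterization exp(-gamma * ||x - y||^2). The name gamma and the sample values are assumptions for illustration, and gamma stands in for the kind of kernel parameter that ICA would tune.

```python
import math


def rbf_kernel(x, y, gamma):
    """Gaussian RBF kernel, assumed form exp(-gamma * ||x - y||^2) with gamma > 0."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, gamma):
    """Kernel matrix of all pairwise similarities, as used when training an SVM."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]


# Hypothetical 2-dimensional samples.
samples = [[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]]
K = gram_matrix(samples, gamma=0.5)
print(K)
```

A larger gamma makes the kernel more local, so distant samples look less similar. This is why the classifier is sensitive to the kernel parameter and why the paper optimizes it.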
The ISFD-POPS model is shown in Fig. 2.

ISFD-POPS model
PCA reduces high-dimensional data to a lower dimension while retaining the most important features of the original data and removing noise and correlated features, which improves data processing efficiency and reduces time cost [28]. In this section, PCA is used to reduce the dimensionality of the fault samples. Applying PCA to fault diagnosis generally follows these steps:
Assume that there are
First homogenize the matrix
Based on the homogenized matrix
Next, the eigenvalues
Select the eigenvectors
Where, the larger
Thus the matrix after dimensionality reduction can be obtained:
Calculate the contribution rate of all principal component vectors in turn, and the contribution rate
The core goal of an SVM is to find, through training, an optimal separating hyperplane for linearly separable problems, which makes it a typical binary classifier.
The nonlinear transformation of the SVM proceeds as follows. First, the SVM nonlinear mapping function is constructed, as shown in equation (18).
Next, the Lagrange function is constructed, as shown in equation (19).
From the Lagrange function, its dual form can be obtained as:
Thus the objective function can be obtained, as shown in Eq. (21).
Where
Like many population-based optimization algorithms, the imperialist competitive algorithm (ICA) starts by randomly generating a number of initial solutions and then searches among them for the optimal solution by optimizing the objective function. In this section, the RBF kernel function is selected as the kernel function for the ISFD-POPS model.
Where,
In this section, a multi-class classifier based on a hierarchical binary tree (MC-HBT) is proposed and used as the basis for fault diagnosis in substations. The MC-HBT classifier is shown in Fig. 3 (a: Normal. b: Circuit breaker steal trip. c: Control circuit disconnection fault. d: Operating mechanism oil (gas) pressure low blocking trip fault. e: Hydraulic mechanism oil pump pumping timeout. f: Circuit breaker interrupter room explosion. g: Circuit breaker refuses to open on protection action. h: SF6 gas pressure low fault).

MC-HBT classifier
To achieve accurate fault diagnosis, the ISFD-POPS model proposed in this section needs to be trained several times. The fault information of the electrical equipment is screened, the extracted data are divided into training and test sets, the RBF kernel function is selected, and ICA is used to optimize the kernel function parameters to generate the MC-HBT classifier. The overall operation steps are as follows:
Step 1: The initial sample set is linearly transformed using PCA, and the cumulative contribution rate after the transformation is used to select the principal components.
Step 2: The linearly transformed sample set is reduced in dimension using the selected principal components to obtain a new sample set. Training focuses on the new samples, and the objective function is formed from the new sample set and the SVM with unknown parameters.
Step 3: ICA is used to find the optimized solution of the objective function and determine the optimized SVM parameters for fault classification of electrical equipment, and the optimized parameters are used to build the first-layer classifier.
Step 4: Following the order of the MC-HBT classifiers, the training set is classified sequentially, and all classifiers are built according to the method in Step 3 to form the MC-HBT classifier model for fault identification.
Step 5: The test samples are input into the ISFD-POPS model and their predicted values are calculated.
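The idea of a hierarchical binary tree classifier can be sketched as follows. Each internal node holds one binary decision and each leaf holds a class label, so a multi-class decision becomes a short sequence of binary ones. The tree layout, the class names, and the threshold rules are hypothetical stand-ins; in the paper each node would be a trained SVM with ICA-optimized parameters.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Leaf:
    label: str  # final class decided at this leaf


@dataclass
class Node:
    decide: Callable[[Sequence[float]], bool]  # binary classifier at this node
    if_true: Union["Node", Leaf]
    if_false: Union["Node", Leaf]


def classify(tree, sample):
    """Walk from the root to a leaf, applying one binary decision per level."""
    while isinstance(tree, Node):
        tree = tree.if_true if tree.decide(sample) else tree.if_false
    return tree.label


# Hypothetical two-level tree: first separate "normal" from faulty samples,
# then split the faulty samples into two fault classes.
tree = Node(
    decide=lambda s: s[0] < 0.5,  # stand-in for the first-layer SVM
    if_true=Leaf("normal"),
    if_false=Node(
        decide=lambda s: s[1] < 0.5,  # stand-in for the second-layer SVM
        if_true=Leaf("fault_b"),
        if_false=Leaf("fault_c"),
    ),
)

print(classify(tree, [0.1, 0.9]), classify(tree, [0.9, 0.1]), classify(tree, [0.9, 0.9]))
```

One property of this layout is that K classes need only K - 1 binary classifiers, and an error made at an upper layer cannot be corrected lower down, which is one reason the order of the classifiers matters.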
The overall flow of the ISFD-POPS model is shown in Figure 4.

ISFD-POPS model flow
The types of electrical equipment faults studied in this section are selected on the basis of two main considerations. First, the gradual faults of electrical equipment regulating valves are highly uncertain and not well suited to fault diagnosis studies, so the subsequent fault diagnosis mainly studies abrupt faults. Second, considering the probability of each fault occurring in actual operation, the study focuses on faults with a higher probability of occurrence. On this basis, five faults are studied: F1 valve clogging fault, F7 fluid overheating evaporation or critical flow fault, F8 actuator twisting fault, F10 membrane head perforation fault, and F14 pressure sensor fault. The simulation diagrams of the five faults (F1, F7, F8, F10, F14) are shown in Fig. 5 to Fig. 9. In the figures, CV represents the output signal amplitude of the external controller, P1 and P2 represent the pre-valve and post-valve pressure of the control valve, x represents the amplitude of the stem displacement, F represents the amplitude of the medium flow rate in the main pipeline, and T1 represents the amplitude of the fluid temperature, along with the diagnostic fault signal variable.

Signal amplitude changes when F1 occurs

Signal amplitude changes when F7 occurs

Signal amplitude changes when F8 occurs

Signal amplitude changes when F10 occurs

Signal amplitude changes when F14 occurs
Both the training dataset and the test dataset were subjected to PCA dimensionality reduction. Principal components were selected at a 95% cumulative contribution rate, which reduced the original 6-dimensional data to 3 dimensions. The number of samples in both the training and test datasets is set to 550, and each row of the result tables sums to 550. The fault diagnosis results of PCA-PSO-SVM and PCA-SVM are shown in Table 1 and Table 2. The fault diagnosis accuracy of the traditional method is 54.36%, and it increases to 62.84% after applying the ISFD-POPS model proposed in this paper. The AUC value rises from 0.92 before applying the ISFD-POPS model to 0.99 after applying it. This indicates that the improved method outperforms the original one.
Before using the model presented in this article (rows: actual fault type; columns: predicted fault type; classes numbered in table order)
Actual \ Predicted | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
1 | 550 | 0 | 0 | 0 | 0 |
2 | 0 | 550 | 0 | 0 | 0 |
3 | 195 | 0 | 95 | 86 | 174 |
4 | 256 | 0 | 81 | 138 | 75 |
5 | 193 | 0 | 140 | 55 | 162 |
After using the model presented in this article (rows: actual fault type; columns: predicted fault type; classes numbered in table order)
Actual \ Predicted | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
1 | 523 | 0 | 27 | 0 | 0 |
2 | 0 | 550 | 0 | 0 | 0 |
3 | 63 | 0 | 205 | 5 | 277 |
4 | 78 | 0 | 134 | 275 | 63 |
5 | 65 | 0 | 278 | 32 | 175 |
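The reported accuracies can be checked directly from the two result tables. The sketch below copies the table entries and computes overall accuracy as the sum of the diagonal divided by the total number of samples. The per-class recall helper is an addition for illustration and is not a metric reported in the paper.

```python
def accuracy(confusion):
    """Overall accuracy: correctly classified samples over all samples."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_recall(confusion):
    """Fraction of each actual class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(confusion)]


# Rows are actual fault types, columns are predicted fault types.
before = [
    [550, 0, 0, 0, 0],
    [0, 550, 0, 0, 0],
    [195, 0, 95, 86, 174],
    [256, 0, 81, 138, 75],
    [193, 0, 140, 55, 162],
]
after = [
    [523, 0, 27, 0, 0],
    [0, 550, 0, 0, 0],
    [63, 0, 205, 5, 277],
    [78, 0, 134, 275, 63],
    [65, 0, 278, 32, 175],
]

print(f"before: {accuracy(before):.2%}")  # before: 54.36%
print(f"after:  {accuracy(after):.2%}")   # after:  62.84%
print(per_class_recall(after))
```

The per-class view shows that the gain comes mainly from the third and fourth classes, while the last three classes are still frequently confused with one another.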
In this paper, the Extreme Learning Machine (ELM) model is used to predict the characteristic gas content inside the electrical equipment, so as to accurately predict future failures of the electrical equipment.
The extreme learning machine (ELM) is a single-hidden-layer feedforward neural network algorithm. With the rapid rise of neural networks in recent years, more and more neural network algorithms have been applied to transformer fault prediction. Because of its strong learning ability, the single-hidden-layer network quickly became a research hotspot once it was applied. The algorithm offers strong learning ability, fast learning speed, and strong generalization ability [29].
The theoretical background of the algorithm of ELM as a mathematical foundation is as follows: Interpolation Theorem The interpolation theorem states that the SLFNs of neurons of any By this SLFNs in the process of approximating the sample without mean error, also to satisfy Let Among them:
Provide a data set (
However, because the interpolation theorem has a certain oscillatory nature, which reduces the generalization ability of the algorithm, the approximation ability of the hidden-layer neurons is also investigated. Assuming that there exists
According to the operational definition of the function norm, the formula used to describe the distance between the objective function
Given a function
The two theorems above show that, through training and adjustment of its parameters, the extreme learning machine can approximate any function. However, they only establish this approximation capability for activation functions of the incremental or RBF type and do not indicate how to obtain the optimal parameters. Many scholars have therefore studied the optimization of these parameters and proposed corresponding algorithms. This paper uses the improved sparrow search algorithm to optimize these parameters, so that the extreme learning machine model is better suited to predicting the dissolved gas content of the transformer.
ELM model. The bias vectors between the input and hidden layers and the weight vectors between the hidden and output layers in SLFNs need to be determined. For the connection weight vector ||
Based on the above description, the ELM algorithm can be summarized in the following steps:
Given a training set (
Randomly generate the parameters (
Compute the output matrix
Compute the connection weight vector
Generation algorithm
The structure of the extreme learning machine is shown in Fig. 10.

Structure of extreme learning machine
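The ELM training steps described above (random hidden-layer parameters, hidden-layer output matrix, output weights solved in one shot) can be sketched as follows. This is a minimal standard-library sketch under stated assumptions, not the authors' implementation: it solves ridge-regularized normal equations instead of computing a Moore-Penrose pseudo-inverse, the hidden-layer parameters are drawn uniformly from [-1, 1] rather than optimized by FA-ISSA, and all names and the toy data are hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def hidden_output(samples, weights, biases):
    """Hidden-layer output matrix: one row per sample, one column per hidden node."""
    return [[sigmoid(sum(w * x for w, x in zip(node_w, sample)) + b)
             for node_w, b in zip(weights, biases)]
            for sample in samples]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def train_elm(samples, targets, hidden_nodes, ridge=1e-4, seed=0):
    """Random hidden layer, then output weights from one linear solve."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
               for _ in range(hidden_nodes)]
    biases = [rng.uniform(-1.0, 1.0) for _ in range(hidden_nodes)]
    h = hidden_output(samples, weights, biases)
    # Assumption: (H^T H + ridge * I) beta = H^T T replaces the pseudo-inverse.
    gram = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
             for j in range(hidden_nodes)] for i in range(hidden_nodes)]
    moment = [sum(row[i] * t for row, t in zip(h, targets))
              for i in range(hidden_nodes)]
    return weights, biases, solve(gram, moment)


def predict_elm(model, samples):
    weights, biases, beta = model
    return [sum(b * v for b, v in zip(beta, row))
            for row in hidden_output(samples, weights, biases)]


# Hypothetical toy problem: fit y = x^2 on 20 points in [0, 1].
samples = [[i / 19.0] for i in range(20)]
targets = [s[0] ** 2 for s in samples]
model = train_elm(samples, targets, hidden_nodes=5)
predictions = predict_elm(model, samples)
print(max(abs(p - t) for p, t in zip(predictions, targets)))
```

Because only the output weights are learned, training reduces to a single linear solve. The quality of the fit then depends on the random hidden-layer parameters, which is the sensitivity that the paper addresses by optimizing them.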
From the above, the advantages of the extreme learning machine algorithm are as follows: training is fast, since the network is trained only once with no repeated training, and the algorithm generalizes well and does not easily fall into a local optimum. Its disadvantages are as follows: as a neural network, it can still overfit, and the random selection of input weights and hidden-layer biases can make the algorithm less effective on a particular problem.
From the theory, network structure, and advantages and disadvantages of ELM described above, the choice of connection weights and biases has a decisive effect on model performance. If the optimal biases and weights can be chosen according to the characteristics of each input data set, the model will perform better. This paper therefore uses FA-ISSA to optimize these two sets of parameters and further improve the prediction accuracy of the algorithm.
In this paper, the contents of the five gases,
Where
In this chapter, the improved extreme learning machine is used as the model for training and testing. The root mean square error, mean absolute error, and mean absolute percentage error are selected as the evaluation criteria for the prediction accuracy of the model. The sigmoid function is chosen as the activation function of the hidden layer. The connection weights between the input layer and the hidden layer and the biases of the hidden-layer neurons, as optimized by FA-ISSA, are used as the initial network weights and biases. The values at previous moments are used as inputs to predict the value at the current moment. That is to say, the value from 1 to

Prediction model
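Using previous values as inputs to predict the current value amounts to building sliding windows over the time series. The exact window definition is cut off in this text, so the sketch below assumes the simplest form: m consecutive values form one input, and the value that follows is the target. The function name is hypothetical, and the example uses the first six H2 readings from the gas data table in this article.

```python
def make_windows(series, m):
    """Pair each run of m consecutive values with the value that follows it."""
    inputs, targets = [], []
    for start in range(len(series) - m):
        inputs.append(series[start:start + m])
        targets.append(series[start + m])
    return inputs, targets


# First six H2 readings from the transformer gas data table.
h2 = [22.1, 22.3, 21.6, 21.1, 19.9, 20.2]
inputs, targets = make_windows(h2, m=3)
for window, target in zip(inputs, targets):
    print(window, "->", target)
```

A series of length n yields n - m training pairs, so a larger dimension m leaves fewer pairs for training, which matters for a data set with only 22 points.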
In this paper, 14 groups of data on five characteristic gases measured for transformer No. 2 in a 110 kV step-up substation of a nuclear power plant are used as the fault characterization data of the electrical equipment, to verify the accuracy of the prediction model and to predict whether this transformer has a fault. Since the 14 sets of data are not sampled at equal intervals, Newton interpolation is used to preprocess them into equal-interval data. The fault data is thereby supplemented to 22 groups, 20 of which are used as the training set, while the last 2 groups of actually measured data are used as the test set, and the transformer fault characterization gas data (
Transformer fault characteristic gas data(
Time | H2 | C2H6 | C2H2 | C2H4 | CH4 |
---|---|---|---|---|---|
Supplementary data | 22.1 | 110.1 | 0 | 67.6 | 32 |
2022/08/11 | 22.3 | 109.2 | 0 | 67.8 | 31.6 |
2022/09/01 | 21.6 | 107.8 | 0 | 66.7 | 31.7 |
2022/09/08 | 21.1 | 106.7 | 0 | 66.6 | 31.2 |
2022/10/11 | 19.9 | 106.5 | 0 | 64.5 | 30.6 |
Supplementary data | 20.2 | 104.8 | 0 | 64.2 | 30.9 |
2022/12/25 | 18.2 | 103.9 | 0 | 62.6 | 30.5 |
Supplementary data | 19.1 | 104.7 | 0 | 62.9 | 30.5 |
Supplementary data | 18.8 | 103.7 | 0 | 63.4 | 30.4 |
2023/01/01 | 18.6 | 102.4 | 0 | 62.5 | 30.7 |
Supplementary data | 18.7 | 102.7 | 0 | 62.5 | 31 |
2023/02/12 | 18.7 | 102.5 | 0 | 61.9 | 30.2 |
Supplementary data | 19 | 103.9 | 0 | 62.5 | 31.6 |
2023/04/12 | 19.3 | 103.5 | 0 | 63 | 31.5 |
2023/05/01 | 19.5 | 103.8 | 0 | 63.2 | 31 |
Supplementary data | 19.6 | 101.9 | 0 | 63.5 | 29.5 |
Supplementary data | 19.9 | 102.5 | 0 | 63.2 | 31 |
2023/07/02 | 21.2 | 104.5 | 0 | 64.5 | 31.5 |
2023/08/11 | 21.2 | 105.1 | 0 | 64.2 | 31.5 |
2023/10/11 | 21.6 | 108.6 | 0 | 64.4 | 30 |
2024/01/01 | 21.9 | 110.6 | 0 | 65.3 | 32.6 |
2024/01/15 | 22 | 115.2 | 0 | 65.1 | 33.3 |
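Newton interpolation, which the text uses to fill in the supplementary rows, can be sketched with divided differences. The text does not state which measurements or which polynomial degree were used for each supplementary value, so the example below is only an assumption: it takes three adjacent measured H2 readings from the table (2022/08/11, 2022/09/01, 2022/09/08), expresses their dates as day offsets 0, 21, and 28, and evaluates the interpolating polynomial at an intermediate day.

```python
def divided_differences(xs, ys):
    """Coefficients of the Newton form of the interpolating polynomial."""
    coeffs = list(ys)
    n = len(xs)
    for level in range(1, n):
        # Update in place from the end so lower-order differences stay available.
        for i in range(n - 1, level - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - level])
    return coeffs


def newton_eval(xs, coeffs, x):
    """Evaluate the Newton polynomial at x using nested multiplication."""
    result = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[i]) + coeffs[i]
    return result


# Day offsets from 2022/08/11 and the measured H2 values on those dates.
days = [0.0, 21.0, 28.0]
h2_values = [22.3, 21.6, 21.1]
coeffs = divided_differences(days, h2_values)
estimate = newton_eval(days, coeffs, 14.0)  # hypothetical intermediate day
print(estimate)
```

The polynomial reproduces the measured values exactly at the measured dates. High-degree interpolation over many unevenly spaced points can oscillate, so using a few neighboring points per gap, as assumed here, is a common precaution.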
Based on the H2 gas content data in the table, the optimal value and the optimization error are derived using the electrical equipment fault prediction model based on the improved extreme learning machine. The prediction results for hydrogen (H2) at different dimensions are shown in Table 4. As the table shows, the parameter optimization error drops significantly at dimension 5 and reaches a minimum of 0.00241 at dimension 6, after which it stays close to the minimum. To reduce computational complexity, parameter optimization is not carried out at higher dimensions, and the optimal dimension and parameters found here are used as references when optimizing for the other gases.
The prediction of hydrogen (H2) in different dimensions
Dimension m | Parameter C | Parameter g | Parameter | Parameter optimization error |
---|---|---|---|---|
2 | 6 | 220.3 | 0.000945 | 0.25 |
3 | 25.6 | 125.4 | 0.08 | 0.18 |
4 | 9.5 | 255.1 | 0.000945 | 0.344 |
5 | 72.14 | 46.8 | 0.145 | 0.0051 |
6 | 55.31 | 91.3 | 0.000945 | 0.00241 |
7 | 53.64 | 75.6 | 0.00088 | 0.00575 |
The fault feature prediction model is built according to the optimal dimension and optimal parameters of H2, and the 20 sets of training data are used to predict the 2 sets of test data. The H2 prediction results are shown in Fig. 12. As the figure shows, the last two H2 data points are test data and are close to the actual data: their relative errors (RE) are 3.51% and -3.68%, and the mean absolute percentage error (MAPE) is 3.3%, which is relatively small and basically meets the prediction requirements.

H2 prediction
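The error measures named in this section can be sketched as follows. The defining formulas are not reproduced in this text, so the code assumes the usual definitions, with the relative error signed as (predicted - actual) / actual. The sample values are hypothetical and do not reproduce the results reported in the paper.

```python
import math


def relative_errors(actual, predicted):
    """Signed relative error of each prediction, in percent."""
    return [(p - a) / a * 100.0 for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    errors = relative_errors(actual, predicted)
    return sum(abs(e) for e in errors) / len(errors)


def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical values, for illustration only.
actual = [20.0, 25.0]
predicted = [21.0, 24.0]
print(relative_errors(actual, predicted))
print(mape(actual, predicted), mae(actual, predicted), rmse(actual, predicted))
```

With these values the relative errors are about 5% and -4%, so the MAPE is about 4.5%. Because MAPE averages absolute values, errors of opposite sign do not cancel.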
To further verify the effectiveness of the prediction model, the H2 training set is also fed into the constructed prediction model as a test set, and the comparison between the actual training data and the predicted data is shown in Fig. 13. As the figure shows, the training samples are too few to fully reflect the coupling relationship between the characteristic gases, and the prediction model itself has limited accuracy. As a result, each predicted value differs from the corresponding actual value, and the error grows over time. Nevertheless, the figure also shows that the predicted data follow the same trend as the actual data and that the growth in error is very small. This means that when the prediction model established in this paper is applied to unknown data, both the values and the trend stay close to the actual ones. The predicted data can then be fed into the established electrical equipment diagnostic model to determine whether the equipment may have a thermal fault, which helps maintenance personnel discover potential faults and maintain the equipment as early as possible to avoid failures.

The actual data of the training set is compared to the forecast data
A fault characteristic prediction model is built for each gas according to its optimal dimension and optimal parameters, and a BP neural network is used to predict the same characteristic data for comparison. The results are shown in Table 5. Since the C2H2 values in the given fault data are all 0, C2H2 is not considered. As the table shows, the BP prediction model is more accurate than the model proposed in this paper only for C2H6. For the other gases its accuracy is lower than that of the SVR prediction model, which supports the accuracy of the prediction model constructed in this paper. The prediction accuracy for H2 is also clearly lower than for the other gases, possibly because the parameters were not optimal when the model was constructed, or because the fitted H2 curve in the equal-interval data is not accurate enough and needs further adjustment. Overall, the table shows that the prediction model established in this paper basically meets the prediction requirements, with high accuracy and strong practicality.
Test results
Name | Time | Actual value | SVR predicted value | BP predicted value | RE% of SVR | RE% of BP | MAPE% of SVR | MAPE% of BP |
---|---|---|---|---|---|---|---|---|
H2 | 1.39 | 20.51 | 20.937 | 20.902 | 4.12 | 5.33 | 4.04 | 5.14 |
H2 | 2.01 | 20.59 | 20.96 | 20.72 | -3.99 | 5.32 | | |
C2H6 | 0.67 | 111.8 | 111.503 | 111.812 | 0.66 | -0.079 | 1.925 | 0.301 |
C2H6 | 1.89 | 113.52 | 112.57 | 112.43 | 1.91 | 1.46 | | |
C2H4 | 1.48 | 64.41 | 64.12 | 64.95 | 1.145 | -2.2 | 0.9975 | 1.735 |
C2H4 | 2.31 | 64.89 | 64.285 | 64.89 | -0.59 | 2.9 | | |
CH4 | 0.98 | 31.86 | 32.122 | 31.924 | 2.5 | 3.6 | 2.502 | 3.43 |
CH4 | 2.47 | 31.6 | 31.74 | 32.18 | 1.184 | -3.19 | | |
This article studies fault diagnosis and fault prediction models for electrical equipment, and uses example analysis to demonstrate the practicality of the two proposed models.
Using the prediction model proposed in this paper to predict the fault characteristics of H2, the final predictions are close to the actual results and the error is relatively small, with relative errors (RE) of 3.51% and -3.68% and a mean absolute percentage error (MAPE) of 3.3%.
In the fault characteristic prediction experiments built on the optimal dimensions and optimal parameters of the different gases, the BP prediction model is more accurate than the model proposed in this paper only for C2H6. For the remaining gases, its accuracy is lower than that of the electrical equipment fault prediction model constructed in this paper on the improved extreme learning machine, which shows that the proposed model has good prediction performance.