Application and Optimization of Equipment Abnormality Early Warning and Emergency Response Mechanism in Power Supply Service
Published online: 19 Mar 2025
Received: 26 Oct 2024
Accepted: 02 Feb 2025
DOI: https://doi.org/10.2478/amns-2025-0524
© 2025 Chaofan Hou et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Electricity is core urban infrastructure and an important energy source for keeping a city running. Electric energy bears on national development and people's livelihoods and has penetrated all aspects of social development; a stable power supply is an important guarantee for safe production across society, for public well-being, and for high-quality economic growth [1-2]. Urban power supply is an important part of urban public utilities, and emergency management of urban power supply provides basic protection for the emergency management of those utilities [3-4]. The emergency response to public power emergencies therefore bears on whether society develops smoothly, harmoniously, and safely, and on the construction and sustainable development of the city [5].
At present, extreme weather, natural disasters, public health events, and public opinion events all occur. Internally, emergencies in electric power network and information systems may arise from software and hardware failures, aging, overloaded operation, and personnel violations [6-8]. Externally, unexpected events may include malicious attacks, infiltration, and theft of application data by hostile forces and lawbreakers, as well as damage to system data and infrastructure caused by natural disasters and external forces [9-11]. At the same time, with the continued rollout of unattended substations and centralized monitoring, power monitoring systems, industrial control systems, and automation systems are exposed to malicious intrusion and attack, any of which may trigger a public power emergency [12-13].
To keep the power system running smoothly in emergencies, sound emergency management of the power grid is important. However, research on public power emergencies still focuses mostly on emergency policies and plans at the level of government and social organizations, and research on emergency management at the level of power technology is lacking. The public emergency response mechanism for urban power emergencies is not yet mature, and its operability under local conditions is weak [14-17]. As a result, the handling of power emergencies is prone to delayed repairs, outbreaks of negative public opinion, social panic, and other problems that directly affect the health and stability of the social economy [18-21]. Optimizing the early warning and emergency response mechanism for public power emergencies is therefore a promising avenue for progress and can help safeguard economic development.
China's power grid public crisis management focuses on disposal mechanisms, the study of contingency plans, and the optimization of resource dispatching, security precautions, and other handling measures. Guan, C. et al. introduced the relevant regulations and specific requirements of the power emergency response mechanism, improved the knowledge system of power emergency command, and clarified the responsibilities of institutions, equipment, and other elements, designing the power emergency response system from the perspective of top-level decision-making [22]. Wu, Y. et al. formulated an urban distribution network contingency plan for high-impact, low-probability natural disasters; it is based on a multi-stage resilience enhancement framework that strengthens risk prevention and control using historical data and effectively reduces the second-order impacts of disasters [23]. Schmitz, M. et al. investigated emergency order dispatching for electric power maintenance as a scheduling problem with islanding operation, establishing an optimization model so that the schedule achieves the lowest travel cost and impact consequences while meeting the power constraints [24]. Wang, Y. et al. established a rural emergency demand response mechanism with the goal of utility maximization, which rapidly reduces distributed loads while meeting users' active demand and improves the stability of the rural power system [25]. However, most current research on emergency response to public power emergencies is biased towards policy and plan design, and automated monitoring and early warning of power grids in the smart era has received less attention.
Against this background, some scholars have studied the integration of power systems and emergency management from the perspective of power engineering. Huang, G. et al. constructed a two-stage robust mixed-integer optimization model and applied it to a resilient response framework for the smart grid, combining situational awareness with resilience enhancement to achieve fast power system response in both preventive states and contingency scenarios [26]. Cui, H. et al. examined the role of the optimal load allocation policy in emergency demand response (EDR) for thermoelectric microgrids; the policy generates real-time load shedding schedules tailored to EDR events and optimizes the peak loads of thermoelectric microgrids [27]. Huang, Q. et al. designed a deep adaptive emergency control scheme for power systems based on a reinforcement learning platform for grid control, which can simulate distribution network scenarios from equipment parameters and significantly improves the adaptability of power system emergency control [28]. Wang, J. et al. proposed an overload tolerance prediction and heavy-load warning method for transmission equipment, which keeps power equipment from reaching an overload state, ensures its operational safety, and supports the power transfer capability needed for system safety in emergencies [29]. Amroune, M. developed an emergency demand response strategy for power systems based on the whale optimization algorithm, which reduces the risk of supply instability by keeping the voltage stability margin within an acceptable range during emergencies [30]. Xiao, T. et al. investigated a real-time decision-making emergency control system for power grids, which strengthens grid reliability by combining a stability restoration criterion with a fast decision-making algorithm for grid faults, so that the control strategy for contingency conditions can be determined in a short time [31]. However, the above schemes lack robustness in the face of unexpected grid events characterized by uncertainty and variability, and their application and optimization in specific power supply services still need improvement.
To meet the higher demands currently placed on equipment abnormality warning and emergency response mechanisms in power supply service, this paper proceeds as follows. For equipment abnormality warning, it first uses digital twin technology to visualize the fault state, and then collects and transmits real-time data from faulty equipment through the MQTT protocol. Finally, two methods, multivariate correlation modeling and multivariate time series reconstruction, are applied to the discriminator of the anomaly detection model. The three are combined to build an equipment anomaly early warning model that can raise prompt and effective alarms. The timeliness and accuracy of the model are tested experimentally. Building on the operation of the anomaly detection and early warning model, the paper then lays out a systematic emergency response and disposal process and analyzes the performance of the response mechanism in an actual case.
This chapter addresses three aspects: visualization of the equipment fault state, collection of real-time equipment data, and accurate fault warning. It introduces, in turn, digital twin technology, the MQTT protocol, and a detection model discriminator based on multivariate correlation modeling and multivariate time series reconstruction, and combines the three to construct the anomaly detection and warning model of this paper.
From an application point of view, digital twin technology is a strong driver for the development of equipment anomaly warning. Realizing equipment anomaly early warning requires modeling the existing power supply elements (people, machines, materials, methods, environment, etc.) and evaluating the associated risk factors on the basis of these elements. In practice, for example, comparing the state of the equipment model with the state of the actual equipment at the next moment makes it possible to predict the production quality of the product. The data predicted by the equipment model are also compared with the actual data, and the early warning activities are adjusted and optimized in light of the actual situation to achieve a stable, continuous, and high-quality power supply service. An accurate assessment of risk factors requires a reasonable and comprehensive element model, and digital twin technology is well suited to constructing such models. Once the anomaly detection model has been established with digital twin technology, it can help the power supply industry monitor and adjust the power supply process in real time, improve the stability and service life of power supply equipment, and thereby provide better power supply services.
In this paper, the data collection system of the device is designed based on the MQTT protocol. The architecture of this data collection system consists of three main parts: the data publisher (device entity and MQTT client), the MQTT proxy server, and the data subscriber (system client and virtual device). During system operation, data publishers and data subscribers can exchange roles as the situation requires. In other words, both the data publisher and the data subscriber can publish information and subscribe to information.
The main functions of each component of the system are as follows.
Data publisher (device entity and MQTT client). It is responsible for packaging the data to be sent and publishing it to the specified topic on the MQTT proxy server.
MQTT proxy server (MQTT server). It is responsible for forwarding the data sent by the data publisher to the data subscriber. The MQTT proxy server is the core of information processing in the data collection system and both receives and sends information.
Data subscriber (system client and virtual device). It is responsible for subscribing to a specified topic on the MQTT proxy server to obtain the information received by that topic. After receiving the information, the data subscriber unpacks it.
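The three roles above can be illustrated with a small in-memory sketch. This is not an MQTT implementation and involves no network traffic: `ToyBroker`, the topic names, and the JSON payload layout are all hypothetical, and only exact-match topics are supported (real MQTT brokers also handle wildcards, QoS levels, and sessions).

```python
import json
from collections import defaultdict


class ToyBroker:
    """In-memory stand-in for the MQTT proxy server (exact-match topics only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, payload: bytes):
        # Forward the packed payload to every subscriber of this topic.
        for callback in self._subscribers[topic]:
            callback(topic, payload)


def pack(reading: dict) -> bytes:
    """Publisher side: package a reading before sending (assumed JSON format)."""
    return json.dumps(reading).encode("utf-8")


def unpack(payload: bytes) -> dict:
    """Subscriber side: unpack a received message."""
    return json.loads(payload.decode("utf-8"))


broker = ToyBroker()

# Data subscriber (e.g. the virtual device) subscribes and unpacks messages.
received = []
broker.subscribe("plant/fan1/telemetry", lambda t, p: received.append(unpack(p)))

# Data publisher (device entity) packs a reading and publishes it to the topic.
broker.publish("plant/fan1/telemetry", pack({"bearing_temp_c": 61.5, "current_a": 42.0}))

# Roles can swap: the device side subscribes to a command topic
# and the system side publishes to it.
commands = []
broker.subscribe("plant/fan1/commands", lambda t, p: commands.append(unpack(p)))
broker.publish("plant/fan1/commands", pack({"action": "resync"}))
```

In a deployment the broker would be a real MQTT server and the callbacks would be client handlers; the sketch only shows how topics decouple publishers from subscribers.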
In this paper, data caching refers to caching in a broad sense. Compared with disk storage, caching has advantages in transmission stability, access speed, and availability. In use, the cache stores frequently needed data and thereby avoids frequent create, read, update, and delete operations on the database, which improves the stability of information transmission to a certain extent. When MySQL or another persistence service is temporarily unavailable, the cache can continue to serve information access requests for a period of time, which enhances the availability of data transmission. Redis is currently one of the most widely used caching services; it supports a variety of data structures and can persist data. Because data errors inevitably occur while devices are operating, the data transmission function must be able to trace data, and the persistence capability of Redis meets this need. This paper therefore proposes a Redis-based data transmission method, whose overall process is shown in Figure 1.

Figure 1. Data transmission process
When the real-time data of the device are collected, the Redis database is used as temporary storage, and the real-time data are passed to the virtual device through Redis so that the operating status of the virtual device is updated promptly. When the system receives new device data, it applies simple processing and stores the data as historical data in the MySQL database, and then replaces the real-time sensor data held in the MySQL database with the new device data, so that the real-time device data shown on the system's display page are updated. This transfer method makes effective use of the strengths of both Redis and MySQL: Redis, with its fast read/write speed, meets the virtual device's need for timely real-time data, while MySQL, with its large storage capacity, holds a large amount of historical device data for later statistics.
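The flow described above can be sketched with stand-ins so that it runs without external services. Here a plain dictionary plays the role of Redis and an in-memory SQLite database plays the role of MySQL; the table names, key format, and `ingest` function are assumptions made for illustration, not the system's actual schema.

```python
import json
import sqlite3

cache = {}  # stand-in for Redis: key -> latest reading as JSON
db = sqlite3.connect(":memory:")  # stand-in for MySQL
db.execute("CREATE TABLE history (device_id TEXT, ts INTEGER, payload TEXT)")
db.execute("CREATE TABLE realtime (device_id TEXT PRIMARY KEY, ts INTEGER, payload TEXT)")


def ingest(device_id: str, ts: int, reading: dict) -> None:
    payload = json.dumps(reading)
    # 1) Temporary storage: the virtual device reads its state from the cache.
    cache[f"device:{device_id}:latest"] = payload
    # 2) Append the reading to the historical data for later statistics.
    db.execute("INSERT INTO history VALUES (?, ?, ?)", (device_id, ts, payload))
    # 3) Overwrite the real-time row that backs the display page.
    db.execute("INSERT OR REPLACE INTO realtime VALUES (?, ?, ?)", (device_id, ts, payload))
    db.commit()


def virtual_device_state(device_id: str) -> dict:
    """What the virtual device would read back from the cache."""
    return json.loads(cache[f"device:{device_id}:latest"])


ingest("fan1", 1, {"current_a": 41.8})
ingest("fan1", 2, {"current_a": 42.3})

history_rows = db.execute("SELECT COUNT(*) FROM history").fetchone()[0]
realtime_rows = db.execute("SELECT ts FROM realtime").fetchall()
```

After two readings, the history table holds both rows while the cache and the real-time table hold only the newest one, which is the separation of roles the paragraph describes.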
The general steps of data preprocessing are as follows: first, extract the collected data; next, identify and repair lost data while also identifying and correcting erroneous data that clearly result from abnormal sensor readings; finally, select, integrate, and extract the features and data required by the research method and the model.
Lost data processing
First, identify whether the data set contains lost data. The wind turbine generator (WTG) data contain many cases of lost sensor data, which may result from the planned shutdown of some sensors or from failed reads. Because the losses are large, repairing them by substitution is unrealistic. However, the lost data usually occur in the nacelle variables and their internally related variables and are distributed in concentrated blocks, so removing them has no substantial impact on data selection or model training. We therefore delete the concentrated missing data and replace the scattered missing data with the mean of the values in the two adjacent time periods.
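A minimal sketch of this rule follows, assuming missing readings are marked as `None` and that a run of missing values longer than `max_gap` counts as "concentrated". The threshold `max_gap` and the function name are hypothetical; the source does not state how concentrated and scattered losses are separated.

```python
def clean_missing(series, max_gap=1):
    """Drop long runs of None; fill short runs with the mean of the two neighbours."""
    cleaned = []
    i, n = 0, len(series)
    while i < n:
        if series[i] is not None:
            cleaned.append(series[i])
            i += 1
            continue
        # Find the end of this run of missing values.
        j = i
        while j < n and series[j] is None:
            j += 1
        run_length = j - i
        has_neighbours = i > 0 and j < n
        if run_length <= max_gap and has_neighbours:
            # Scattered loss: use the mean of the adjacent valid readings.
            fill = (series[i - 1] + series[j]) / 2.0
            cleaned.extend([fill] * run_length)
        # Otherwise the run is treated as concentrated loss and dropped.
        i = j
    return cleaned


raw = [60.0, None, 62.0, 63.0, None, None, None, 70.0]
result = clean_missing(raw)  # the single gap is filled, the run of three is removed
```

Dropping rows shortens the series, so a time-series model trained on the result would also need the timestamps to know where the discontinuities are.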
The hydropower unit data set is a public data set whose data were processed by the original author before release, so it contains no lost data. However, it often contains failure points and planned shutdowns, which frequently interrupt the time series and strongly affect the construction of the time series model. We therefore selected about 34,000 records from a period in which the time intervals are relatively even, running from September to February of the following year, as the data set for the time series forecasting method.
Error data processing
The data set contains abnormal data caused by sensor failures, which must be identified and processed because erroneous data can cause the whole model to fail in application. The generator speed in the wind turbine data is usually 1000 rpm or more, and the default speed is 1000 rpm, but some readings are 0 rpm because the turbine is in a windless environment and the generator is no longer producing electricity. However, when the generator speed drops from 1000 rpm to 0 rpm, other variables such as the bearing temperature and cabin temperature do not change correspondingly, so feeding 0 rpm directly into the network would prevent it from fitting this situation. We therefore use 1000 rpm as a threshold and uniformly replace generator speed values below 1000 rpm with 1000 rpm.
Data integration and extraction processing
The WTG data set contains numerous sensor parameters. Those that can be used for fault analysis are main bearing temperature, cabin temperature, generator speed, wind speed, outdoor temperature, and non-driven bearing temperature. These variables are used as input feature parameters for training and testing the model.
The hydroelectric unit data set has six monitoring variables. Four of them are bearing vibration measured in four different directions (axial, vertical, horizontal radial, and coupling); these show a strict positive correlation, so we merge them into a single variable using the arithmetic mean. The other two variables are the real-time power of the generator and the inflow of the bearing's hydraulic lubrication unit. The three integrated variables are used as model inputs and prediction targets for training and testing.
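The two transformations described in the error-data and integration steps are simple enough to sketch directly. The function names and sample values below are illustrative only; the 1000 rpm threshold and the four-channel arithmetic mean come from the text.

```python
def floor_generator_speed(speeds_rpm, threshold=1000.0):
    """Replace generator speeds below the threshold with the threshold itself."""
    return [max(speed, threshold) for speed in speeds_rpm]


def merge_vibration(axial, vertical, horizontal, coupling):
    """Combine the four bearing-vibration channels by their arithmetic mean."""
    return [
        (a + v + h + c) / 4.0
        for a, v, h, c in zip(axial, vertical, horizontal, coupling)
    ]


# Windless readings (0 rpm) and slightly low readings are lifted to 1000 rpm.
speeds = floor_generator_speed([1200.0, 0.0, 998.5, 1000.0])

# Four strongly correlated vibration channels become one integrated variable.
vibration = merge_vibration([1.0, 2.0], [3.0, 2.0], [5.0, 2.0], [7.0, 2.0])
```

Note that flooring the speed discards the information that the turbine was idle; that is the trade-off the text accepts to keep the inputs consistent with the slowly changing temperatures.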
Previous studies have shown that, unlike with univariate time series, it is crucial to consider the relationships among variables when analyzing multivariate time series, and that the relationships among components can characterize the state of the equipment. Equipment in an abnormal state tends to show inconsistent changes across its variables. Therefore, to use the correlation among variables to capture the state of a complex equipment system, this paper computes a correlation map (CMap) for each subsequence using the Pearson correlation coefficient. For a given subsequence, the CMap collects the Pearson correlation coefficient of every pair of its variables.
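As an illustration of the idea, the sketch below computes a Pearson correlation matrix for each sliding window of a multivariate series. The window length, stride, data layout (one list per variable), and the choice to return 0 for a constant series are assumptions, since the original equation and its notation are not available here.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if std_x == 0.0 or std_y == 0.0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / (std_x * std_y)


def correlation_map(subsequence):
    """subsequence: list of variables, each a list of samples in the window."""
    m = len(subsequence)
    return [
        [pearson(subsequence[i], subsequence[j]) for j in range(m)]
        for i in range(m)
    ]


def sliding_cmaps(series, window, stride):
    """One correlation map per window over a multivariate series."""
    length = len(series[0])
    return [
        correlation_map([variable[start:start + window] for variable in series])
        for start in range(0, length - window + 1, stride)
    ]


series = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],     # variable A
    [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],   # variable B rises with A
    [6.0, 5.0, 4.0, 3.0, 2.0, 1.0],     # variable C falls as A rises
]
cmaps = sliding_cmaps(series, window=4, stride=2)
```

Each map is a symmetric matrix with ones on the diagonal, so a change in the off-diagonal entries between consecutive windows is what signals that the variables have stopped moving together.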
To capture the correlation among variables, the previous subsection computes a correlation map for each time subsequence, which ignores the temporal dependence of the time series. Previous studies have shown that this temporal dependence is also very useful for anomaly detection. Therefore, to improve the accuracy of the anomaly detection module, this paper also uses the temporal dependence to reconstruct the original time subsequence.
The recurrent neural network (RNN) has been widely used to process the temporal features of time series, but it suffers from vanishing gradients as the series grows longer. The long short-term memory network (LSTM) and the gated recurrent unit (GRU), both RNN variants, mitigate vanishing gradients by adding gating mechanisms. Compared with the LSTM, the GRU has fewer parameters and is easier to train, so it is better suited to complex real-time scenarios. The GRU has only two gates, the update gate and the reset gate: the reset gate decides how the new input is combined with the previous memory, and the update gate decides how much of the previous memory is kept. In this paper, the GRU is the basic structure of a time series reconstructor, MTS-G, which reconstructs multivariate time series. MTS-G consists of an encoder and a decoder. The encoder extracts the temporal dependence features of the time series, and the decoder decodes the multivariate time series from the features extracted by the encoder.
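To make the gating concrete, here is a dependency-free sketch of a single GRU step and of an encoder loop that folds a window into a hidden state. It uses the common GRU formulation with randomly initialized, untrained weights; the layer sizes, parameter names, and initialization are assumptions, and the decoder and training procedure of MTS-G are not shown.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, U, h, b):
    """Compute W @ x + U @ h + b for list-based matrices and vectors."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]


def init_params(input_size, hidden_size, seed=0):
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

    params = {}
    for gate in ("z", "r", "h"):  # update gate, reset gate, candidate state
        params["W" + gate] = matrix(hidden_size, input_size)
        params["U" + gate] = matrix(hidden_size, hidden_size)
        params["b" + gate] = [0.0] * hidden_size
    return params


def gru_step(params, x, h):
    # Update gate: how much of the old state is replaced by the candidate.
    z = [sigmoid(v) for v in affine(params["Wz"], x, params["Uz"], h, params["bz"])]
    # Reset gate: how much of the old state feeds into the candidate.
    r = [sigmoid(v) for v in affine(params["Wr"], x, params["Ur"], h, params["br"])]
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = [
        math.tanh(v)
        for v in affine(params["Wh"], x, params["Uh"], gated_h, params["bh"])
    ]
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]


def encode(params, sequence, hidden_size):
    """Encoder: fold a window of observations into one hidden state."""
    h = [0.0] * hidden_size
    for x in sequence:
        h = gru_step(params, x, h)
    return h


params = init_params(input_size=3, hidden_size=4)
sequence = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.0], [0.4, 0.4, 0.4]]
summary = encode(params, sequence, hidden_size=4)
```

A practical reconstructor would use a deep learning framework and learn these weights; the sketch is only meant to show where the two gates act in the recurrence.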
As in equation (2), the encoder captures the mapping of time series
The decoder is able to capture the mapping from
A discriminator distinguishes the fake samples produced by the generator from the real input samples. In this paper, a discriminator is added to train the generator to reconstruct samples. The discriminator classifies input samples by extracting both the correlation among variables and the temporal dependence of the time series. Similar to the network structure of the generator, it consists of a CNN and a GRU. The CNN is used to extract the correlation feature
For the training of the anomaly detection model CT-GAN, in this paper, the following loss function is used to train the discriminator and generator as in equation (4):
The loss function consists of two main components, a discriminator loss and a generator loss. When keeping the generator unchanged to train the discriminator, this paper expects the discriminator loss to be maximized, and the discriminator is able to recognize the input normal samples, i.e.,
After iterative training, the generator has the ability to generate normal samples, while the discriminator has the ability to distinguish between normal and abnormal samples at the same time. As in equation (5), the abnormal score consists of the generator score and the discriminator score. The generator score indicates the reconstruction error between the generated sample and the input sample, including the multivariate correlation graph reconstruction error and the time series reconstruction error, and the discriminator score indicates the probability that this sample is an abnormal sample.
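The composition of the score can be sketched as follows. Because the exact form of equation (5) is not recoverable here, the mean squared error as the reconstruction error and the mixing weight `lam` are assumptions chosen for illustration, as are all function and argument names.

```python
def flatten(nested):
    """Flatten arbitrarily nested lists of numbers into one flat list."""
    flat = []
    for item in nested:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten(item))
        else:
            flat.append(float(item))
    return flat


def mse(original, reconstructed):
    """Mean squared reconstruction error between two same-shaped arrays."""
    a, b = flatten(original), flatten(reconstructed)
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def anomaly_score(cmap, cmap_rec, window, window_rec, p_abnormal, lam=0.5):
    """Combine generator and discriminator scores (lam is a hypothetical weight)."""
    # Generator score: correlation-map error plus time-series error.
    generator_score = mse(cmap, cmap_rec) + mse(window, window_rec)
    # Discriminator score: estimated probability that the sample is abnormal.
    return (1.0 - lam) * generator_score + lam * p_abnormal


score = anomaly_score(
    cmap=[[1.0, 0.5], [0.5, 1.0]],
    cmap_rec=[[1.0, 0.3], [0.3, 1.0]],
    window=[[1.0, 2.0], [3.0, 4.0]],
    window_rec=[[1.0, 2.0], [3.0, 5.0]],
    p_abnormal=0.9,
)
```

In this toy case the two reconstruction errors are 0.02 and 0.25, so with `lam=0.5` the score is 0.5 × 0.27 + 0.5 × 0.9 = 0.585; in practice the two parts may need rescaling so that neither dominates.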
The supply fan is taken as the research object of this experiment. The experimental data were collected from a power plant during two periods, January 2021 to February 2021 and December 2022 to July 2023, at a sampling rate of one point every 10 min. Features such as unit load, atmospheric temperature, inlet flow, and outlet pressure are used as inputs to the early warning model, and the important features that need to be monitored are used as its outputs. The first segment contains real abnormal data from the equipment, and the second segment contains data from the equipment in its normal state. All data are first cleaned by deleting points with bad or null values, abnormal jumps, and equipment shutdown states. Then 85% of the samples in the second segment are selected as the training set, 10% as the validation set, and the remaining 5% as a test set to check whether the model raises false alarms on normal data. The first segment is also used as a test set to check whether the model misses alarms for abnormal states. Root mean square error (RMSE), mean absolute error (MAE), and R² are used as metrics to assess the performance of the warning model.
In the experiment, the EPOCH for model training is set to 20, BATCH_SIZE is set to 100, and the prediction step in the LSTM is set to 50; finally, the model structure is trained with Auto-Keras.
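The data split and the three metrics named above can be sketched as follows. The 85/10/5 proportions come from the text; splitting in time order rather than at random is an assumption, and the function names and sample values are illustrative.

```python
import math


def chronological_split(samples, train=0.85, val=0.10):
    """Split in time order into train / validation / test (assumed ordering)."""
    n = len(samples)
    n_train = round(n * train)
    n_val = round(n * val)
    return (
        samples[:n_train],
        samples[n_train:n_train + n_val],
        samples[n_train + n_val:],
    )


def rmse(actual, predicted):
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def mae(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def r2(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


train_set, val_set, test_set = chronological_split(list(range(200)))

actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
metrics = {
    "rmse": rmse(actual, predicted),
    "mae": mae(actual, predicted),
    "r2": r2(actual, predicted),
}
```

For the toy values, the squared errors sum to 0.5 against a total variance sum of 5.0, giving an R² of 0.9, an MAE of 0.25, and an RMSE of about 0.354.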
In the experiment, the blower current, blower bearing temperature, and blower motor coil temperature are selected as the features for early warning monitoring. We observe the curves of the evaluated value output by the early warning model and the actual value under normal operating conditions. Figure 2 compares the evaluated and actual values of the blower current, and Figure 3 compares the evaluated and actual values of the blower bearing temperature. The root mean square error, mean absolute error, and R² of each feature are also calculated. Figures 2 and 3 show that, during the test period, the equipment is under normal operating conditions, the evaluated values output by the early warning model fit the actual values well, and the model predicts the feature values under normal operating conditions well. For temperature, a slowly changing feature, the model's error is small and R² reaches 0.96, indicating good prediction. For current, which changes faster and is prone to fluctuation, the model also predicts reasonably well, with an R² of around 0.77.

Figure 2. Evaluated and actual values of the current model

Figure 3. Comparison of evaluated and actual values of bearing temperature
By calculating the residuals between the predicted and actual values, we determine whether the residuals fall within the threshold band and whether an alarm should be raised, as shown in Figs. 4 and 5. The upper and lower dashed lines in Figs. 4 and 5 are alarm lines, calculated from the errors of the feature parameters on the validation set according to the 3σ criterion. As the figures show, the blower current produces a small number of false alarms because it fluctuates strongly and frequently, with a false alarm rate of 0.75%. Early warning is more concerned with the trend of important equipment features over a period of time, and when the equipment is anomalous the relevant feature values stay outside the normal range for a long time. The alarm trigger can therefore be changed so that an alarm is raised only when the residual exceeds the threshold band a certain number of times within a given period. Appropriately reducing the trigger sensitivity in this way eliminates false alarms caused by fluctuations of a single parameter without reducing the accuracy of the warning.

Figure 4. Upper and lower alarm limits for supply fan current

Figure 5. Upper and lower temperature alarm limits for fan bearing
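The 3σ alarm band and the relaxed trigger discussed with Figs. 4 and 5 can be sketched as below. The band is derived from validation residuals as in the text; the persistence rule's `window` and `min_hits` values are hypothetical, since the source only says "a certain number of times over a period of time".

```python
import statistics


def three_sigma_band(validation_residuals):
    """Alarm lines from validation residuals under the 3-sigma criterion."""
    mu = statistics.fmean(validation_residuals)
    sigma = statistics.pstdev(validation_residuals)
    return mu - 3.0 * sigma, mu + 3.0 * sigma


def persistent_alarm(residuals, band, window=6, min_hits=4):
    """Alarm only if at least min_hits of the last `window` residuals leave the band."""
    lower, upper = band
    outside = [not (lower <= r <= upper) for r in residuals]
    alarms = []
    for i in range(len(residuals)):
        recent = outside[max(0, i - window + 1): i + 1]
        alarms.append(sum(recent) >= min_hits)
    return alarms


band = three_sigma_band([-1.0, 1.0, -1.0, 1.0])  # mean 0, sigma 1

# One isolated spike (index 1) followed by a sustained deviation (indices 3-6).
test_residuals = [0.5, 4.0, 0.2, 3.5, 3.6, 3.8, 4.1, 0.1]
alarms = persistent_alarm(test_residuals, band)
```

With one sample every 10 min, a window of 6 corresponds to one hour. The isolated spike never triggers an alarm, while the sustained deviation does once four excursions have accumulated, which is the behaviour the text aims for; the cost is a delay of a few samples before a real anomaly is reported.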
In power supply service, once the emergency response module receives warning information, the relevant personnel handle the fault by following the fault handling suggestions and emergency response measures contained in that information. This chapter covers the emergency response process mechanism and an actual case analysis.
Early warning information release
When the fault detection and localization module determines that there is a potential fault, the warning release and emergency response module will immediately start the warning information release process. The module sends warning information to relevant personnel through multiple channels to inform them of the specifics of the fault and possible impact. The early warning information will also contain detailed fault handling suggestions and emergency response measures to guide the relevant personnel to respond quickly and handle the fault.
Emergency Response and Disposal
According to the troubleshooting suggestions and emergency response measures in the early warning message, the relevant personnel will quickly take corresponding actions to dispose of the fault. The system provides real-time monitoring and feedback functions to ensure the timeliness and effectiveness of fault disposal. Through the efficient emergency response and disposal process, the system can minimize the impact of faults on the normal operation of the system.
In this paper, a residential-commercial area in a certain region is used as an example, and its power consumption area is reduced to a simplified busbar system, as shown in Fig. 6. The system has 20 load nodes, numbered 1-20, none of which has a generator; all power is drawn from the power source nodes. Load importance is divided into primary, secondary, and tertiary levels, with corresponding weights of 50, 20, and 1.

System emergency flow chart
In this paper, MATLAB 7.1 is used for simulation and calculation, and the main control interface of the system is shown in Fig. 7.

Simplified map of bus system in an area
Weak point ranking
The operating parameters of the system are measured. At a certain point in time, five node voltages are out of range; their node weakness probability, expected voltage drop, and expected node voltage are calculated and sorted from high to low, as shown in Table 1.
High-risk assessment
Assuming that a typhoon is about to enter the region, the node outage probability is calculated by matching the disaster database against the disaster intensity and the equipment resistance. From the load rating and daily load data, the five nodes with the highest outage risk are obtained, as shown in Table 2; nodes 3 and 9 are primary loads with a value coefficient of 60, and nodes 7, 12 and 10 are secondary loads with a value coefficient of 25.
Fault diagnosis
A transformer failure occurs at a certain moment; the gas content in the oil is shown in Table 3, and the fault is judged to be a high-energy discharge.
Rescue optimal decision
F1-F4 denote the fault points; the four faults caused 10 loads in the distribution system to lose power. The importance level of the lost loads is shown in Table 4.
Table 1. Ranking of weak nodes in the system
Rank | Node | Node weakness probability | Expected voltage drop (p.u.) | Expected node voltage (p.u.) |
---|---|---|---|---|
1 | 5 | 0.7798 | 0.2914 | 0.3922 |
2 | 11 | 0.6460 | 0.1811 | 0.3698 |
3 | 8 | 0.3221 | 0.1799 | 0.3587 |
4 | 9 | 0.0450 | 0.1569 | 0.3346 |
5 | 6 | 0.0123 | 0.1345 | 0.3248 |
Table 2. Power outage risk ranking of system nodes
Rank | Node | Load capacity (kW) | Outage probability | Risk (10,000 yuan) |
---|---|---|---|---|
1 | 3 | 10890 | 0.0915 | 49520 |
2 | 9 | 3850 | 0.2603 | 19560 |
3 | 7 | 790 | 0.5545 | 8660 |
4 | 12 | 2225 | 0.1958 | 15748 |
5 | 10 | 5675 | 0.1036 | 26479 |
Table 3. Gas content in transformer oil
Gas | Content (μL/L) |
---|---|
H2 | 31.2 |
CH4 | 6.5 |
C2H6 | 19.1 |
C2H4 | 4.6 |
C2H2 | 67.2 |
Table 4. Importance class of loads that lost power
Load node class | Load nodes that lost power |
---|---|
1 | 3, 4, 11, 16, 17, 2 |
2 | 1, 5, 6, 7, 8, 10, 12, 13, 15 |
3 | 14, 20, 19, 18, 9 |
The order of repair under the principle of shortest recovery time: F1→F2→F3→F4. The order of repair under the principle of lowest economic loss: F1→F3→F2→F4.
The main research content of this paper is as follows:
To address the problem that the equipment failure state cannot be visualized, digital twin technology is used to model the physical equipment and the scene in which it is located, and scripts and equipment data drive the virtual equipment to simulate the real-time state of the equipment in virtual space.
To address the difficulty of collecting real-time data from devices, the MQTT protocol is used to collect data from early warning devices in power supply service, and a data collection system architecture based on the MQTT protocol is designed to establish the data collection workflow. After the data are collected, the real-time data are passed to the virtual device through Redis and stored in the MySQL database.
To address anomalies in the historical equipment data, an anomalous data detection method based on the OneClassSVM algorithm is proposed to check the data.
To address unstable and imprecise equipment fault warning, the data are first integrated and extracted; the potential anomalies in the multivariate time series generated during equipment operation are then detected and accumulated, the anomaly accumulation pattern of faulty equipment is mined, and a reliable fault warning is finally issued according to the accumulated results. By improving the system fault database, a discriminator that can effectively identify the abnormal state characteristics of the equipment is designed and applied to the abnormality detection and warning model.