H7N9 avian influenza is a disease caused by the novel H7N9 recombinant avian influenza virus, which is mainly characterised by pulmonary inflammation and fever (7). In early 2013, human infection with H7N9 avian influenza virus was discovered for the first time in China. Subsequently, birds infected with the virus were found in live poultry markets (25). At the beginning of 2017, a highly pathogenic variant of it was found in poultry flocks in some Chinese provinces, which then caused a number of outbreaks (32). In order to control H7N9 avian influenza in commercial flocks, China imposed treatment with a bivalent inactivated vaccine of recombinant avian influenza virus (H5+H7) in September 2017 (24). Both low-pathogenic and high-pathogenic strains of H7N9 avian influenza virus coexist and are still evolving at present (5). Therefore, the risk of its spread among various animals is not countered fully by vaccination and should not be ignored.
It is very important to construct predictive models for the early detection of H7N9 outbreaks (22), and many studies have established them. For example, Gilbert
Continuous monitoring of the distribution and changes in the positivity rates of H7N9 avian influenza virus has been significant for eliminating infection sources (3). Positive virus detection indicates viral infection in poultry, and the higher the positivity rate (number of positive results/number of samples), the greater the avian influenza virus infection (26). Infected poultry have been the main source of avian influenza virus. After infection, the poultry shed the virus through saliva, nasal secretions and faeces, causing the spread of the epidemic (14). Also, it is possible that contacts of domestic poultry with migratory birds, which are natural reservoirs for avian influenza viruses, are a potential source of infection (12). Relevant studies have shown that the rate of avian influenza virus detection correlated with outbreaks to some extent,
China has been a major exporter of poultry products, the trade in which may spread domestic avian influenza virus to importing countries (34). Furthermore, there is a wide range and distribution of bird migrations within and in proximity to China, which possibly causes the spread of domestic avian influenza to neighbouring countries (6). The outbreak risk of H7N9 avian influenza related to China has obliged the Chinese government to employ effective preparedness measures, including early detection and timely sharing of surveillance data (22). The Ministry of Agriculture and Rural Affairs of the People’s Republic of China has organised provincial veterinary departments to conduct special surveillance efforts against H7N9 avian influenza. Since April 2013, fixed virus monitoring points have been established on each province’s poultry farms and commercial livestock farms and in pig slaughterhouses and wild bird habitats. The collected samples are sent to the National Avian Influenza Reference Laboratory for testing. When a virus detection in animal or environmental samples occurs, a retrospective investigation is carried out in accordance with guidelines issued by the Ministry of Agriculture and Rural Affairs to identify the source of the virus and possible contaminated sites and implement culling or treatment of pathogen-positive poultry flocks (18). As the official monitoring data are relatively authoritative, some studies have tried to use the annual avian influenza monitoring data of the United States (28) or Southeast Asia (17) to predict the prevalence of the virus. This study regards China and also employs national monitoring data for the virus positivity rates to analyse the distribution characteristics of positive samples and construct a predictive model conforming to the trend of changes in the positive rate. We believe that this study has novelty and important practical significance for improving the prediction of major animal epidemics and curbing the spread of the virus. It could also provide an important insight into how the global known and future zoonotic influenza threats may be reduced.
Monitoring efforts for H7N9 avian influenza by the Ministry of Agriculture and Rural Affairs of the People’s Republic of China involve collecting chicken, duck, goose, wild bird and pig samples from certain sites for serological and pathogenic analysis. It should be noted that since September 2017, China has implemented comprehensive H7N9 immunisation for all domestic poultry operations, and the original positivity rate of serological samples has been replaced by a qualified rate. Consequently, the statistical calibre of pathogenic monitoring data has not changed, so the data could be used as an observation indicator of positivity rates. The rates in samples from surveillance for this pathogen from April 2013 to April 2020 were collected from information published by the Ministry of Agriculture and Rural Affairs (see Supplementary Materials for details of raw data). In addition, the affected species, number of H7N9 avian influenza outbreaks, number of cases, number of fatalities and number of culled animals reported by the provinces were obtained from the animal epidemic monthly report in the ministry’s Veterinary Bulletin (19). In order to have a clear understanding of the distribution pattern of non-human H7N9 virus detections in China and to conduct spatial autocorrelation analysis, the monthly positive rates greater than 0 were all set to 1, indicating that H7N9 virus had been detected. A value of 0 means that no virus detections were reported.
Hierarchical Structure of WPD
Least squares support-vector machines (LS-SVM) is a simplified and improved version of the standard SVM model. Its greatest advantage is dealing with non-stationary series and overcoming the problem of overestimated prediction results from traditional prediction methods (27). In this paper, Matlab 2018a software (MathWorks, Natick, MA, USA) and LS-SVM lab Toolbox version 1.8 (http://www.esat.kuleuven.be/sista/lssvmlab) were used to model the positive sequence data of avian influenza virus. The algorithm principle is as follows:
Suppose there are
In the formula,
The linear regression function of the sample set is
In the formula,
Based on the principle of structural risk minimisation, the optimisation problem is defined as
In the formula,
According to the Karush–Kuhn–Tucker conditions and Mercer’s condition, the optimisation problem of LS-SVM is transformed into the solution of the linear equation, and the nonlinear regression equation of the LS-SVM model at the
In the formula,
In the formula, σ is the hyperparameter of the radial basis function kernel.
In the formulas,
In the formula,
Situation of H7N9 avian influenza epidemic from April 2013 to April 2020
Time | Province | Number of outbreaks | Poultry | Number of cases | Number of fatalities | Number of animals culled |
---|---|---|---|---|---|---|
June 2017 | Inner Mongolia | 2 | Chicken | 63,406 | 37,582 | 424,197 |
June 2017 | Heilongjiang | 1 | Chicken | 20,150 | 19,500 | 16,610 |
August 2017 | Anhui | 1 | Chicken | 1,368 | 910 | 74,463 |
March 2018 | Shaanxi | 1 | Chicken | 1,000 | 810 | 1,000 |
April 2018 | Shanxi | 1 | Chicken | 812 | 699 | 6,374 |
April 2018 | Ningxia | 1 | Chicken | 1,200 | 585 | 13,578 |
May 2018 | Liaoning | 1 | Chicken | 11,000 | 9,000 | 8,000 |
May 2018 | Ningxia | 1 | Chicken | 3,000 | 2,210 | 86,000 |
March 2019 | Liaoning | 1 | Peacock | 9 | 9 | 191 |
Data source:
Spatial autocorrelation analysis of H7N9 avian influenza virus detections in China
Variable | Moran’s |
Z score | P value |
---|---|---|---|
From 2013 to 2020 | −0.014 | 0.387 | 0.350 |
In 2013 | −0.019 | 0.259 | 0.398 |
In 2014 | −0.046 | −0.362 | 0.359 |
In 2015 | 0.042 | 1.783 | 0.037 |
In 2016 | −0.064 | −0.992 | 0.161 |
In 2017 | −0.034 | −0.075 | 0.470 |
In 2018 | −0.007 | 0.607 | 0.272 |
In 2019 | −0.021 | 0.239 | 0.406 |
In 2020 | — | — | — |
In January | −0.008 | 0.507 | 0.306 |
In February | −0.008 | 0.526 | 0.299 |
In March | −0.006 | 0.563 | 0.287 |
In April | −0.004 | 0.586 | 0.279 |
In May | −0.040 | −0.210 | 0.417 |
In June | −0.045 | −0.350 | 0.363 |
In July | 0.009 | 1.103 | 0.135 |
In August | — | — | — |
In September | −0.073 | −0.996 | 0.160 |
In October | −0.036 | −0.164 | 0.435 |
In November | −0.046 | −0.366 | 0.357 |
In December | 0.072 | 2.320 | 0.010 |
Before comprehensive animal H7N9 immunisation | −0.020 | 0.255 | 0.399 |
After comprehensive animal H7N9 immunisation | −0.028 | 0.058 | 0.477 |
No H7N9 avian influenza virus detections occurred in China in 2020 or in August of each year, so there are no spatial autocorrelation analysis results presented for them
Regional distribution of H7N9 avian influenza virus detections in China
The annual regional distributions of H7N9 avian influenza virus detections in different years are shown in Fig. 3. In 2013, the virus was mainly detected in southeast China, and by 2016 the positive rate had reduced. Due to the large number of live poultry markets and frequent migration of wild birds, the time span of H7N9 virus detections in animals in Guangdong province exceeded six months in each year from 2013 to 2016. In 2017, owing to the coexistence of low pathogenicity and high pathogenicity H7N9 influenza virus strains in various animals, the area of virus detection expanded, and the number of areas with virus detections recorded for more than six months increased to an unprecedented level. In 2018, only three southern provinces were found to be positive for the virus, while in 2019, the detection areas were mainly in Inner Mongolia and Liaoning provinces, and after 2020, no virus was detected.
Regional distribution of H7N9 avian influenza virus in China from 2013 to 2020
Fig. 4 shows that March, April and May were the months in which samples often tested positive in many areas, while in August, September and October, the virus could be detected in fewer areas. This pattern indicates that the H7N9 virus was most prevalent in late spring and early summer. Although the temperature increased in this period, there was still a high risk of H7N9 outbreak in animals.
Regional distribution of cumulative H7N9 avian influenza virus detections in China from January to December between 2013 and 2020
As can be seen in Fig. 5, before the national animal comprehensive H7N9 immunisation programme began in September 2017, most regions were positive for H7N9 virus, especially Guangdong, Fujian and Zhejiang. The cumulative positive rate of H7N9 avian influenza virus was more than 10 months in these provinces. After immunisation, the poultry produced antibodies, which reduced the positivity rate. Other than Fujian province, there were no areas where H7N9 avian influenza virus had been detected for more than four months. This indicates that the comprehensive H7N9 immunisation effort had a significant effect on reducing the positivity rate of H7N9 avian influenza.
Regional distribution of H7N9 avian influenza virus detections before and after comprehensive H7N9 immunisation of at-risk animals in China
The MI values of H7N9 avian influenza virus detections are shown in Table 2. There was no significant spatial autocorrelation among such detections in 31 provinces from 2013 to 2020. Moreover, the positive spatial correlation of avian influenza virus detections before and after the comprehensive H7N9 immunisation programme was not significant. Among different years, only virus detections in 2015 had significant spatial autocorrelation (P < 0.05); and among different months, only December had (P < 0.05).
Temporal distribution of positive rate of animal H7N9 avian influenza virus detections in China from 2013 to 2020
The augmented Dickey–Fuller (ADF) test results showed that the time series of the positive rate of H7N9 avian influenza virus detections failed the significance test of the coefficient at the significance levels of 1%, 5%, and 10%. Since this indicated that the time series was not smooth, an LS-SVM-ARIMA combined model based on WPD was then constructed. The specific modelling steps are shown in Fig. 7.
Steps for constructing the LS-SVM-ARIMA combined model based on WPD
LS-SVM – least squares support-vector machines; ARIMA – autoregressive integrated moving average
Given the limited data, our study used the monthly data from April 2013 to April 2019 to build the combined model, and the data from May 2019 to April 2020 as the prediction data. The monthly data from April 2013 to April 2019 were decomposed by a three-layer wavelet packet by Db4 wavelet, and the positive rate data of eight decomposition spaces were obtained. Among them, the four low-frequency trend sequences (
Low-frequency trend sequence decomposition of positive rates of H7N9 avian influenza virus from April 2013 to April 2019
High-frequency trend sequence decomposition of positive rates of H7N9 avian influenza virus detections from April 2013 to April 2019
Optimisation results of low-frequency trend sequence parameters
Decomposition sequence | Hyperparameters σ | Normalisation parameters |
---|---|---|
[3,0] | 9506 | 7350.8 |
[3,1] | 2514.5 | 3946.2 |
[3,2] | 9113.6 | 9740.8 |
[3,3] | 818.709 | 7242.1 |
ARIMA model parameters
Decomposition sequence | ARIMA model parameters |
---|---|
[3,4] | ARIMA (1,0,0)*(2,0,0) [12] |
[3,5] | ARIMA (3,0,0) |
[3,6] | ARIMA (0,0,0) |
[3,7] | ARIMA (3,0,0) |
ARIMA – autoregressive integrated moving average; ARIMA (1,0,0)*(2,0,0) [12] is the multiplicative seasonal model of ARIMA
In the formula,
According to the formula above, the predicted values of each decomposed sequence were accumulated. As shown in Table 5, the prediction results of the LS-SVM-ARIMA combined model based on WPD were obtained.
Prediction results of the LS-SVM-ARIMA combined model based on WPD
Date | Real value | [3,0] | [3,1] | [3,2] | [3,3] | [3,4] | [3,5] | [3,6] | [3,7] | Predictive value |
---|---|---|---|---|---|---|---|---|---|---|
05/2019 | 0 | 0 | 0 | 0.001 | 0 | 0 | 0.001 | 0 | 0 | 0.002 |
06/2019 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0.001 | 0.001 |
07/2019 | 0 | 0 | 0.001 | 0 | 0 | 0 | 0 | 0 | 0 | 0.001 |
08/2019 | 0 | 0.001 | 0 | 0 | 0.001 | 0.001 | 0 | 0 | 0 | 0.003 |
09/2019 | 0 | 0 | 0 | 0 | 0.001 | 0 | 0.001 | 0 | 0 | 0.002 |
10/2019 | 0.001 | 0.001 | 0.002 | 0.001 | 0.001 | 0.001 | 0.002 | 0 | 0.001 | 0.010 |
11/2019 | 0 | 0.001 | 0.001 | 0 | 0 | 0.001 | 0 | 0 | 0 | 0.003 |
12/2019 | 0 | 0.001 | 0 | 0.001 | 0.001 | 0 | 0.001 | 0 | 0.002 | 0.006 |
01/2020 | 0 | 0 | 0.001 | 0 | 0.001 | 0.001 | 0.001 | 0 | 0 | 0.004 |
02/2020 | 0 | 0.001 | 0 | 0 | 0 | 0.002 | 0 | 0 | 0 | 0.003 |
03/2020 | 0 | 0 | 0.001 | 0 | 0 | 0 | 0 | 0 | 0 | 0.001 |
04/2020 | 0 | 0 | 0.001 | 0 | 0 | 0.001 | 0 | 0 | 0 | 0.002 |
LS-SVM-ARIMA – least square support-vector machines–autoregressive integrated moving average, WPD – wavelet packet decomposition
Prediction results based on the three models: ARIMA, LS-SVM and ARIMA-LS-SVM
Date | Real value | ARIMA model | LS-SVM model | LS-SVM-ARIMA model |
---|---|---|---|---|
05/2019 | 0 | 0.014 | 0.013 | 0.009 |
06/2019 | 0 | 0.007 | 0.006 | 0.011 |
07/2019 | 0 | 0.004 | 0.005 | 0.016 |
08/2019 | 0 | 0.011 | 0.012 | 0.012 |
09/2019 | 0 | 0.009 | 0.008 | 0.008 |
10/2019 | 0.008 | 0.014 | 0.013 | 0.020 |
11/2019 | 0 | 0.017 | 0.010 | 0.008 |
12/2019 | 0 | 0.020 | 0.015 | 0.016 |
01/2020 | 0 | 0.005 | 0.006 | 0.007 |
02/2020 | 0 | 0.010 | 0.007 | 0.003 |
03/2020 | 0 | 0.011 | 0.005 | 0.005 |
04/2020 | 0 | 0.018 | 0.012 | 0.010 |
ARIMA – autoregressive integrated moving average; LS-SVM – least square support-vector machines
In order to verify the predictive effectiveness of the combined model, this study adopted the same data to predict the ARIMA model, the LS-SVM model and the LS-SVM-ARIMA model. The LS-SVM model and the LS-SVM ARIMA model were optimised by the PSO algorithm. Table 6 and Fig.10 show that the LS-SVM-ARIMA combined model based on WPD had the highest degree of data fitting.
Comparisons of predicted values from each model with the true values
LS-SVM – least square support-vector machines; WPD – wavelet packet decomposition; ARIMA – autoregressive integrated moving average
In this section, the mean absolute error (MAE) and root mean square error (RMSE) of the true and predicted values were employed as evaluation indices of the prediction effectiveness of each model. The MAE and RMSE formulas are as follows:
In both formulas,
The values in Table 7 show that the MAE and RMSE of predictions from the LS-SVM-ARIMA combined model based on WPD were the smallest, which indicated that neither the single ARIMA model nor the LS-SVM model was likely to capture the linear and nonlinear characteristics in the data of avian influenza virus detections. Although the LS-SVM-ARIMA model had the advantages of both the ARIMA model and the LS-SVM model, the LS-SVM-ARIMA combined model based on WPD could more comprehensively extract the detailed information contained in the time series, so it better reflected the law of the positive rate change with time, and predicted the positive rate more accurately.
Evaluation of the prediction results of the different models
Evaluation index | LS-SVM-ARIMA combined model based on WPD | ARIMA model | LS-SVM model | LS-SVM-model ARIMA |
---|---|---|---|---|
MAE | 0.024 | 0.119 | 0.098 | 0.077 |
RMSE | 0.020 | 0.067 | 0.059 | 0.041 |
LS-SVM – least square support-vector machines; ARIMA – autoregressive integrated moving average; WPD – wavelet packet decomposition
The LS-SVM-ARIMA combined model based on WPD was a beneficial enhancement for the more accurate prediction of the positivity rate for H7N9 avian influenza virus. However, the prediction results of LS-SVM-ARIMA based on WPD model were not fully consistent with the real values. This difference arises because although the monitoring data of positivity rates are applicable to predicting the presence of avian influenza virus, they are limited by the number of monitoring sites and the effectiveness of sample collection nationwide. In addition, the virus-positive rate is cross-influenced by many factors, so it is difficult for a single data source to reflect the full complexity of the epidemic facts. Astill
According to the monitoring data from April 2013 to April 2020, there was a greater risk of epidemics in late spring and early summer. After livestock at risk was comprehensively immunised against H7N9, the positive rates of infection with the virus were significantly reduced. Except for 2015 and December of each year, the positive samples did not have spatial clustering and were randomly distributed. Therefore, it is recommended that live poultry markets be closed in late spring and early summer, comprehensive animal H7N9 immunisation be continued, and monitoring programs strengthened. Although the tested positivity rate of the virus during 2020 was lower than in recent years, H7N9 avian influenza may still threaten the poultry industry in China. Even with the measures of market closure, immunisation, and vigilant monitoring, H7N9 will not cease to be a latent threat and prediction capability is highly desirable for preparedness for this threat. Therefore, our epidemiological survey proposed a combined LS-SVM-ARIMA model based on WPD to improve predictive accuracy. The MAE of 2.4% and RMSE of 2.0% prove the validity of this new prediction method.