The impact of selected risk factors on the occurrence of highly pathogenic avian influenza in commercial poultry flocks in Poland
Published Online: Jan 29, 2021
Page range: 45 - 52
Received: Aug 05, 2020
Accepted: Feb 03, 2021
DOI: https://doi.org/10.2478/jvetres-2021-0013
© 2021 A. Gierak, K. Śmietanka, published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.
Highly pathogenic avian influenza (HPAI) is an infectious viral disease affecting wild and domestic bird species worldwide (4). The HPAI viruses (HPAIV) emerge from low-pathogenic precursor viruses (low-pathogenic avian influenza viruses, LPAIV) upon transmission from wild aquatic birds and circulation in poultry (3, 4, 5). HPAI occurrence in a country brings serious economic consequences and invokes temporary suspension of poultry trade (8). Therefore, confirmation of the disease is subject to notification and immediately triggers countermeasures to prevent its further spread (7).
In 1996, the Guangdong lineage of H5 HPAIV (H5 Gs/Gd) emerged in China and subsequently evolved into multiple genetic clades and genotypes as a result of genetic drift and reassortment (15, 23). Since 2008, clade 2.3.4 of the H5 Gs/Gd HPAI viruses has undergone frequent reassortment with LPAIV of wild-bird origin. The resulting novel viruses, collectively known as “H5Nx” (H5N2, H5N5, H5N6 and H5N8, as differentiated by neuraminidase subtype), show increased adaptation to wild aquatic birds and have spread to other continents, including Europe, Africa and North America (15). In recent years, Europe has experienced repeated outbreaks of HPAIV H5Nx of the Gs/Gd clade 2.3.4.4 lineage as a result of new introductions of the virus from southeast Asia or Africa, re-emergence of the reassorted virus from the previous epidemic, or continued endemic circulation of HPAIV H5 (1, 19, 25, 27). In Poland, epidemics of HPAI in poultry have occurred three times: in 2007 (caused by HPAIV H5N1 clade 2.2), in 2016–2017 (caused by HPAIV H5N8 and H5N5 clade 2.3.4.4) and in 2019–2020 (caused by HPAIV H5N8 clade 2.3.4.4) (24, 25, 26).
In the face of the changing epidemiological characteristics of HPAI and the increasingly substantial role of wild birds in the spread of HPAIV into new areas (5), there is a need for continuous improvement of emergency preparedness. According to European Union (EU) legislation, Member States must identify risk areas on their territory where multiple factors facilitate the introduction of HPAIV into poultry holdings (6). Relevant risk factors for the introduction of HPAIV into poultry flocks include proximity to wetlands such as swamps and to bodies of water such as ponds, lakes, rivers or the sea, where migratory birds, in particular waterfowl and shorebirds, may gather at stop-over sites; the location of poultry holdings in areas through which migratory birds travel or in which they rest during their movements along the north-eastern and eastern migratory routes into the EU; and the keeping of poultry in free-range systems where contact between wild birds and poultry cannot be prevented. In turn, the risk factors for the spread of HPAIV between holdings include the location of holdings in areas with a high density of poultry farming, particularly operations with outdoor access (ducks, geese or free-range layers); frequent movements of vehicles transporting poultry and of persons within and from holdings; and other frequent direct and indirect contacts between holdings (6).
The aim of the study was to evaluate the impact of the assumed risk factors, for which quantitative data are available, on HPAI occurrence in commercial Polish flocks during the epidemics which have occurred in recent years. The findings may lead to the improvement of control strategies by fine-tuning risk-based surveillance and advocating for reinforced biosecurity on farms at higher risk of infection.
Disease-related data,
Fig. 1
Locations of affected and unaffected commercial farms. The map was produced using the

The spatial distribution of each risk factor in Poland was expressed as a separate raster layer with a raster cell of 250 × 250 m, using the raster package (raster: Geographic Data Analysis and Modeling, R package version 3.4-5) in R software (version 3.6.1) (21). The input data used to develop the layers were the densities of the four poultry species in each commune, the distance from water bodies and the distance from sites of high wild bird concentration all over the country. Next, the values of each risk factor at the locations of both affected and unaffected farms were extracted. These values formed the basis of the analysis. The distributions of the values and the raster layers depicting the spatial distributions of the risk factors in Poland are presented in Table 1.
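Extracting a raster value at a farm location amounts to mapping the farm's coordinates to a row and a column of the grid. The following is a minimal Python sketch of that lookup only, not the R workflow used in the study. It assumes a projected coordinate system in metres and a north-up grid whose origin is the top-left corner; the function names, the toy raster and the farm coordinates are hypothetical.

```python
CELL_SIZE_M = 250.0  # raster cell edge length stated in the text


def cell_index(x, y, x_min, y_max, cell_size=CELL_SIZE_M):
    """Return (row, col) of the grid cell containing the point (x, y).

    Assumes a north-up grid whose top-left corner is (x_min, y_max).
    """
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    return row, col


def extract_value(raster, x, y, x_min, y_max, cell_size=CELL_SIZE_M):
    """Look up the raster value at a point; return None outside the grid."""
    row, col = cell_index(x, y, x_min, y_max, cell_size)
    if 0 <= row < len(raster) and 0 <= col < len(raster[0]):
        return raster[row][col]
    return None


# Hypothetical 2 x 3 raster of one risk factor (values are made up).
toy_raster = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]

# Hypothetical farm located 300 m east and 260 m south of the top-left corner.
value = extract_value(toy_raster, x=300.0, y=740.0, x_min=0.0, y_max=1000.0)
print(value)
```

Repeating the lookup for each of the six layers would give one row of predictor values per farm.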
The list of preliminary selected risk factors with their spatial distribution in Poland and the distribution of values in experimental (from affected farms) and control (from unaffected farms) groups. The maps and graphs were produced using
The entire dataset was first split into a training set (70% of the observations) and a testing set (30% of the observations). To predict the probability of HPAI occurrence based on the values of the predictors (risk factors), three approaches were applied to the training dataset.
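A 70/30 split of this kind can be sketched in Python as follows. The text does not say whether the split was stratified by farm status or how it was randomised, so the simple random shuffle, the seed, the function name and the 100 placeholder farm identifiers are all assumptions.

```python
import random


def train_test_split(observations, train_fraction=0.7, seed=42):
    """Shuffle the observations and split them into training and testing sets.

    Simple random split (not stratified); the seed only makes the
    illustration reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(observations)
    rng.shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical farm identifiers; the real dataset is not reproduced here.
farms = list(range(100))
train, test = train_test_split(farms)
print(len(train), len(test))
```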
Firstly, the logistic regression model of the form
was used, where
Secondly, the classification tree model was applied to the dataset of the form
Finally, the random forest model was used. Different decision trees were formed
The importance of predictors was estimated using the mean decrease of Gini index in the decision tree and the random forest model using the
Based on the six raster layers representing the risk factors, the probability of HPAI occurrence was calculated in each raster cell with a size of 250 × 250 m. For each method, the risk map depicting the spatial distribution of the probability within the country was created using the
The predictive accuracy of the models was evaluated using the testing dataset. The actual status of each farm (affected or unaffected) was compared with its predicted status at different cut-off values of probability. The performance of each model was described by its accuracy, defined as the number of correctly classified observations divided by the total number of observations. Based on the misclassification matrix, the optimal cut-off value of probability was determined for each model.
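The evaluation described above can be illustrated with a short Python sketch that builds the misclassification counts at a given cut-off and derives accuracy, sensitivity and specificity from them. The farm statuses and probabilities below are made up, and classifying a farm as affected when its probability is greater than or equal to the cut-off is an assumption, since the text does not state how ties were handled.

```python
def confusion_counts(actual, predicted_prob, cutoff):
    """Count true/false positives and negatives at a probability cut-off."""
    tp = fp = tn = fn = 0
    for status, prob in zip(actual, predicted_prob):
        flagged = prob >= cutoff  # assumption: ">=" rather than ">"
        if flagged and status:
            tp += 1
        elif flagged and not status:
            fp += 1
        elif not flagged and status:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(actual, predicted_prob, cutoff):
    """Accuracy, sensitivity and specificity at a probability cut-off."""
    tp, fp, tn, fn = confusion_counts(actual, predicted_prob, cutoff)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical testing data: 1 = affected farm, 0 = unaffected farm.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
probs = [0.9, 0.6, 0.2, 0.7, 0.3, 0.1, 0.1, 0.05]

for cutoff in (0.1, 0.25, 0.5):
    print(cutoff, metrics(actual, probs, cutoff))
```

Lowering the cut-off raises sensitivity at the cost of specificity, which is why a cut-off has to be chosen for each model separately.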
The results of the logistic regression model indicate that the relevant variables (risk factors) are the log-transformed densities of turkeys, geese, ducks and chickens. The densities of these species, except chickens, are positively associated with the probability of HPAI occurrence. The detailed statistics for these variables are presented in Table 2.
Results of logistic regression model
Description of the variable | Estimate | 95% confidence interval | P value |
---|---|---|---|
log-transformed density of turkeys | 0.973 | [0.634, 1.355] | 9.94e−08 |
log-transformed density of geese | 0.409 | [0.113, 0.709] | 6.92e−03 |
log-transformed density of ducks | 0.447 | [0.108, 0.791] | 9.8e−03 |
log-transformed density of chickens | −0.193 | [−0.375, −0.018] | 3.31e−02 |
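The form of the final regression model is:
$$\ln\frac{p}{1-p} = \beta_0 + 0.973\,x_{\mathrm{turkeys}} + 0.409\,x_{\mathrm{geese}} + 0.447\,x_{\mathrm{ducks}} - 0.193\,x_{\mathrm{chickens}},$$
where $p$ is the probability of HPAI occurrence, $\beta_0$ is the intercept (not listed in Table 2) and each $x$ is the log-transformed density of the named species.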
A logarithmic transformation was applied to all predictors. The maximum variance inflation factor (VIF) was lower than 1.87, which indicates that the predictors were not highly correlated with one another. The assumption of a linear relationship between the log odds of the target variable and the predictors was not fully met, because the distributions of the predictors were concentrated in one point.
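The coefficients in Table 2 are on the log-odds scale, so exponentiating them gives odds ratios, which are often easier to read. The Python sketch below does this for the four reported estimates. Each ratio refers to a one-unit increase in the log-transformed density; the base of the logarithm is not stated in the text, so the size of that unit on the original density scale is left open.

```python
import math

# Estimates copied from Table 2 (log-odds scale).
coefficients = {
    "turkeys": 0.973,
    "geese": 0.409,
    "ducks": 0.447,
    "chickens": -0.193,
}

# Odds ratio per one-unit increase in the log-transformed density.
odds_ratios = {species: math.exp(beta) for species, beta in coefficients.items()}

for species, ratio in odds_ratios.items():
    direction = "higher" if ratio > 1 else "lower"
    print(f"{species}: odds ratio {ratio:.3f} ({direction} odds of HPAI occurrence)")
```

An odds ratio above 1 corresponds to a positive association and a ratio below 1 to a negative one, which matches the signs reported for chickens versus the other three species.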
Based on the results of the classification tree model, the areas with both goose density greater than or equal to 42 and turkey density greater than or equal to 95 are at the highest risk. These areas contained 8% of the observations, and HPAIV was confirmed in 91% of them. The areas with goose density lower than 42 are at low risk. They contained 79% of the observations, and HPAIV was confirmed in 7% of them. The detailed predictions of the values of the target variable using the risk factors are presented in Fig. 2.
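The two splits named in the text can be written as a simple rule. The Python sketch below encodes only those two thresholds; the remaining branches of the tree are shown in Fig. 2 and are not described in the text, so they are returned as "unspecified" rather than guessed. The function name and labels are hypothetical.

```python
GOOSE_THRESHOLD = 42   # goose density threshold quoted in the text
TURKEY_THRESHOLD = 95  # turkey density threshold quoted in the text


def risk_class(goose_density, turkey_density):
    """Classify an area using only the two splits described in the text."""
    if goose_density < GOOSE_THRESHOLD:
        return "low"
    if turkey_density >= TURKEY_THRESHOLD:
        return "highest"
    # Goose density >= 42 with turkey density < 95: not detailed in the text.
    return "unspecified"


print(risk_class(goose_density=10, turkey_density=200))
print(risk_class(goose_density=60, turkey_density=120))
print(risk_class(goose_density=60, turkey_density=20))
```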
Fig. 2
Results of decision tree model

In the decision tree model, the importance values of the risk factors were 25.54 for turkey density, 21.93 for goose density, 19.08 for duck density, 12.19 for proximity to areas with high concentrations of wild birds, 3.73 for proximity to water bodies and 0 for chicken density. The predictions of this model were therefore based mainly on turkey density and, to a lesser extent, on goose and duck densities, proximity to areas with high concentrations of wild birds and proximity to water bodies. The predicted probabilities were not affected by the density of chickens. In the random forest model, the importance values were 18.53 for turkey density, 17.79 for goose density, 15.01 for proximity to areas with high concentrations of wild birds, 10.4 for duck density, 9.79 for proximity to water bodies and 5.93 for chicken density. The predictions of this model were also based mainly on turkey density and, to a lesser extent, on goose density, proximity to areas with high concentrations of wild birds, duck density and proximity to water bodies. Chicken density contributed least to the predictions. The scaled importance values of the risk factors designated by the decision tree and random forest models are presented in Fig. 3.
Fig. 3
The importance of the considered risk factors designated by the decision tree (red dots) and random forest (blue dots) models. WB – wild birds. The importance values were scaled to the range 0–100 (values of 100 and 0 indicate the most important and the least important risk factor in the model, respectively)
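The scaling described in the caption is consistent with min-max scaling of the raw importance values within each model. The Python sketch below applies that scaling to the values quoted in the text; the exact scaling used to draw Fig. 3 is not stated, so min-max scaling is an assumption, as are the dictionary keys and the function name.

```python
# Raw importance values as quoted in the text.
decision_tree = {
    "turkey density": 25.54,
    "goose density": 21.93,
    "duck density": 19.08,
    "proximity to WB sites": 12.19,
    "proximity to water bodies": 3.73,
    "chicken density": 0.0,
}
random_forest = {
    "turkey density": 18.53,
    "goose density": 17.79,
    "proximity to WB sites": 15.01,
    "duck density": 10.4,
    "proximity to water bodies": 9.79,
    "chicken density": 5.93,
}


def scale_0_100(importance):
    """Min-max scale importance values so the smallest is 0 and the largest 100."""
    lowest = min(importance.values())
    highest = max(importance.values())
    return {
        name: 100 * ((value - lowest) / (highest - lowest))
        for name, value in importance.items()
    }


scaled_tree = scale_0_100(decision_tree)
scaled_forest = scale_0_100(random_forest)

for name in decision_tree:
    print(f"{name}: tree={scaled_tree[name]:.1f}, forest={scaled_forest[name]:.1f}")
```

Because each model is scaled separately, the scaled values show the ranking within a model and are not directly comparable between the two models.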

Fig. 4 presents the impact of changes in the value of each selected variable on the predicted probability when the other variables are fixed at their median values.
Fig. 4
Impact of changes in the selected variables on the predicted value of probability of HPAI occurrence using different models: logistic regression (black line), decision tree (green line), and random forest (red line). WB – wild birds
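Curves of this kind can be produced by varying one predictor over a grid while holding the others at their medians and recording the predicted probability. The Python sketch below does this for a logistic model that uses the Table 2 coefficients. The intercept and the median values are hypothetical placeholders, since neither is reported in the text, so the resulting probabilities show only the shape of the curves and not the study's results.

```python
import math

# Coefficients from Table 2 (log-transformed densities).
COEF = {"turkeys": 0.973, "geese": 0.409, "ducks": 0.447, "chickens": -0.193}
INTERCEPT = -3.0  # hypothetical: the intercept is not reported in the text
MEDIANS = {"turkeys": 1.0, "geese": 0.5, "ducks": 0.5, "chickens": 4.0}  # hypothetical


def predict(x):
    """Predicted probability from the logistic model for one set of predictors."""
    z = INTERCEPT + sum(COEF[name] * x[name] for name in COEF)
    return 1.0 / (1.0 + math.exp(-z))


def dependence_curve(predict_fn, medians, variable, grid):
    """Vary one predictor over a grid, holding the others at their medians."""
    curve = []
    for value in grid:
        point = dict(medians)
        point[variable] = value
        curve.append((value, predict_fn(point)))
    return curve


grid = [i * 0.5 for i in range(13)]  # log-density values from 0.0 to 6.0
turkey_curve = dependence_curve(predict, MEDIANS, "turkeys", grid)
chicken_curve = dependence_curve(predict, MEDIANS, "chickens", grid)

for value, prob in turkey_curve:
    print(f"log turkey density {value:.1f}: probability {prob:.3f}")
```

The same `dependence_curve` function would work with any model that maps a dictionary of predictor values to a probability, which is how the three models can be compared on one plot.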

The logistic regression, decision tree and random forest models attained predictive accuracies of 0.8667 (95% CI (0.7925, 0.9218)), 0.875 (95% CI (0.8022, 0.9283)) and 0.8917 (95% CI (0.8219, 0.941)), respectively. A sensitivity of 80% was achieved at cut-off values of 0.05 and 0.12 for the logistic regression and random forest models, respectively. The corresponding specificities were 52% and 78%. A sensitivity of 80% was not achieved for the decision tree model.
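The reported accuracies are numerically consistent with 104, 105 and 107 correct classifications out of 120 testing observations. That count is inferred here from the decimals and is not stated in the text. The interval type is not stated either; the Python sketch below assumes an exact (Clopper-Pearson) binomial interval, computed by bisection on the binomial distribution function, to show how such intervals can be obtained.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def find_root(f, lo=0.0, hi=1.0, iterations=60):
    """Bisection for an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(correct, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for a proportion."""
    lower = 0.0 if correct == 0 else find_root(
        lambda p: (1 - binom_cdf(correct - 1, n, p)) - alpha / 2
    )
    upper = 1.0 if correct == n else find_root(
        lambda p: alpha / 2 - binom_cdf(correct, n, p)
    )
    return lower, upper


N_TEST = 120  # assumption inferred from the reported accuracies, not stated in the text
assumed_correct = {
    "logistic regression": 104,
    "decision tree": 105,
    "random forest": 107,
}

for model, correct in assumed_correct.items():
    low, high = clopper_pearson(correct, N_TEST)
    print(f"{model}: accuracy={correct / N_TEST:.4f}, 95% CI=({low:.4f}, {high:.4f})")
```

The three intervals overlap considerably, so differences in accuracy between the models should be read with that uncertainty in mind.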
The probability of HPAI occurrence is not equally distributed across the country. Several areas of central, western, eastern and northern Poland are at the highest risk. The spatial distribution of the values of probabilities is presented in Fig. 5.
Fig. 5
Spatial distribution of the predicted values of the probabilities using different models: a) logistic regression, b) decision tree, and c) random forest

The results of each model indicated that the densities of turkeys and geese are the most important risk factors. Each model showed a slightly different association between the density of these poultry species in an area and the probability of HPAI occurrence (Fig. 4). The other relevant parameter was the density of ducks. Proximity to areas with a high concentration of wild birds was found to be a more important risk factor than proximity to water bodies. Nevertheless, neither of these two parameters was statistically significant in the logistic regression model. The presence of chickens in an area had hardly any impact on the probability in either the classification tree or the random forest model. The logistic regression model, in turn, indicated a negative association between chicken density and the probability (Fig. 4).
The results of our study largely confirm the impact of the acknowledged risk factors on the occurrence of HPAI outbreaks. As in other studies (11), the presence of waterfowl increases the risk of HPAI occurrence due to outdoor access and facilitated contact with wild birds. Proximity to sites with high concentrations of wild birds and to water bodies where migratory birds tend to concentrate was previously demonstrated to be a key risk factor for the introduction and further spread of HPAIV H5N1 (28). However, our results did not confirm such a strong impact of these factors on HPAI occurrence during the recently reported epidemics. This may be related to more stringent biosecurity and better prevention of contact between wild birds and poultry on commercial farms than on the various farm types studied in other countries. On the other hand, turkey density was found to be the most important risk factor. This may be related to the high susceptibility of this species to HPAIV infection. It was shown experimentally that turkeys are >100-fold more susceptible to infection with HPAIV H5N1 and H7N1 subtypes than chickens (2). Additionally, it was demonstrated that HPAIV H5N8 clade 2.3.4.4 can be successfully transmitted from ducks to turkeys, followed by efficient onward transmission among turkeys (20, 22).
Interestingly, the presence of commercially farmed chickens in an area did not increase the probability of HPAI outbreak occurrence. The logistic regression model suggested a negative impact and the other two models a negligible one. This effect was also observed previously by other authors who studied risk factors for HPAI occurrence in France, China and Indonesia (9, 12, 29). Three reasons may explain this phenomenon. First, chickens seem to have lower susceptibility to HPAIV infection than turkeys or ducks,
Two types of model – generalized linear and nonlinear – were used to assess the impact of selected risk factors on the probability of HPAI occurrence. The logistic regression approach was frequently used to indicate relevant risk factors in other countries (16, 17); however, the analysis of the input data suggests that non-linear approaches are appropriate here. In terms of predictive power, the random forest, which attained 80% sensitivity with 78% specificity, outperforms the logistic regression. This may be explained by non-linear associations or by interactions between risk factors. Of the two non-linear models, the random forest was much more accurate than the decision tree. This is observed very often, because the former method combines the output of multiple (randomly created) decision trees to generate the final output. As the decision tree model’s results are simpler to interpret than those of the random forest model, they were presented only to visualise the impact of particular risk factors, not as an alternative or competing method. In view of this, the random forest approach should be the method of choice to predict the areas at increased risk of HPAI occurrence in Poland.
None of the models allowed very high (>80%) sensitivity to be obtained with acceptable specificity. A 20% share of observations from the testing dataset was incorrectly classified – the predicted status of farms (affected or unaffected) did not reflect their actual status, even using a low cut-off value. This may suggest the impact of risk factors other than those included in our study, and this aspect should be further investigated. However, the required data are often missing, or expressing them in quantitative terms is not feasible. For instance, as indicated in other studies, the association of HPAI occurrence with anthropogenic risk factors is undeniable because virus spread by personnel and fomites is known to be possible (16, 18). The impact of these risk factors was previously estimated by population density in the county around the farm or in the area adjacent to the farm. However, our study focuses only on commercial poultry farming, where any visits to the premises should be kept to a minimum. Therefore, the more relevant risk factors are farming practices and the biosecurity level on a farm (17). Quantifying their effect was also not feasible due to the lack of accurate and fully reliable data.
This is the second spatial analysis developed to predict the areas at risk of HPAI occurrence in commercial flocks in Poland. The first analysis combined knowledge mined from the literature and opinions of Polish experts in the field of epidemiology and poultry diseases (10). Because the first model was developed before the largest HPAI epidemic in Poland, data on the locations of HPAI outbreaks were unavailable. Once more extensive epidemic data became available, they were used only to verify the model. Therefore, the first study reflected the arbitrary evaluations of each expert regarding the key risk factors and their impact on HPAI occurrence in Poland.
A comparison of the results of the present study with those of the pre-epidemic one shows that the overlapping areas at high risk mainly include parts of the Lubuskie, Lubelskie, Łódzkie, and Wielkopolskie provinces. As indicated by the experts, the common risk factors included the densities of turkeys and domestic waterfowl and proximity to areas with high concentrations of wild birds. Each of these risk factors was assumed to be positively associated with the areas at risk of HPAI occurrence.
The number of observations (reported outbreaks of HPAI) included in the model was relatively small. Inclusion of new cases would improve the quality of the model. Therefore, if the epidemic recurs in the country, the experimental group should be extended. The data on the densities of the different poultry species considered in the analysis should also be updated regularly.
The models developed here can be a valuable source of information for different groups of stakeholders, including poultry owners and risk managers. The results can also lead to the improvement of targeted surveillance in Poland.