Open Access

Estimation of zooplankton density with artificial neural networks (a new statistical approach) method, Elazığ-Türkiye

Dec 31, 2023

Introduction

Plankton are considered to be among the most important components of life on earth. They are very sensitive to changes in pH, salinity, temperature, and nutrient concentration. They are generally small organisms with a short life cycle and a strong susceptibility to environmental conditions. Zooplankton affect the productivity of an aquatic system through grazing and nutrient cycling (Banse 1995). In addition, high zooplankton density can be associated with high fish concentrations (Maravelias & Reid 1997; Aoki & Komatsu 1997).

Zooplankton play a wide range of essential roles in an aquatic ecosystem. A varied assemblage of zooplankton in an aquatic community is usually an indicator of its health. They can serve as an indicator of eutrophication (Attayde & Bozelli 1998; Sousa et al. 2008). Zooplankton are essential in transferring nutrients and energy between the autotrophs and higher trophic levels. Certain zooplankton groups also respond to changes in the environment and in the physicochemical attributes of the system, and show diel patterns, including vertical migration (Dini & Carpenter 1992). The physicochemical conditions of the environment affect the presence, abundance, and distribution of organisms in a habitat. Each organism occupies an ecological niche, defined by particular environmental features, that allows it to survive in its habitat (Amoros et al. 1987).

From a biological viewpoint, patterns of species occurrence and abundance usually show complex non-linear relations with the spatial heterogeneity of the habitat and with interactions with other species. For these reasons, artificial neural networks (ANNs) can be an attractive tool for analyzing and modeling ecological data, since they can take account of factors such as non-linearity, adaptation and generalization (Schleiter et al. 1999). ANNs learn and generalize from example data and have a non-linear structure. They have also been found to give better results than linear methods (Sharda & Patil 1992). In this context, the method can detect non-linear relationships without the need for any prior assumptions (Kaastra & Boyd 1996). It also places no fixed limit on the number of variables that can be used.

In aquatic ecology, constructing a model of zooplankton behaviour is particularly important because of its enormous ecological relevance. Zooplankton, which are the main consumers of phytoplankton, provide the link between the lower levels of the food chain and the fish, birds, and mammals at the upper levels. Changes in zooplankton abundance or species composition are therefore indicative of important changes affecting primary production in the aquatic habitat (Pinto-Coelho 1998). In aquatic ecology, ANNs are widely used to detect algal growth and relationships between environmental variables, macro-invertebrates, and fish (Mastrorillo et al. 1997; Reyjol et al. 2001; Olden & Jackson 2001; Hoang et al. 2001).

Zooplankton occupy an intermediary position in the food chain and take part in many ecological processes, such as energy flow and nutrient cycling (Pinto-Coelho 1998). They also act directly on “bottom-up” and “top-down” mechanisms, thereby promoting changes in the environmental trophic structure (Carpenter et al. 1985). The aim of the present study is to develop ANN models that predict zooplankton dynamics, specifically the density of Rotifera, Cladocera and Copepoda, from a set of environmental variables, and to reveal the stability of the processes that relate those variables to zooplankton. Data from the Cip reservoir were used for this purpose. We investigated whether selected water quality parameters can successfully predict zooplankton density.

Materials and methods
Study site

Cip reservoir was built in 1965 for irrigation purposes on the Cip Stream in Elazığ. The dam, an earth-fill type, has a body volume of 446,000 m³ and rises 23 m above the stream bed. At normal water level, the reservoir volume is 7 hm³ and the reservoir area is 1.10 km². It provides irrigation to an area of 1,100 hectares. Plankton samples were collected monthly from Cip Reservoir in 2021-2022 at three stations (Figure 1), using a standard plankton net (55-μm mesh size). The samples were fixed in 4% formalin, analysed under an inverted microscope (GMBH D-6330 Diavert inverted microscope, Ernst Leitz Ltd., Canada) and identified under a compound microscope (Nikon Eclipse E 100, Nikon Instruments Inc., Japan). Water temperature, dissolved oxygen, pH, electrical conductivity and Secchi disk depth were measured in situ with a YSI Professional Plus meter. Alkalinity was determined by the titrimetric method. Total phosphorus and total nitrogen were determined spectrophotometrically using Merck test kits and a Merck Spectroquant Nova 60 A water and wastewater analysis photometer. Zooplankton were counted in petri dishes using 5-ml sub-samples. A minimum of 200 individuals were counted per replicate, and the final density was converted to individuals per cubic metre. Monthly changes in total zooplankton at the stations were recorded.

Figure 1

Coordinates of Cip Reservoir

Zooplankton community composition

Zooplanktonic organisms are a food source for invertebrates, fish, and birds. Nearly all freshwater fish are planktivorous in their early life. Planktivorous fish feed on both small zooplankton and large phytoplankton (Horne & Goldman 1994). The horizontal and vertical distribution of zooplankton is affected by the physical and chemical properties of the water, wind, currents, streams entering the lake, depth, season, heat, light, nutrients and predators. When light is strong, plankton descend from the surface; when light is weak, they rise towards it. Because the viscosity and density of water change with temperature, zooplankton gravitate towards the layer with the temperature that suits them best (Tanyolaç 2009). Zooplankton density can be regarded as an indicator of water quality (Karjalainen et al. 1996; Moss et al. 1997; Muylaert et al. 2006).

This is connected with the transfer of energy from zooplankton, which represent the second trophic level in the food web, to heterotrophs at higher trophic levels (Deivanai et al. 2004; Ismail & Mohd Adnan 2016). Zooplankton respond quickly to physical and chemical changes in their environment. Previous studies have shown that different groups of zooplankton are good indicators of eutrophication (Pinel-Alloul et al. 1990; Attayde & Bozelli 1998; Burns & Galbraith 2007; Sousa et al. 2008; Saler 2017; Bulut & Saler 2018, 2019, 2020). Studies of the effects of environmental factors on the density of zooplankton taxa provide information about the functioning of water systems (Bulut & Saler 2020).

Determination of input variables

Zooplankton are very sensitive to changes in their environment (Legendre & Demers 1984). Therefore, important predictive parameters that directly or indirectly affect the zooplankton habitat in the Cip reservoir were selected. The factors that directly affect zooplankton in their habitat are water temperature and dissolved oxygen. In some seasons, increased flooding dilutes the water and changes nutrient and oxygen availability, which in turn affects the reproduction and metabolic rate of zooplankton (Loverde et al. 2009). Electrical conductivity and pH, which are measures of production and decomposition processes, are variables that indirectly affect zooplankton density.

Artificial neural networks

The ANN methodology offers significant advantages thanks to its many features and is widely used in predictive modeling, as in other fields. Artificial neural networks (ANNs) are artificial information processing models created by imitating the way the human brain works and by drawing on its physiology. ANNs are among the most successful new approaches to problem solving in recent years (Haykin 1994; Sagıroglu et al. 2003).

The learning ability of ANNs is one of the features that most attracts the attention of researchers. By learning the relationship between the inputs and outputs of an event from existing examples, whether that relationship is linear or not, an ANN can produce solutions for events it has never seen before; this ability forms the basis of intelligent behaviour in ANNs. A neural network is a parallel distributed processor that consists of simple units, stores information and makes it available for use. Artificial neurons are grouped into layers, and these layers are then connected to one another. Basically, all neural networks have a similar structure. Some neurons are connected to the outside to receive inputs, and some to transmit outputs. All the remaining neurons are in the hidden layers, that is, they only have connections within the network (Anderson & McNeill 1992).

MATLAB’s Neural Network Toolbox (Ver R2016a) was used for the ANNs. ANN modeling in MATLAB consists of three stages: “training”, “validation” and “testing”. The proposed model has three layers: an input layer, a hidden layer and an output layer. It consists of 31 neurons (8 in the input layer, 20 in the hidden layer and 3 in the output layer) and is designed as a fully connected feed-forward network trained with back-propagation. The model structure is shown in Figure 2.

Figure 2

Model Structure of ANNs
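The 8-20-3 layout described above can be illustrated with a minimal NumPy sketch of a single forward pass. Only the layer sizes come from the text; the tanh hidden activation, the linear output layer, the random initial weights and the example input are assumptions made purely for illustration, and this is not the MATLAB model used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes stated in the text: 8 inputs, 20 hidden neurons, 3 outputs.
N_INPUT, N_HIDDEN, N_OUTPUT = 8, 20, 3

# Hypothetical random initial weights; the paper does not report them.
W1 = rng.normal(scale=0.1, size=(N_HIDDEN, N_INPUT))
b1 = np.zeros(N_HIDDEN)
W2 = rng.normal(scale=0.1, size=(N_OUTPUT, N_HIDDEN))
b2 = np.zeros(N_OUTPUT)


def forward(x):
    """One forward pass: input -> hidden (tanh, assumed) -> output (linear, assumed)."""
    hidden = np.tanh(W1 @ x + b1)
    return W2 @ hidden + b2


# One made-up input vector standing in for the eight scaled water quality variables.
x = rng.uniform(size=N_INPUT)
y = forward(x)
print(y.shape)  # (3,) -> one value each for Rotifera, Cladocera, Copepoda
```

Every hidden neuron receives all eight inputs and every output neuron receives all twenty hidden activations, which is what "fully connected" means here.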

In forward propagation, the temperature, pH, dissolved oxygen, conductivity, Secchi disk, alkalinity, total nitrogen and total phosphorus data received in the input layer are passed through the activation function and transmitted to the hidden layer. In the hidden layer, the values from the previous layer are activated again and transmitted to the output layer. The error between the target values and the values obtained in the output layer is then calculated. If the calculated error is greater than 1e-7, the back-propagation algorithm is run: the errors are propagated back to the weights of the hidden and input layers, and new weight values are created. Weight and bias values are updated according to Equation 1 and Equation 2:

$$w := w - \varepsilon \frac{\partial C}{\partial w} \tag{1}$$

$$b := b - \varepsilon \frac{\partial C}{\partial b} \tag{2}$$

The mathematical equation of the neuron model is as follows (Eq. 3) (Krenker et al. 2011):

$$Y(k) = F\left[\sum_{i=0}^{m} w_i(k)\, x_i(k) + b\right] \tag{3}$$

where $Y(k)$ is the output value at discrete time $k$; $F$ is the transfer function; $w_i(k)$ is the weight value at discrete time $k$, with $i$ running from 0 to $m$; $x_i(k)$ is the input value at discrete time $k$, with $i$ running from 0 to $m$; and $b$ is the bias.

MAPE was used to compare ANNs and other methods. The smaller the MAPE value, the closer the estimated values are to the true values (Benzer et al. 2017). MAPE is given by the following equation (Eq. 4):

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{e_i}{Y_i}\right| \times 100 \tag{4}$$

where $Y_i$ is the actual observation value, $e_i$ is the difference between the true value and the predicted value, and $n$ is the total number of observations.
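The update rules in Equations 1 and 2 can be sketched for a single neuron of the form given in Eq. 3. This sketch reads ε as the learning rate and C as the error (cost) function, which the text does not define explicitly; the squared-error cost, the identity transfer function, the learning rate and the sample values are all assumptions chosen to keep the example small.

```python
import numpy as np


def neuron(w, b, x):
    """Eq. 3 with the transfer function F taken as the identity (assumption)."""
    return float(np.dot(w, x) + b)


def train(x, target, eps=0.5, tol=1e-7, max_epochs=1000):
    """Repeat the Eq. 1 and Eq. 2 updates until the cost drops to the 1e-7 threshold."""
    w = np.zeros_like(x)
    b = 0.0
    cost = float("inf")
    for _ in range(max_epochs):
        err = neuron(w, b, x) - target
        cost = 0.5 * err ** 2          # assumed cost C = (Y - target)^2 / 2
        if cost <= tol:                # error small enough: stop back-propagating
            break
        # For this cost, dC/dw = err * x and dC/db = err.
        w = w - eps * err * x          # Eq. 1
        b = b - eps * err              # Eq. 2
    return w, b, cost


# Hypothetical single training sample (three scaled inputs, one target).
x = np.array([0.5, -0.2, 0.1])
w, b, cost = train(x, target=0.7)
print(cost <= 1e-7, round(neuron(w, b, x), 3))
```

In a multi-layer network the same two updates are applied to every weight and bias, with the partial derivatives obtained by propagating the output error backwards through the layers.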

Results
Some Water Quality Parameters of The Study Field

During field sampling, water temperature, pH, dissolved oxygen, electrical conductivity, Secchi disk depth, alkalinity, total nitrogen and total phosphorus were measured. The highest water temperature was 27.8°C in summer and the lowest was 3.2°C in winter, both at the 2nd station. The highest pH was 8.5 in winter; the lowest was 7.4 in summer at the 2nd station. The maximum dissolved oxygen concentration was 8.8 mg l⁻¹ at the 1st station in winter, while the lowest was 6.1 mg l⁻¹ at the 2nd station. The highest conductivity was 536 μS cm⁻¹ in summer at the 3rd station; the lowest was 305 μS cm⁻¹ in winter at the 2nd station. The greatest Secchi disk depth was 48.3 cm in summer at the 1st station; the lowest was 38.2 cm in winter at the 3rd station. The highest alkalinity was 320 mg CaCO₃ l⁻¹ in summer, while the lowest was 120 mg CaCO₃ l⁻¹ in autumn at the 3rd station. The highest total nitrogen was 3.4 mg N l⁻¹ in summer at the 1st station; the lowest was 0.2 mg N l⁻¹ in winter at the 3rd station. The maximum total phosphorus was 1.6 mg l⁻¹ at the 1st station in summer, while the lowest was 0.01 mg l⁻¹ at the 3rd station in winter (Figure 3).

Figure 3

Some water quality parameters of study field

Zooplankton structure also changed over time. On average, rotifers were the major component in all the periods studied, while cladocerans and copepods were recorded only in certain seasons (Figure 4).

Figure 4

Monthly changes of total zooplankton in Cip Reservoir at the 1st, 2nd, and 3rd station

The actual zooplankton densities (Rotifera, Cladocera, Copepoda) at the 1st, 2nd and 3rd stations and the corresponding ANN values are given by month in Table 1, Table 2, and Table 3. The actual values and the ANN results were compared month by month, and the mean absolute percentage error (MAPE) was calculated from them. The ANN values were close to the real data. In Table 1, the MAPE (%) was 1.143 for Rotifera, 0.118 for Cladocera, and 0.141 for Copepoda. In Table 2, the MAPE (%) was 0.941 for Rotifera, 0.377 for Cladocera, and 0.185 for Copepoda. In Table 3, the MAPE (%) was 0.342 for Rotifera, 0.557 for Cladocera, and 0.301 for Copepoda.

Table 1. Comparison of real zooplankton density values with artificial neural network (ANN) values for the 1st station

| Month | Rotifera: Real data | Rotifera: ANNs | Rotifera: MAPE (%) | Cladocera: Real data | Cladocera: ANNs | Cladocera: MAPE (%) | Copepoda: Real data | Copepoda: ANNs | Copepoda: MAPE (%) |
|---|---|---|---|---|---|---|---|---|---|
| September | 68273 | 68272 | 0.001 | 509 | 509 | 0.000 | 0 | 0.002 | 0.000 |
| October | 50950 | 49886 | 2.088 | 4585 | 4585 | 0.000 | 2038 | 2038 | 0.000 |
| November | 35667 | 34730 | 2.627 | 1019 | 1019 | 0.000 | 5605 | 5604 | 0.018 |
| December | 10190 | 10188 | 0.019 | 0 | 2.262 | 0.000 | 0 | 0.002 | 0.000 |
| January | 3973 | 3973 | 0.000 | 0 | 0.061 | 0.000 | 509 | 508 | 0.196 |
| February | 17323 | 17316 | 0.040 | 0 | 0.037 | 0.000 | 1528 | 1528 | 0.000 |
| March | 30061 | 30056 | 0.016 | 0 | 0.688 | 0.000 | 0 | 1.837 | 0.000 |
| April | 27515 | 28315 | 2.907 | 1019 | 1018 | 0.098 | 0 | 0.000 | 0.000 |
| May | 43817 | 43817 | 0.000 | 9681 | 9680 | 0.010 | 203 | 206 | 1.478 |
| June | 58533 | 58512 | 0.036 | 17833 | 17832 | 0.006 | 0 | 0.000 | 0.000 |
| July | 17324 | 18359 | 5.974 | 23948 | 23947 | 0.004 | 0 | 1.939 | 0.000 |
| August | 8662 | 8663 | 0.012 | 2548 | 2581 | 1.295 | 0 | 0.182 | 0.000 |
| Average MAPE (%) | | | 1.143 | | | 0.118 | | | 0.141 |

Table 2. Comparison of real zooplankton density values with artificial neural network (ANN) values for the 2nd station

| Month | Rotifera: Real data | Rotifera: ANNs | Rotifera: MAPE (%) | Cladocera: Real data | Cladocera: ANNs | Cladocera: MAPE (%) | Copepoda: Real data | Copepoda: ANNs | Copepoda: MAPE (%) |
|---|---|---|---|---|---|---|---|---|---|
| September | 33113 | 32992 | 0.365 | 509 | 506 | 0.589 | 0 | 0.000 | 0.000 |
| October | 27510 | 27436 | 0.269 | 4585 | 4526 | 1.287 | 3057 | 3057 | 0.000 |
| November | 15285 | 15550 | 1.733 | 1019 | 1019 | 0.000 | 3057 | 3057 | 0.000 |
| December | 9681 | 9622 | 0.609 | 0 | 23.012 | 0.000 | 410 | 410 | 0.000 |
| January | 3260 | 3298 | 1.166 | 0 | 17.785 | 0.000 | 509 | 509 | 0.000 |
| February | 5605 | 5596 | 0.161 | 0 | 62.318 | 0.000 | 9171 | 9196 | 0.273 |
| March | 5604 | 5521 | 1.481 | 509 | 507 | 0.393 | 1224 | 1224 | 0.000 |
| April | 3057 | 3056 | 0.033 | 3057 | 3057 | 0.000 | 0 | 0.001 | 0.000 |
| May | 8151 | 8364 | 2.613 | 7132 | 7047 | 1.192 | 0 | 0.000 | 0.000 |
| June | 14776 | 14773 | 0.020 | 12942 | 12917 | 0.193 | 205 | 209 | 1.951 |
| July | 4076 | 4176 | 2.453 | 12738 | 12627 | 0.871 | 0 | 23.51 | 0.000 |
| August | 1019 | 1015 | 0.393 | 0 | 12588 | 0.000 | 509 | 509 | 0.000 |
| Average MAPE (%) | | | 0.941 | | | 0.377 | | | 0.185 |

Table 3. Comparison of real zooplankton density values with artificial neural network (ANN) values for the 3rd station

| Month | Rotifera: Real data | Rotifera: ANNs | Rotifera: MAPE (%) | Cladocera: Real data | Cladocera: ANNs | Cladocera: MAPE (%) | Copepoda: Real data | Copepoda: ANNs | Copepoda: MAPE (%) |
|---|---|---|---|---|---|---|---|---|---|
| September | 28532 | 28580 | 0.168 | 203 | 208 | 2.463 | 0 | 0.043 | 0.000 |
| October | 31080 | 31081 | 0.003 | 3567 | 3596 | 0.813 | 1019 | 1018 | 0.098 |
| November | 15489 | 15536 | 0.303 | 509 | 509 | 0.000 | 4076 | 4025 | 1.251 |
| December | 13247 | 13218 | 0.219 | 203 | 206 | 1.478 | 509 | 508 | 0.196 |
| January | 3770 | 3774 | 0.106 | 0 | 0.958 | 0.000 | 509 | 509 | 0.000 |
| February | 16814 | 16839 | 0.149 | 509 | 501 | 1.572 | 203 | 204 | 0.493 |
| March | 27543 | 27169 | 1.358 | 509 | 508 | 0.196 | 509 | 509 | 0.000 |
| April | 23438 | 23439 | 0.004 | 1019 | 1018 | 0.098 | 203 | 201 | 0.985 |
| May | 36175 | 36132 | 0.119 | 8662 | 8661 | 0.011 | 1019 | 1018 | 0.098 |
| June | 37705 | 37665 | 0.106 | 7134 | 7133 | 0.014 | 203 | 203 | 0.000 |
| July | 12738 | 12608 | 1.020 | 11210 | 11209 | 0.009 | 0 | 0.228 | 0.000 |
| August | 9681 | 9631 | 0.516 | 3567 | 3566 | 0.028 | 203 | 202 | 0.493 |
| Average MAPE (%) | | | 0.342 | | | 0.557 | | | 0.301 |
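The station averages can be rechecked from the tabulated values with a short sketch of the MAPE calculation (Eq. 4). The function below is illustrative only: the zero-handling rule is an assumption inferred from the tables, where months with a real density of 0 are listed with a MAPE of 0.000 even though the ratio is strictly undefined in that case. The input lists are the 1st station Rotifera and Cladocera columns from Table 1.

```python
def mape(actual, predicted):
    """Mean absolute percentage error (Eq. 4), in percent.

    Assumption: observations with an actual value of 0 contribute 0 to the sum,
    mirroring the 0.000 entries in the tables; the ratio is undefined there.
    """
    terms = [
        abs((a - p) / a) * 100 if a != 0 else 0.0
        for a, p in zip(actual, predicted)
    ]
    return sum(terms) / len(terms)


# 1st station, September to August (values copied from Table 1).
rotifera_real = [68273, 50950, 35667, 10190, 3973, 17323,
                 30061, 27515, 43817, 58533, 17324, 8662]
rotifera_ann = [68272, 49886, 34730, 10188, 3973, 17316,
                30056, 28315, 43817, 58512, 18359, 8663]
cladocera_real = [509, 4585, 1019, 0, 0, 0, 0, 1019, 9681, 17833, 23948, 2548]
cladocera_ann = [509, 4585, 1019, 2.262, 0.061, 0.037, 0.688,
                 1018, 9680, 17832, 23947, 2581]

print(round(mape(rotifera_real, rotifera_ann), 2))    # 1.14 (Table 1 reports 1.143)
print(round(mape(cladocera_real, cladocera_ann), 2))  # 0.12 (Table 1 reports 0.118)
```

Small differences in the third decimal place are expected, because the ANN values in the tables are rounded to whole individuals.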

The ANN results were obtained by training the networks on the training data. The ANN results for zooplankton densities at the 1st, 2nd and 3rd stations are given in Figure 5, Figure 6, and Figure 7. The best fit between targets and outputs is indicated by the linear regression line, and the R value measures the strength of the relationship between targets and outputs. For the 1st station, R was 0.99716 for training, 0.92436 for validation, 0.99642 for testing, and 0.94196 for all data. For the 2nd station, R was 0.95697 for training, 0.99126 for validation, 0.52772 for testing, and 0.95631 for all data. For the 3rd station, R was 0.91557 for training, 0.93203 for validation, 0.98536 for testing, and 0.92585 for all data. R values close to 1 indicate a close fit between outputs and targets; the testing value for the 2nd station (0.52772) is notably lower than the others (Figures 5-7).

Figure 5

Training, validation, testing and all data results of artificial neural networks for the 1st station

Figure 6

Training, validation, testing and all data results of artificial neural networks for the 2nd station

Figure 7

Training, validation, testing and all data results of artificial neural networks for the 3rd station
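As an illustration of how an R value and a regression line between targets and outputs are obtained, the sketch below computes Pearson's correlation coefficient and a least-squares line for the 1st station Rotifera values from Table 1. It will not reproduce the R values reported for Figures 5-7, which are computed separately for the training, validation and testing subsets; the choice of this single column is an assumption made only to have concrete numbers.

```python
import numpy as np

# Targets (real data) and outputs (ANN values), 1st station Rotifera, Table 1.
targets = np.array([68273, 50950, 35667, 10190, 3973, 17323,
                    30061, 27515, 43817, 58533, 17324, 8662], dtype=float)
outputs = np.array([68272, 49886, 34730, 10188, 3973, 17316,
                    30056, 28315, 43817, 58512, 18359, 8663], dtype=float)

# Pearson correlation coefficient between targets and outputs.
r = np.corrcoef(targets, outputs)[0, 1]

# Least-squares regression line: output ~= slope * target + intercept.
slope, intercept = np.polyfit(targets, outputs, 1)

print(f"R = {r:.5f}; output ~= {slope:.3f} * target + {intercept:.1f}")
```

A slope near 1 and an intercept near 0 mean the outputs track the targets closely across the whole range, which is what the regression panels in such figures are meant to show.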

Figures 8-10 show the corresponding validation checks and gradients by epoch. For the 1st station, the training state of the ANN model showed 6 validation checks at epoch 6 and a gradient of 1668541.3655 at epoch 10. For the 2nd station, it showed 6 validation checks at epoch 6 and a gradient of 455785.4862 at epoch 6. For the 3rd station, it showed 6 validation checks at epoch 6 and a gradient of 3483291.2888 at epoch 6.

Figure 8

Artificial neural networks training state at the 1st station

Figure 9

Artificial neural networks training state at the 2nd station

Figure 10

Artificial neural networks training state at the 3rd station
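The "validation checks" count reported for Figures 8-10 can be read as an early-stopping counter: training halts once the validation error has failed to improve for a set number of consecutive epochs. The sketch below shows that logic under stated assumptions. The limit of 6 matches the value reported in the text, but the function name, the exact comparison rule and the error sequence are hypothetical and are not taken from the toolbox.

```python
def early_stopping_epoch(val_errors, max_fail=6):
    """Return (stop_epoch, best_epoch), both 0-based.

    Training stops at the first epoch where the validation error has failed
    to improve on its best value for `max_fail` consecutive epochs.
    """
    best = float("inf")
    best_epoch = 0
    fails = 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, fails = err, epoch, 0
        else:
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return len(val_errors) - 1, best_epoch


# Made-up validation errors that never improve after the initial epoch.
val_errors = [5.0, 6.0, 6.5, 7.0, 7.2, 7.5, 8.0, 9.0]
print(early_stopping_epoch(val_errors))  # (6, 0)
```

Under this reading, 6 validation checks at epoch 6 would mean the validation error did not improve after the initial epoch, so the weights from the start of training would be the ones retained.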

Discussion

The densities of zooplankton groups are determined by the limnological conditions in the month in which they are found; therefore, a low zooplankton density in a given month is of little value for predicting the following month. Because these organisms have short life cycles, environmental changes affect their abundance in the current state of the freshwater body (Legendre & Demers 1984). In the present study, zooplankton density ranked in the order Rotifera, Cladocera and Copepoda. Saler (1995) likewise reported a high density of Rotifera. Zooplankton density is known to determine phytoplankton density (Ryding & Rast 1989). Therefore, determining the total density of zooplankton, which is believed to have the potential to affect phytoplankton growth, increases the importance of this study.

The ANN models are based on monthly data on species and environmental variables. With a larger original data set, computation becomes time consuming. Dimension reduction techniques have improved the generalisation performance of ANNs in many, but not all, cases, and the selection of appropriate preprocessing methods is necessary for successful neural modeling (Schleiter et al. 1999). The use of ANNs in ecology has been considered limited to cases with a large amount of data, including enough data to set aside for model validation (Aguilar Ibarra et al. 2003). Nevertheless, the predictive ability of the models tested here can be considered suitable for estimating the densities of zooplankton taxa and for showing the current limnological status of the reservoir, even with a small data series for network training, which suggests wider applications for ANNs. The ANN model was found to be suitable for estimating the monthly dynamics of zooplankton groups even in short series.

Karul et al. (2000) reported that, in addition to chlorophyll-a concentrations, neural network models can also be used to predict the densities of certain species as functions of environmental parameters.

Previous studies that compared estimation methods found that artificial neural networks provided the highest prediction accuracy and the values closest to the observations (Benzer & Benzer 2018; Ozcan & Serdar 2018; Ozcan & Serdar 2019; Ozcan 2019).

When models with different units are to be compared, MAPE is used to avoid the problems that unit differences may cause. MAPE is considered more informative than other criteria because it expresses the estimation error as a percentage and is therefore meaningful on its own. After training was completed, the test showed that the expected values and the values predicted by the model were very close to each other.

Forecast models with MAPE < 10% are classified as “high accuracy”, models with 10% < MAPE < 20% as correct prediction models, and models with MAPE > 50% as “false and faulty” (Lewis 1982). In Table 1, the MAPE (%) was 1.143 for Rotifera, 0.118 for Cladocera, and 0.141 for Copepoda. In Table 2, it was 0.941 for Rotifera, 0.377 for Cladocera, and 0.185 for Copepoda. In Table 3, it was 0.342 for Rotifera, 0.557 for Cladocera, and 0.301 for Copepoda. All of these values are far below 10%, so the applied method can be said to produce very successful estimations.
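The classification quoted above can be written as a small lookup, applied here to the nine average MAPE values reported in Tables 1-3. The class labels follow the wording of the text; the 20-50% band is not quoted in the text and is filled in from Lewis's commonly cited scale, and the handling of the exact boundary values is an assumption.

```python
def lewis_class(mape_percent):
    """Map a MAPE value (in percent) to an accuracy class after Lewis (1982)."""
    if mape_percent < 10:
        return "high accuracy"
    if mape_percent < 20:
        return "correct prediction"
    if mape_percent <= 50:
        # Band not quoted in the text; taken from Lewis's commonly cited scale.
        return "reasonable prediction"
    return "false and faulty"


# Average MAPE (%) per station and group, as reported in Tables 1-3.
average_mape = {
    ("1st station", "Rotifera"): 1.143,
    ("1st station", "Cladocera"): 0.118,
    ("1st station", "Copepoda"): 0.141,
    ("2nd station", "Rotifera"): 0.941,
    ("2nd station", "Cladocera"): 0.377,
    ("2nd station", "Copepoda"): 0.185,
    ("3rd station", "Rotifera"): 0.342,
    ("3rd station", "Cladocera"): 0.557,
    ("3rd station", "Copepoda"): 0.301,
}

for (station, group), value in average_mape.items():
    print(f"{station:12s} {group:10s} {value:6.3f}  {lewis_class(value)}")
```

Every one of the nine averages falls in the first class, consistent with the assessment in the text.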

Conclusion

A neural network model can predict values for cases that were not part of the training set, that is, cases that have never been introduced to the system before. Rather than building a time series, the model omits a time component, because the data are distributed over time and space, and determines the current state of the reservoir. In summary, this study has shown that artificial neural networks, thanks to their learning ability, are successful in predicting zooplankton densities in an aquatic environment.

eISSN:
1897-3191
Language:
English
Publication timeframe:
4 times per year
Journal Subjects:
Chemistry, other, Geosciences, Life Sciences