Open access

Spatial distribution of soil nutrient content for sustainable rice agriculture using geographic information system and Naïve Bayes classifier

Introduction

The agricultural sector is vital to the economies of several developing countries [1], [2]. It employs the population, generates national income, and contributes to gross domestic product [3], [4]. The agricultural sector also offers additional benefits, such as ensuring the quality and stability of the environment (mitigating floods, controlling soil erosion, maintaining the groundwater supply, sequestering carbon, cooling and freshening the air, recycling organic waste, and maintaining biodiversity), preserving sociocultural values and rural attractiveness (rural amenity), buffering financial stability, alleviating poverty, and providing various other services [5], [6]. One of the products of the agricultural sector is rice, a staple food in Indonesia. The life cycle of the rice plant depends on water availability because water has a vital role in both human life and the rice plant ecosystem [7], [8], [9]. The quality of the food consumed is influenced by several factors, such as the weather, the content of natural chemical compounds in the food, and the water quality [7], [8], [10], [11]. Rice plants grow best in ecosystems with abundant water for irrigation and large amounts of water vapor [7], [12]. Elevations of around 0–1500 m above sea level are suitable for the rice plant ecosystem [13]. Phosphorus is an essential nutrient for organisms. Water and soil contain inorganic compounds of phosphorus [14], [15]. Sediment deposition and soil cause phosphate to dissolve in groundwater and seawater [16].

Phosphorus (P) is essential for all life on Earth and, for plants, it is a key element in photosynthesis, respiration, and the biosynthesis of nucleic acids and membranes [17], [18], [19]. It also performs a vital role in regulating many enzymes [20]. As a plant macronutrient, phosphorus frequently limits productivity in both natural and agricultural systems [17], [21]. It is a structural and functional component of nucleic acids, membrane lipids, energy metabolites, and activated intermediates within the photosynthetic carbon cycle; moreover, inorganic phosphate (Pi) plays an essential role in signal transduction and in the cellular response by which living organisms sense nutrient status [22]. Phosphorus has a principal role in stimulating the growth of plant roots and accelerating the development of flowers into fruits, which shortens the time to harvest [23]. However, the role of phosphorus in the soil is influenced by soil pH: alkaline pH levels cause the available phosphorus to decrease [24]. To maintain the condition of the soil, fertilizer is added to increase the nutrient content and improve crop production and quality [25], [26]. However, irregular fertilization leads to the accumulation of phosphorus in the soil [27]. Monitoring of the quality and content of soil nutrients is also limited because it is still done manually by farmers. Advanced science and technology are needed to obtain a better crop yield [28].

One of the methods to increase agricultural production is applying precision farming patterns, specifically by monitoring and analyzing soil conditions [29], [30], [31]. This research aims to help farmers with fertilizer recommendations by providing a map of nutrient status based on the geographic information system (GIS) [32]. There are many definitions of geographic information and of the systems used to store, retrieve, examine, and show facts that can be represented spatially or geographically [33], [34]. Geographic information can be manipulated or stored using GIS, a computer-based system [35], [36]. In the literature [36], [37], [38], there are numerous reports on the use of GIS to make decisions regarding land resources with the aim of reaching an acceptable use of land for optimum food production and profit. One of the key benefits of GIS is its application to soil evaluation. Results can be presented in a spatially explicit form as maps that show the spatial distribution of geographic features [34], [39]. GIS implementation is expected to assist farmers in monitoring soil phosphorus content and in fertilizing efficiently.

One method that can support GIS-based mapping of the soil phosphorus content of agricultural rice land is the Naïve Bayes algorithm [40], [41], [42]. Previous studies [43], [44] have investigated the use of the Naïve Bayes algorithm. The work in [43] predicted landslides based on selected risk factors with an accuracy of 79.8%, whereas the work in [44] investigated decision-making regarding the quality of soil with an accuracy of 87.5%. Hence, this work aims to apply phosphorus mapping to paddy soil using the Naïve Bayes method in combination with GIS. The system uses the TCS3200 sensor [45] on 20 samples of lowland soil from Lendah District, Yogyakarta, Indonesia, which are tested 200 times in total. The sample-testing results show an error rate of 3% and a success rate of 97%. The GIS-based mapping results can be used as monitoring data for evaluating the possible phosphorus content in paddy soil.

Research Methods
Paddy Soil Test Kit (PUTS)

The Paddy Soil Test Kit (PUTS) is accurate, easy to use, and relatively fast at analyzing the nutrient content of the soil. Nitrogen, phosphorus, potassium, and soil pH are its primary measurement targets. The kit consists of several chemicals used for extracting soil nutrients, a leaf color chart (BWD), and instructions for use, along with fertilizer recommendations. A PUTS device is shown in Figure 1.

Figure 1

Paddy Soil Test Kit (PUTS).

Naïve Bayes Classifier Algorithm

The Naïve Bayes method is a classification method based on probability and statistics. It builds on the theorem proposed by Thomas Bayes and is used to predict future opportunities based on previous experience [46], [47]. The Naïve Bayes classifier requires only a small amount of training data to estimate the parameters needed for classification. The variables in this method are assumed to contribute independently to determining each class [42]. The Naïve Bayes classification is obtained using the following expression:

$$P(c \mid x) = \frac{P(x \mid c)\,P(c)}{P(x)}, \tag{1}$$

where P(c) = class prior probability; P(x|c) = likelihood; P(x) = predictor prior probability; and P(c|x) = posterior probability.
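As a minimal sketch of how Eq. (1) can be evaluated for several classes at once, the Python function below multiplies each class prior by its per-variable likelihoods and then normalizes. The function name, the dictionary layout, and the numbers in the usage lines are illustrative assumptions, not values from this study. The sketch also assumes that P(x) is obtained by summing P(x|c)P(c) over the classes, which is one common way to compute the denominator.

```python
from math import prod


def naive_bayes_posteriors(priors, likelihoods):
    """Return P(c|x) for every class c, following Eq. (1).

    priors:      {class: P(c)}
    likelihoods: {class: [P(x_1|c), P(x_2|c), ...]}, one entry per variable
    """
    # Numerator of Eq. (1): P(x|c) * P(c), with P(x|c) taken as a product
    # of per-variable likelihoods (the "naive" independence assumption).
    joint = {c: priors[c] * prod(likelihoods[c]) for c in priors}

    # Assumption: P(x) is the sum of the numerators over all classes.
    evidence = sum(joint.values())
    if evidence == 0:
        raise ValueError("every class has zero probability for this input")
    return {c: joint[c] / evidence for c in joint}


# Illustrative numbers only (hypothetical, not the study's data).
posteriors = naive_bayes_posteriors(
    priors={"medium": 0.4, "high": 0.6},
    likelihoods={"medium": [0.5, 0.5], "high": [0.9, 0.5]},
)
predicted = max(posteriors, key=posteriors.get)
print(posteriors, predicted)
```

The class with the largest posterior is taken as the prediction. Because P(x) is the same for every class, comparing the numerators alone gives the same ranking.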

System Planning

Figure 2 shows the block diagram of this system.

Figure 2

The block diagram of the measurement process of phosphorus using a paddy soil phosphorus meter.

The block diagram in Figure 2 shows how the paddy soil phosphorus meter works. The initial stage is to obtain the color of the soil phosphorus in the paddy soil samples extracted using PUTS. The soil phosphorus readings are obtained using the TCS3200 sensor through the serial peripheral interface (SPI) communication line connected to the Wemos D1 Mini board. The data from the TCS3200 sensor are processed on the Wemos D1 Mini board, which is connected to a 5 V voltage source. The data processed on the Wemos board and classified using the Naïve Bayes method are displayed on a 16 × 2 liquid-crystal display (LCD) and then sent to the ceerduad.com website. Wemos is used because it is a microcontroller that has an integrated Internet of Things (IoT) system and a wireless fidelity (WiFi) module with 4 MB of storage memory [48]. After the data are displayed on the 16 × 2 LCD, the web map is created using ArcGIS software.

Hardware Design

The Wemos D1 Mini system controller is a WiFi-based development board module used here as the sensor controller. The sensor reading data are processed on this controller. The circuit designed for measuring the phosphorus levels in paddy soil is outlined in Figure 3.

Figure 3

Wemos D1 Mini system series and TCS3200 sensor.

Software Design of Paddy Soil Phosphorus Measurement System

The program for this measuring instrument is developed using Arduino Uno and then uploaded to the Wemos D1 Mini board. The software design is shown in the flowchart in Figure 4. In the workflow of the paddy soil phosphorus meter, when the measuring instrument starts running, the sensor immediately reads the room value, which is used as the sensor's default value for calibration. If the calibration is successful, the instrument displays a prompt to insert the extracted soil sample so that its red, green, and blue (RGB) values can be read. The classification process is carried out with the Naïve Bayes algorithm. The results of the classification process are displayed on the LCD, and the RGB values of the soil samples are sent to ceerduad.ac.id, a web server, in the form of low, medium, or high phosphorus status. The design of the phosphorus measurement tool is shown in Figure 5.

Figure 4

Flowchart of the phosphorus detection program in this system.

Figure 5

Design of Phosphorus Measurement Tool System.

Results and Discussion
Determination of Soil Phosphorus Status

From the readings of the phosphorus level obtained using the TCS3200 sensor for the 20 soil samples extracted, this study obtained the RGB values shown in Table 1.

Table 1

RGB values of the paddy soil sample readings.

Sample Testing Red Green Blue Proposed system status PUTS status
1 1 147 126 90 High High
1 2 148 127 92 High High
2 1 145 132 94 High High
2 2 143 126 90 High High
3 1 121 96 73 Medium Medium
3 2 107 95 72 Medium Medium
4 1 127 110 78 High Medium
4 2 123 107 74 Medium Medium
5 1 130 110 76 High High
5 2 128 108 78 High High
6 1 135 116 83 High High
6 2 138 119 84 High High
7 1 147 128 90 High High
7 2 144 123 87 High High
8 1 143 124 91 High High
8 2 142 123 90 High High
9 1 135 114 83 High High
9 2 135 116 84 High High
10 1 118 105 77 Medium Medium
10 2 120 103 75 Medium Medium
11 1 116 100 73 Medium Medium
11 2 117 101 73 Medium Medium
12 1 118 102 78 Medium Medium
12 2 117 99 74 Medium Medium
13 1 143 122 85 High High
13 2 150 126 88 High High
14 1 128 109 78 High High
14 2 131 110 78 High High
15 1 127 106 74 High High
15 2 132 108 74 High High
16 1 138 114 78 High High
16 2 138 114 78 High High
17 1 124 103 73 Medium Medium
17 2 125 104 75 Medium Medium
18 1 198 163 120 Medium Medium
18 2 146 121 86 High High
19 1 131 110 76 High High
19 2 131 108 75 High High
20 1 120 102 73 Medium Medium
20 2 119 102 74 Medium Medium

The RGB values in Table 1 are an excerpt from the 200 training data used for reading phosphorus levels on the PUTS color chart. To determine the phosphorus status, 20 soil samples were used, and each soil sample provided 10 sensor readings as training data. The graph of RGB values obtained from these 20 samples is shown in Figure 6. The soil sample measurements produced 200 experimental data values, of which 194 readings obtained with the phosphorus-measuring instrument were valid with respect to the PUTS measurements and six were not. From these results, the error value can be calculated using Eq. (2) as follows:

$$\text{Error} = \frac{\text{Numb. of PUTS tests} - \text{Numb. of valid measuring-instrument tests}}{\text{Numb. of PUTS tests}} \times 100\% = \frac{200 - 194}{200} \times 100\% = 3\%. \tag{2}$$
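The arithmetic behind Eq. (2) can be checked with a short Python sketch. The function and variable names are illustrative assumptions; the counts (200 tests, 194 valid readings) are the ones reported above.

```python
def error_rate_percent(total_tests, valid_tests):
    """Percentage of instrument readings that disagree with the PUTS reference."""
    if total_tests <= 0:
        raise ValueError("total_tests must be positive")
    if not 0 <= valid_tests <= total_tests:
        raise ValueError("valid_tests must lie between 0 and total_tests")
    return 100.0 * (total_tests - valid_tests) / total_tests


error = error_rate_percent(total_tests=200, valid_tests=194)
accuracy = 100.0 - error
print(f"error = {error:.1f}%, accuracy = {accuracy:.1f}%")
```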

Figure 6

Soil Sample Test.

With the Naïve Bayes classification, the error rate of the measuring instrument was 3% and the instrument accuracy was 97%. This result compares with previous research on nitrogen monitoring [44], which reported an accuracy of 87.5%. Other studies measured accuracy using the coefficient of determination (R²), as in [49] for monitoring the leaf nitrogen concentration of wheat, with an R² of 0.91. Another study [50] calculated the accuracy of the sensor using R² for measuring the total nitrogen content in agricultural runoff. A further study [53] proposed monitoring the wheat grain nitrogen content and reported an R² of 0.42.

Naïve Bayes Analysis

The Naïve Bayes method is used to calculate the probability of an event occurring in the future based on previous experience [51], [52]. Naïve Bayes analysis was used to classify the test results of the 20 soil samples to obtain the paddy soil phosphorus status. Based on the graph in Figure 6, the result obtained was as follows:

P(c(MEDIUM)) = 68 / 200 = 0.34 ;

P(c(HIGH)) = 132 / 200 = 0.66.

Tables 2–4 show the RGB value probabilities from the research results.

Table 2

Red odds.

Range Medium High P(rse) P(rti)
1–63 0 0 0/68 0/132
64–127 66 6 66/68 6/132
128–190 0 127 0/68 127/132
191–255 0 1 0/68 1/132

Table 3

Green odds.

Range Medium High P(gse) P(gti)
1–63 0 0 0/68 0/132
64–127 66 120 66/68 120/132
128–190 0 14 0/68 14/132
191–255 0 0 0/68 0/132

Table 4

Blue odds.

Range Medium High P(bse) P(bti)
1–63 0 0 0/68 0/132
64–127 66 134 66/68 134/132
128–190 0 0 0/68 0/132
191–255 0 0 0/68 0/132

Table 2 shows the probability of phosphorus status in each class of red values, divided into four ranges (1–63, 64–127, 128–190, and 191–255). The classification results based on the red odds indicate the possibility of a medium or high level of phosphorus content. The probability of the green value class is shown in Table 3.

Table 3 shows the probability of the phosphorus status in each green value class, divided into the same four ranges, with medium- and high-phosphorus-status probabilities. The RGB reading value ranges from 1 to 255. The probability of the blue value status is shown in Table 4.

Table 4 shows the probability of phosphorus status in each blue value class, again divided into four ranges, with medium- and high-phosphorus-status probabilities.
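The lookup described by these tables can be sketched in Python as follows. This is only an illustration of the binning idea: the function names and the data layout are assumptions, and only the red counts from Table 2 are transcribed, with the class totals taken from the table's denominators (68 and 132).

```python
from fractions import Fraction

# The four value ranges used in Tables 2-4.
RANGES = [(1, 63), (64, 127), (128, 190), (191, 255)]

# Class totals as they appear in the denominators of Table 2.
CLASS_TOTALS = {"medium": 68, "high": 132}

# Red-channel counts per range, transcribed from Table 2.
RED_COUNTS = {
    (1, 63): {"medium": 0, "high": 0},
    (64, 127): {"medium": 66, "high": 6},
    (128, 190): {"medium": 0, "high": 127},
    (191, 255): {"medium": 0, "high": 1},
}


def channel_range(value):
    """Return the (low, high) range that contains a channel reading."""
    for low, high in RANGES:
        if low <= value <= high:
            return (low, high)
    raise ValueError(f"channel value {value} is outside 1-255")


def red_likelihood(value, status):
    """Likelihood of a red reading given a phosphorus status, per Table 2."""
    count = RED_COUNTS[channel_range(value)][status]
    return Fraction(count, CLASS_TOTALS[status])


print(channel_range(126))
print(red_likelihood(126, "medium"), red_likelihood(126, "high"))
```

Note that any range with a zero count yields a likelihood of exactly zero, which is the weakness discussed in the Conclusion.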

Example: The RGB value read by the sensor in the phosphorus meter is $x = \{R = 126;\ G = 111;\ B = 81\}$.

Then, the phosphorus status of the rice fields is determined using Eq. (1).

The first step is to determine the prior probability P(c) of the medium (S) and high (T) classes from all the training data used for reading the paddy soil samples: $P(S) = 68/200;\ P(T) = 132/200.$

Next, we determine the P(x|c) values of the medium- and high-phosphorus-status probabilities for the known data classes:

Medium-phosphorus-status probability: $P(rse) = 30/68;\ P(gse) = 68/68;\ P(bse) = 70/68.$

High-phosphorus-status probability: $P(rti) = 122/132;\ P(gti) = 70/132;\ P(bti) = 109/132.$

For determining the P(x|c) P(c) value:

MEDIUM: $$P(x \mid se)\,P(se) = \frac{30}{68} \times \frac{68}{68} \times \frac{70}{68} \times \frac{68}{200} = 0.15.$$

HIGH: $$P(x \mid ti)\,P(ti) = \frac{122}{132} \times \frac{70}{132} \times \frac{109}{132} \times \frac{132}{200} = 0.27.$$

The probability of the value set x occurring in the entire data set is as follows: $$P(x) = \frac{152}{200} \times \frac{138}{200} \times \frac{179}{200} = 0.47.$$

The probabilities of medium and high phosphorus status for the value set x are then as follows:

Medium: $$P(se \mid x) = \frac{0.15}{0.47} \approx 0.32;$$

High: $$P(ti \mid x) = \frac{0.27}{0.47} \approx 0.57.$$

So, P(ti|x) > P(se|x).

It can be concluded that the RGB value $x = \{R = 126;\ G = 111;\ B = 81\}$ corresponds to a high phosphorus status (Fosfor Tinggi).
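The comparison in this example can be reproduced with exact fractions. The sketch below uses the prior and likelihood fractions quoted in the example as given; the variable names are illustrative assumptions. Because P(x) is identical for both classes, comparing the products P(x|c)P(c) is enough to pick the class.

```python
from fractions import Fraction as F

# Priors and per-channel likelihoods as quoted in the worked example
# for x = {R = 126, G = 111, B = 81}.
priors = {"medium": F(68, 200), "high": F(132, 200)}
likelihoods = {
    "medium": [F(30, 68), F(68, 68), F(70, 68)],
    "high": [F(122, 132), F(70, 132), F(109, 132)],
}


def joint_score(status):
    """P(x|c) * P(c): the class prior times the product of channel likelihoods."""
    score = priors[status]
    for p in likelihoods[status]:
        score *= p
    return score


scores = {status: joint_score(status) for status in priors}
predicted = max(scores, key=scores.get)

for status, score in scores.items():
    print(f"{status}: {float(score):.4f}")
print("predicted status:", predicted)
```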

The data obtained from the average RGB values of the 20 samples of paddy fields in Figure 6 that have been tested and classified using the Naïve Bayes algorithm are shown in Table 5.

Table 5 shows the average RGB values of all soils that were tested and classified using the Naïve Bayes algorithm. The classification results yielded seven moderate phosphorus statuses and 13 high phosphorus statuses.

Table 5

Average RGB values of paddy fields.

Sample Average Red Average Green Average Blue Status
1 149.5 128.1 91.3 High
2 146.2 128.3 91.4 High
3 110.7 95.5 72.2 Medium
4 123.6 107.8 77.0 Medium
5 130.8 110.3 78.4 High
6 134.5 115.0 82.2 High
7 144.3 124.2 87.6 High
8 145.3 126.7 92.1 High
9 138.8 119.4 85.8 High
10 119.9 103.7 76.9 Medium
11 118.7 103.4 75.1 Medium
12 120.5 103.2 75.9 Medium
13 148.9 125.7 87.7 High
14 134.3 112.3 80.3 High
15 139.2 115.1 79.1 High
16 139.2 113.9 79.9 High
17 122.8 103.4 72.4 Medium
18 143.0 119.2 85.4 High
19 132.4 109.0 75.8 High
20 110.5 93.9 67.0 Medium

A graph of the average RGB values of the test samples can be obtained from the classification results, as shown in Figure 7 for the 20 soil samples. Soils with medium phosphorus values are present in the 3rd, 4th, 10th, 11th, 12th, 17th, and 20th samples. High phosphorus values are indicated by the RGB values of the 1st, 2nd, 5th, 6th, 7th, 8th, 9th, 13th, 14th, 15th, 16th, 18th, and 19th samples.

Figure 7

Chart Showing Average Value of Soil Phosphorus.

Data Transfer to the ceerduad.com Website

The sensor reading data are sent to the ceerduad.com web server using the ESP8266 WiFi module. The view of the ceerduad.com server is shown in Figure 8.

Figure 8

The first display of the ceerduad.com website.

After successfully logging into the ceerduad.com website, a graph of the measurements of the paddy soil sample, which were read by the TCS3200 sensor, will be displayed according to the data that have been sent by the phosphorus-measuring instrument. The graphic image of the RGB value data that were successfully sent to the website is shown in Figure 9.

Figure 9

Graph displayed on the website.

The graph in Figure 9 represents the sensor data sent from the phosphorus meter to the website based on the phosphorus meter reading. The graphic data in Figure 9 can be downloaded in Excel form. The graph shows the RGB values from the sampling at 20 rice fields in Lendah District. The soil samples were extracted with PUTS and tested using the phosphorus meter that has been developed.

ArcGIS Mapping Design

Mapping is carried out to present the phosphorus readings of the soil samples at their specific locations, using the steps shown in Figure 10.

Figure 10

ArcGIS mapping block diagram.

ArcGIS mapping was carried out by taking satellite imagery of the mapped location using Google Earth. The satellite images of the location are then used to create a shapefile (SHP) map. The shapefile data are given several attribute data scores as input and then enter the overlay stage, in which several spatial elements are merged into new spatial elements. The cartographic map layout is obtained in several steps, from the overlay process to the union stage, in which several spatial elements are merged into one spatial element without changing the combined spatial elements.

Paddy Soil Sampling Map

The sampling of paddy soil was carried out in the Lendah district, divided into 20 test locations, where each point represented one sample of paddy soil. The map of the paddy soil sampling distribution is shown in Figure 11.

Figure 11

A map of the paddy soil sampling locations.

The map of the sampling locations in Figure 11 was created using ArcGIS software. ArcGIS is software for processing geographic data, which can present, manipulate, and save geographic information. The main features of ArcGIS used in mapping include ArcMap and ArcCatalog. ArcMap is used for data management, including visualization, editing, and map-making, while ArcCatalog is used for creating vector data and raster data and for grouping data according to function.

Paddy Soil Phosphorus Status Map

The mapping of the paddy fields' phosphorus status was accomplished based on the soil sampling location and phosphorus status obtained during the testing of the paddy soil samples, as shown in Figure 12.

Figure 12

Paddy soil phosphorus status map based on the sampling location.

From the mapping of the paddy soil phosphorus status in Figure 12, it can be seen that, of the 20 paddy soil sampling locations, 13 are shown in dark blue and seven in light blue. Dark blue on the paddy soil phosphorus status map indicates a high phosphorus status, while light blue indicates a medium phosphorus status.
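As a rough sketch of how such status points could be prepared as vector data for a GIS layer, the Python snippet below builds a GeoJSON feature collection with the dark blue and light blue legend described above. The statuses of samples 1 to 3 come from Table 5, but the coordinates are hypothetical placeholders (the source does not list them), and the function and field names are assumptions for illustration. The study itself used ArcGIS shapefiles, not this code.

```python
import json

# Legend described in the text: dark blue = high, light blue = medium.
STATUS_COLORS = {"High": "darkblue", "Medium": "lightblue"}

# (sample id, longitude, latitude, status).
# Coordinates are HYPOTHETICAL placeholders, not the surveyed locations.
SAMPLES = [
    (1, 110.20, -7.90, "High"),
    (2, 110.21, -7.91, "High"),
    (3, 110.22, -7.92, "Medium"),
]


def to_feature(sample_id, lon, lat, status):
    """Build one GeoJSON point feature; GeoJSON orders coordinates as [lon, lat]."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "sample": sample_id,
            "status": status,
            "color": STATUS_COLORS[status],
        },
    }


collection = {
    "type": "FeatureCollection",
    "features": [to_feature(*row) for row in SAMPLES],
}
geojson_text = json.dumps(collection, indent=2)
print(geojson_text)
```

A file written from `geojson_text` can typically be loaded as a point layer in GIS software and styled by the `color` or `status` property.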

Geographical Map of Soil Phosphorus Levels in Lendah District

This geographical map contains information that researchers prepared to facilitate use by readers, such as rivers, roads, village boundaries, inland waters, plantation villages, and rice fields. This map aims to focus on the nutrient content in terms of soil phosphorus in the rice fields of Lendah district. The final result of mapping of phosphorus levels in the Lendah District is shown in Figure 13.

Figure 13

Map of Paddy Soil Phosphorus Levels in Lendah District.

The phosphorus content map in Figure 13 was made using raster and vector data. Raster data take the form of squares, or cells, like image data stored in JPG format with coordinates arranged on a square grid. Vector data are spatial data in the form of points, lines, and areas. On the map, the raster data form an image in JPG format, while the vector data represent roads, rivers, and village boundaries (lines) as well as villages and rice fields (areas).

Conclusion

Based on the results of testing with the developed tools, we can conclude that the paddy soil phosphorus-level measurement tool can measure the paddy soil phosphorus status with an error rate of 3% and a success rate of 97%. The RGB values of the 20 paddy soil samples in Lendah District, obtained from 200 readings of the phosphorus-measuring instrument, showed medium and high phosphorus statuses in the soil samples taken. The weakness of the Naïve Bayes algorithm in this study is that if the probability of one of the variables is zero, the final result will also be zero. Even a single zero-valued probability affects the whole result. To overcome this drawback, the Naïve Bayes method can be modified with the Laplacian correction, which avoids probability values of zero.
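A minimal sketch of the Laplacian correction mentioned above is given below, assuming the usual add-one form: one pseudo-count is added to every range, so no likelihood can be exactly zero. The function name and the `alpha` parameter are illustrative assumptions; the example counts (0 and 66 out of 68, across four ranges) are taken from Table 2.

```python
from fractions import Fraction


def smoothed_likelihood(count, class_total, n_ranges, alpha=1):
    """Laplace-corrected likelihood: (count + alpha) / (class_total + alpha * n_ranges)."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return Fraction(count + alpha, class_total + alpha * n_ranges)


# Red range 1-63 for the medium class has a count of 0 out of 68 in Table 2.
unsmoothed = Fraction(0, 68)
smoothed = smoothed_likelihood(count=0, class_total=68, n_ranges=4)
print(unsmoothed, smoothed)

# A well-populated range changes only slightly.
print(Fraction(66, 68), smoothed_likelihood(66, 68, 4))
```

With this correction, a range that was never observed for a class contributes a small positive factor instead of forcing the whole product in Eq. (1) to zero.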

eISSN:
1178-5608
Language:
English
Publication frequency:
Volume Open
Journal subjects:
Engineering, Introductions and Overviews, other