Diabetes mellitus is one of the most prevalent diseases worldwide. According to the World Health Organization (WHO), as many as 415 million people were living with this disease in 2019 [1]. Early diagnosis and ongoing control are important for people with diabetes to stay healthy and to avoid complications and death [2].
Measuring glucose levels in urine rather than in blood avoids needle punctures, through which bacteria can enter the body [3]. Obesity triggers an increase in glucose due to the accumulation of fatty tissue in the body [4]. When the glucose level in the blood exceeds 180 mg/100 mL, the kidneys excrete the excess glucose, which is flushed out in the urine [5,6]. The glucose level in the urine can then be determined.
This study designed a prototype to detect glucose in urine using the AS7262 sensor as the main component for detecting the color of urine specimens. The specimens are grouped into five classes, namely, Normal, Positive 1, Positive 2, Positive 3, and Positive 4. In Benedict's test, the specimen changes color according to its glucose level [7]. The analysis is supported by machine learning (ML) with the K-nearest neighbor (KNN) algorithm.
This system is designed to detect the color of the specimen using the AS7262 sensor, based on the intensities of six colors: violet, orange, blue, green, yellow, and red. The supporting components are the DS18B20 sensor to measure the temperature, a heater to process the specimen, a fan as a coolant to bring the temperature down when the system overheats, a buzzer and a light-emitting diode (LED) as markers of each process, a printed circuit board (PCB) carrying the microcontroller, a liquid crystal display (LCD) to show the menu and the result, a button to control the system, and a switch to turn the system on or off. The block diagram in Figure 1 explains the overall concept of the prototype.
Figure 1 shows that this prototype uses digital communication lines, while the AS7262 sensor communicates over the Inter-Integrated Circuit (I2C) bus. The AS7262 and DS18B20 sensors are the inputs. The DS18B20 reading is used to control the heater temperature automatically, and the AS7262 readings are used as the dataset in the next process. The wiring diagram is shown in Figure 2.
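The automatic temperature regulation described above can be sketched as a simple on/off controller. This is an illustrative Python sketch, not the prototype's firmware: the function name `control_step`, the set point of 90 °C, and the 2 °C dead band are assumptions, since the text does not report these values.

```python
from dataclasses import dataclass


@dataclass
class Outputs:
    heater_on: bool
    fan_on: bool


def control_step(temp_c: float, setpoint_c: float = 90.0, band_c: float = 2.0) -> Outputs:
    """Decide heater and fan states from one temperature reading.

    setpoint_c and band_c are hypothetical values chosen for illustration.
    """
    if temp_c < setpoint_c - band_c:
        # Too cold: heat the specimen, keep the fan off.
        return Outputs(heater_on=True, fan_on=False)
    if temp_c > setpoint_c + band_c:
        # Overheated: stop heating and cool with the fan.
        return Outputs(heater_on=False, fan_on=True)
    # Inside the dead band: leave both actuators off.
    return Outputs(heater_on=False, fan_on=False)
```

The dead band keeps the heater and fan from switching rapidly when the reading hovers around the set point.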
Figure 2 shows that each component is connected to a pin on the Arduino Nano, so that the program can control every component through digital communication. The data collection flow is shown in Figure 3.
Figure 3 shows that after the initialization process, the temperature is set and the heating process is started. Heating produces a color in the specimen, which is detected by the AS7262 sensor; the resulting data are displayed on the LCD and sent to the serial monitor. These data form the dataset to be processed in the next step.
The data generated by the sensor are processed with ML using the KNN algorithm. The ML data flow is shown in Figure 4.
ML is used to extract relevant information from data [8]. The dataset contains the AS7262 sensor readings, converted into comma-separated values (CSV) form so that it can be read by a Python program. The data are classified to determine the glucose level based on the color of the specimen. After the data are classified, the algorithm makes predictions, and the accuracy of the classification is determined.
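The conversion of sensor readings into CSV form can be sketched as follows. The format of the serial output is not given in the text, so this sketch assumes one reading per line with six comma-separated intensities; the channel order in `CHANNELS` and the function names are likewise assumptions made for illustration.

```python
import csv
import io

# Assumed channel order (by increasing wavelength); not confirmed by the source.
CHANNELS = ["violet", "blue", "green", "yellow", "orange", "red"]


def parse_reading(line: str) -> list[float]:
    """Parse one logged line of six comma-separated intensities."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) != len(CHANNELS):
        raise ValueError(f"expected {len(CHANNELS)} values, got {len(parts)}")
    return [float(p) for p in parts]


def readings_to_csv(lines, label: str) -> str:
    """Build CSV text with a header row and one labeled row per reading."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(CHANNELS + ["label"])
    for line in lines:
        if not line.strip():
            continue  # skip blank lines in the log
        writer.writerow(parse_reading(line) + [label])
    return buf.getvalue()
```

Each specimen's log would be converted with its own class label, and the resulting files concatenated into one dataset.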
The specimens in this study were mixtures of urine processed by Benedict's method to produce a colored precipitate, which was graded according to the total glucose level. The Benedict's test procedure follows a previous study (Pratiwi et al. 2020) and was applied to five urine samples with different glucose levels. The previous study used photodiode sensors to detect colors, whereas the current study uses the AS7262 sensor [9]. Benedict's test sample was prepared by using 10 mL of Benedict's solution and 20 mL of urine specimen (containing 0.4 mL, which is equivalent to eight drops). This test detects the glucose content in the urine. The concentration is calculated by using Eq. (1).
Here,
The working principle of KNN is to make predictions based on the proximity of the object's characteristics to those of the neighborhood training data closest to the object [10, 11]. This algorithm works by determining the input in the form of training data, testing data, and the value of
The advantage of this method is that data are classified based on the nearest neighbor class. The value of “
Figure 5 shows the data, which consist of two classes, namely, the red class and the green class. The test data are indicated by a blue arrow, with a value of
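The nearest-neighbor principle can be illustrated with a short Python sketch. It assumes Euclidean distance and majority voting, which are common choices for KNN but are not confirmed by the surviving text; the function name `knn_predict` is hypothetical. The five training rows are the values reported in Table 2, used here only as a toy training set.

```python
import math
from collections import Counter

# Rows from Table 2: six AS7262 channel intensities, then the class label.
TRAIN = [
    ((21.34, 23.36, 18.68, 19.64, 14.02, 6.66), "Normal"),
    ((14.93, 13.53, 24.27, 21.48, 19.23, 6.53), "Positive 1"),
    ((12.19, 10.07, 22.63, 24.14, 21.87, 9.05), "Positive 2"),
    ((11.12, 8.21, 16.97, 21.91, 26.02, 15.74), "Positive 3"),
    ((13.68, 9.93, 22.58, 20.77, 22.58, 10.43), "Positive 4"),
]


def knn_predict(train, query, k=1):
    """Return the majority label among the k training points nearest to query."""
    # Sort training points by Euclidean distance to the query (an assumption).
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]
```

With only one row per class, k = 1 is the only meaningful setting in this toy example; on the full dataset, larger values of k let several neighbors vote.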
Precision is the fraction of samples predicted as positive that are truly positive [20]. The precision value is calculated by using Eq. (4).
$$\text{Precision} = \frac{TP}{TP + FP} \tag{4}$$
Here, TP indicates true positive, and FP indicates false positive.
Recall is the fraction of truly positive samples that are predicted as positive [20]. The recall value is calculated by using Eq. (5).
$$\text{Recall} = \frac{TP}{TP + FN} \tag{5}$$
Here, FN indicates false negative.
Description of the variables:
TP = true positive; TN = true negative; FP = false positive; FN = false negative.
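For a given class, TP counts the samples of that class that are correctly assigned to it, TN counts the samples of other classes that are correctly not assigned to it, FP counts the samples of other classes that are wrongly assigned to it, and FN counts the samples of that class that are wrongly assigned to another class.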
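The metrics built from these four counts can be computed as in the sketch below. The precision and recall functions follow the standard definitions; the accuracy function uses the usual formula (TP + TN) / (TP + TN + FP + FN), which is an assumption here because the corresponding equation does not appear in the text. The counts in the usage note are made up for illustration and are not results of this study.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of positive predictions that are correct."""
    total = tp + fp
    return tp / total if total else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of actual positives that are found."""
    total = tp + fn
    return tp / total if total else 0.0


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all predictions that are correct (assumed standard formula)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0
```

For example, with the hypothetical counts TP = 90, FP = 10, and FN = 30, precision is 90/100 and recall is 90/120.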
Benedict's test on urine samples containing glucose produced five specimen colors, classified as Normal, Positive 1, Positive 2, Positive 3, and Positive 4. The classes were characterized by grade, color yield, and glucose level. The results of the characterization are given in Table 1.
Table 1. Glucose characterization
| 1 | Sample 1 | 0–0.5 | 0 | Slightly greenish blue and a bit cloudy | Normal |
| 2 | Sample 2 | 0.5–1 | 0.2 | Yellowish green | Positive 1 |
| 3 | Sample 3 | 1–1.5 | 0.3 | Greenish yellow | Positive 2 |
| 4 | Sample 4 | 2–3.5 | 0.4 | Slightly brownish orange | Positive 3 |
| 5 | Sample 5 | >3.5 | 1 | Slightly brownish brick red | Positive 4 |
Based on this classification, Table 1 shows that the urine specimens in each class have different characteristics. The differences are caused by the glucose level in the urine. The glucose concentration is calculated using Eq. (1).
Table 2. Results of the data from the sensor
| 21.34 | 23.36 | 18.68 | 19.64 | 14.02 | 6.66 | Normal |
| 14.93 | 13.53 | 24.27 | 21.48 | 19.23 | 6.53 | Positive 1 |
| 12.19 | 10.07 | 22.63 | 24.14 | 21.87 | 9.05 | Positive 2 |
| 11.12 | 8.21 | 16.97 | 21.91 | 26.02 | 15.74 | Positive 3 |
| 13.68 | 9.93 | 22.58 | 20.77 | 22.58 | 10.43 | Positive 4 |
Data collection is carried out with a prototype designed to make it easier for users to collect specimen data regardless of time and place. The components were chosen to support this function. The prototype is shown in Figure 6.
Figure 6(a) shows the output for each input as displayed on the LCD. Figure 6(b) shows the layout of the components used in the data collection process. Figure 6(c) shows the main component, the AS7262 sensor.
Data testing is carried out to assess how accurately the sensor identifies the color intensity of the specimen across the six color channels. Testing the prototype produced 1,200 data points for each specimen, for a total of 6,000 data points across the five specimens. A sample of the results is shown in Table 2.
Table 2 shows some of the data obtained from the AS7262 sensor. The data are quite stable: the sensor produces color intensities that match the color of each specimen. The data are displayed in graphical form in Figure 7.
Classification is the process of analyzing the same model on a set and classifying it into different classes [22,23]. The data are classified into five classes based on the glucose levels, with a
Figure 8 shows that the higher the curve, the greater the number of misclassified data. The graph moves up and down because of the error data for the value of
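An error curve of this kind is commonly obtained by repeating the classification for several settings and recording the share of misclassified test samples each time. The sketch below assumes that the quantity being varied is the number of neighbors, k, and that distance is Euclidean; neither is confirmed by the surviving text. The function names and the small two-dimensional data set are made up for illustration and are not data from this study.

```python
import math
from collections import Counter

# Made-up toy data: class "A" near the origin, class "B" near (5, 5),
# plus one noisy "B" point placed inside the "A" cluster.
TRAIN = [
    ((0.0, 0.0), "A"), ((0.0, 1.0), "A"), ((1.0, 0.0), "A"),
    ((0.4, 0.4), "B"),
    ((5.0, 5.0), "B"), ((5.0, 6.0), "B"), ((6.0, 5.0), "B"),
]
TEST = [((0.5, 0.5), "A"), ((5.5, 5.5), "B")]


def knn_predict(train, query, k):
    """Majority label among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def error_rate(train, test, k):
    """Share of test samples that are misclassified for a given k."""
    wrong = sum(knn_predict(train, x, k) != y for x, y in test)
    return wrong / len(test)


def sweep_k(train, test, k_values):
    """Error rate for each candidate k."""
    return {k: error_rate(train, test, k) for k in k_values}


def best_k(errors):
    """k with the lowest error; ties go to the smaller k."""
    return min(errors, key=lambda k: (errors[k], k))
```

In this toy case, k = 1 is misled by the noisy point while k = 3 outvotes it, which is the kind of up-and-down behavior an error curve makes visible.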
Figure 9 shows the results of the predicted values generated from the test data, training data, and the value of
The confusion matrix compares the classification results produced by the system with the actual classes [25].
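A confusion matrix for the five classes can be built from paired actual and predicted labels, as in the sketch below. The function names are hypothetical, and the layout (rows for actual classes, columns for predicted classes) is a common convention assumed here rather than one stated in the text.

```python
from collections import Counter

CLASSES = ["Normal", "Positive 1", "Positive 2", "Positive 3", "Positive 4"]


def confusion_matrix(actual, predicted, classes=CLASSES):
    """Rows are actual classes, columns are predicted classes."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in classes] for a in classes]


def per_class_counts(matrix, index):
    """Return (TP, FP, FN, TN) for the class at position index."""
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp                 # rest of the class's row
    fp = sum(row[index] for row in matrix) - tp  # rest of the class's column
    tn = sum(map(sum, matrix)) - tp - fn - fp    # everything else
    return tp, fp, fn, tn
```

The diagonal of the matrix holds the correctly classified samples, so overall accuracy is the diagonal sum divided by the total number of samples.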
The KNN algorithm is effective in classifying the glucose class. In this study, each glucose level was determined by color matching based on the intensities of violet, blue, green, orange, yellow, and red. Comparing the classification produced by the system with the actual classes gave a high accuracy of 96.33%.
This system is designed to be used by many people, not only those with diabetes, because the examination causes no physical harm and no part of the device comes into direct contact with the body. In the future, we aim to create a noninvasive system for checking blood sugar with smaller and lighter hardware so that it is easier to carry anywhere.