Open Access

Optimizing urine protein detection accuracy using the K-nearest neighbors algorithm and advanced image segmentation techniques

Jul 26, 2025


Figure 1:

Protein detection computer program. KNN, K-nearest neighbors.

Figure 2:

Evaluation of the KNN model. KNN, K-nearest neighbors.

Figure 3:

(A) Exterior view of the prototype, (B) internal components of the prototype, and (C) the assembled prototype, ready for use.

Figure 4:

Distribution of the 30 test samples.

Figure 5:

Confusion matrix at K = 3.

Figure 6:

Confusion matrix at K = 10.

Figure 7:

Confusion matrix at K = 20.

Comparison of research results

No. | Biomarker | Author and year | Color classification | Working principle | Ref.
1. | Albumin | Thakur (2021) | RGB, HSV, and Lab | RF algorithm that estimates albumin concentration using a smartphone | [32]
2. | Albumin | Thakur (2022) | RGB, HSV, and Lab | CNN algorithm that classifies color to detect albumin using a smartphone | [41]
3. | Albumin | Kim (2022) | RGB | RGB extraction with machine learning, using an iPhone 11 to detect color in urine | [42]
4. | Protein | This study (2023) | RGB | Protein detection using an ELP camera as the digital color sensor; image data are classified by RGB values and evaluated with the KNN algorithm |
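
The working principle in row 4 (classify an image by its RGB values with KNN) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function name `knn_predict`, the Euclidean distance metric, and every RGB value below are hypothetical and chosen only to make the example run.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify an (R, G, B) triple by majority vote among its k nearest
    training points, using Euclidean distance in RGB space."""
    # Sort training points by distance to the query and keep the k closest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Count the labels among those neighbors and return the most common one.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical mean-RGB values of a test-strip pad; NOT data from the study.
train = [
    ((230, 220, 120), "-"),
    ((228, 222, 125), "-"),
    ((200, 215, 130), "+-"),
    ((198, 212, 128), "+-"),
    ((160, 200, 140), "+"),
    ((158, 198, 145), "+"),
]

print(knn_predict(train, (199, 214, 129), k=3))
```

With these made-up points, two of the three nearest neighbors of the query carry the "+-" label, so the vote returns "+-". A real pipeline would first segment the strip pad and average its pixels to obtain the RGB triple.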

Evaluation of the KNN model

K value | Accuracy (%) | Precision (%) | Recall (%) | F1 score (%)
3 | 96.7 | 97.0 | 96.7 | 96.2
10 | 86.7 | 75.8 | 86.7 | 80.7
20 | 76.7 | 60.9 | 76.7 | 67.3
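
In every row of this table the recall equals the accuracy. That is what support-weighted averaging over classes produces, although the table alone does not confirm which averaging was used. The sketch below shows how the four metrics can be derived from a confusion matrix under that assumption. The function name `weighted_metrics` and the 3-class matrix are hypothetical; the matrix is not taken from Figures 5 to 7.

```python
def weighted_metrics(cm):
    """Return (accuracy, precision, recall, f1) with support-weighted averaging.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    precision = recall = f1 = 0.0
    for i in range(n):
        support = sum(cm[i])                        # true samples of class i
        predicted = sum(cm[r][i] for r in range(n))  # samples predicted as i
        p = cm[i][i] / predicted if predicted else 0.0
        r = cm[i][i] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


# Hypothetical confusion matrix: 30 samples, one misclassification.
cm = [
    [10, 0, 0],
    [1, 9, 0],
    [0, 0, 10],
]

acc, prec, rec, f1 = weighted_metrics(cm)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```

Because each class's recall is weighted by its own support, the weighted recall reduces to the number of correct predictions divided by the total, which is the accuracy. With 30 test samples, the reported accuracies of 96.7%, 86.7%, and 76.7% correspond to 29, 26, and 23 correct predictions.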

Preparation of sample solutions

No. | Protein (g) | Water (mL) | Output strip
1. | 0.00 | 20 | Negative (−)
2. | 1.00 | 20 | Plus-minus (+−)
3. | 3.00 | 20 | Positive 1 (+)
4. | 5.00 | 20 | Positive 2 (++)
5. | 7.30 | 20 | Positive 3 (+++)
6. | 11.60 | 20 | Positive 4 (++++)

Training and test data

Label | Amount of data | Protein content (g/L)
− | 6 | 0
+− | 24 | 0.15
+ | 10 | 0.3
++ | 22 | 1
+++ | 30 | 3
++++ | 7 | 20
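
The class sizes in this table are uneven, which matters for a voting classifier such as KNN. The sketch below tallies the share of each class and the largest fraction of the K votes a class could ever contribute. It assumes the first row (6 samples, 0 g/L) is the negative class, written here as "-", and it uses the combined training and test counts, so the limits for the training set alone would be tighter. The helper name `max_vote_share` is hypothetical.

```python
# Sample counts per class, taken from the table; "-" is the assumed label
# of the first row (0 g/L).
counts = {"-": 6, "+-": 24, "+": 10, "++": 22, "+++": 30, "++++": 7}
total = sum(counts.values())


def max_vote_share(n_samples, k):
    """Largest fraction of the k neighbor votes a class with n_samples can hold."""
    return min(n_samples, k) / k


for label, n in counts.items():
    shares = ", ".join(
        f"K={k}: {max_vote_share(n, k):.0%}" for k in (3, 10, 20)
    )
    print(f"{label:>4}: {n:2d} samples ({n / total:.1%} of {total}); max votes {shares}")
```

At K = 20, the classes with 6, 7, and 10 samples cannot fill more than half of the neighborhood even in the best case, whereas at K = 3 every class can supply all three neighbors. This is one plausible reading of why accuracy falls as K grows, not a conclusion stated by the source.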
Language: English
Publication timeframe: 1 time per year
Journal Subjects: Engineering, Introductions and Overviews, Engineering, other