Open Access

Research on tobacco composition analysis and ratio optimization strategy using data mining technology

Mar 19, 2025


Figure 1.

Platform architecture based on big data mining

Figure 2.

The result of clustering of tobacco physical traits in different habitats

Figure 3.

The main component load of each chemical component

Figure 4.

Major chemical analysis clustering

Chemical composition of tobacco

PC1 PC2 PC3 F
a -1.107 7.974 -6.422 0.883
b -1.644 7.71 -6.57 0.516
c -1.32 8.651 -6.281 1.02
d -1.069 8.09 -6.163 0.981
e -1.285 7.363 -6.007 0.682
f -1.52 8.417 -6.178 0.864
g -1.785 8.587 -6.298 0.771
h -2.161 9.382 -6.926 0.746
i -2.428 11.858 -6.711 1.434
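
The source does not state how the composite score F in this table is obtained. A minimal sketch, assuming F is the sum of the three principal component scores weighted by their variance percentages (46.319 %, 30.937 %, 16.644 %, taken from the percentage-of-variance row of the principal component factor load matrix), reproduces the tabulated F values to within about 0.015. The function name `composite_score` and the dictionary layout are illustrative choices, not part of the original work.

```python
# Assumed weights: variance percentages of PC1..PC3 expressed as fractions.
WEIGHTS = (0.46319, 0.30937, 0.16644)

# Principal component scores (PC1, PC2, PC3) copied from the table above.
PC_SCORES = {
    "a": (-1.107, 7.974, -6.422),
    "b": (-1.644, 7.71, -6.57),
    "c": (-1.32, 8.651, -6.281),
    "d": (-1.069, 8.09, -6.163),
    "e": (-1.285, 7.363, -6.007),
    "f": (-1.52, 8.417, -6.178),
    "g": (-1.785, 8.587, -6.298),
    "h": (-2.161, 9.382, -6.926),
    "i": (-2.428, 11.858, -6.711),
}

# F values as reported in the table, kept only for comparison.
REPORTED_F = {
    "a": 0.883, "b": 0.516, "c": 1.02, "d": 0.981, "e": 0.682,
    "f": 0.864, "g": 0.771, "h": 0.746, "i": 1.434,
}


def composite_score(pc_scores, weights=WEIGHTS):
    """Variance-weighted sum of principal component scores (assumed formula)."""
    return sum(w * s for w, s in zip(weights, pc_scores))


computed_f = {name: composite_score(s) for name, s in PC_SCORES.items()}

# Largest gap between the assumed formula and the reported F column.
max_gap = max(abs(computed_f[k] - REPORTED_F[k]) for k in PC_SCORES)

# Samples ordered from highest to lowest composite score.
ranking = sorted(computed_f, key=computed_f.get, reverse=True)

if __name__ == "__main__":
    for name in ranking:
        print(f"{name}: computed {computed_f[name]:.3f}, reported {REPORTED_F[name]}")
```

Under this assumption the ordering of the samples by computed score is the same as the ordering by the reported F column, with sample i highest and sample b lowest.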

Eigenvectors of the first five principal components

Index Principal component
1 2 3 4 5
Thickness 0.81 0.047 0.303 0.033 0.206
Blade density 0.851 0.003 0.116 0.224 0.24
Single leaf mass 0.769 −0.102 0.033 −0.096 −0.173
Balanced moisture content 0.154 0.319 −0.676 0.608 0.075
Pull 0.028 0.818 0.182 −0.214 0.322
Elongation −0.197 0.84 0.039 −0.065 −0.259
Fill value −0.3 0.085 0.674 0.62 −0.174
Stem ratio −0.732 −0.183 0.078 0.093 0.505

Principal component factor load matrix

Constituent Principal component factor load
1 2 3
Total nitrogen / % 0.922 0.079 −0.097
Chlorine / % 0.848 −0.092 −0.257
Total sugar / % 0.859 0.499 0.21
Reducing sugar / % 0.781 0.235 0.593
Potassium-chlorine ratio −0.761 0.378 0.497
Potassium / % −0.746 0.586 0.265
Nicotine / % 0.197 1.004 −0.312
Sugar-nicotine ratio 0.472 0.881 −0.094
Nitrogen-nicotine ratio −0.621 0.764 −0.116
Reducing-to-total sugar ratio 0.387 −0.169 0.943
Eigenvalue 4.624 3.1 1.69
Percentage of variance / % 46.319 30.937 16.644
Cumulative contribution rate / % 46.319 77.256 93.900

Test of the optimal tobacco raw-material formula

Serial Number Odor Tune Volume Of Aroma Aroma Quality Offensive Odor Aftertaste Irritating Total Score
1 4 19 8 7 19.5 15 16 86
2 4 17.5 9.5 7 18.5 15 16 85.5
3 4 16.5 10.5 7 17 15 16 82.5
Average 84.5

Standardized concentration

Sample Total Sugar Reducing Sugar Total Nitrogen Total Vegetable Base Total Vegetable Base Chloride Ion
Conventional Tobacco 9.25 7.84 1.54 1.15 2.54 0.88
Ours 7.88 7.65 0.33 0.15 0.75 0.36

Descriptive analysis of chemical constituents in different producing areas

Index Statistic a b c d e
Soluble sugar M ± SD 30.54±1.89 32.72±4.96 31.73±4.43 30.38±2.79 33.62±2.03
Soluble sugar CV 12.34 18.35 34.57 9.47 28.29
Reducing sugar M ± SD 26.65±4.86 28.34±4.03 26.74±3.87 26.3±0.92 30.64±4.77
Reducing sugar CV 17.6 21.18 17.14 11.54 11.97
Total nitrogen M ± SD 1.93±1.79 1.88±0.17 1.5±0.24 1.68±0.55 1.79±3.36
Total nitrogen CV 0.06 0.16 0.05 0.01 0.05
Nicotine M ± SD 1.8±1.47 2.51±1.33 2.81±3.38 2.74±3.36 2.04±3.34
Nicotine CV 0.14 0.38 0.22 0.05 0.16
Potassium M ± SD 1.92±2.2 2.36±2.85 2.4±0.17 2.29±3.45 2.79±3.31
Potassium CV 0.61 0.15 0.23 0.2 0.28
Chlorine M ± SD 0.37±4.76 0.4±4.81 0.22±2.35 0.23±0.6 0.61±4.66
Chlorine CV 0.04 0.04 0.01 0.07 0.05
Starch M ± SD 3.33±1.3 2.53±0.55 4.42±2.43 2.75±2.12 3.93±1.37
Starch CV 0.22 0.4 1.17 0.76 0.42
Sugar-nicotine ratio M ± SD 14.69±4.98 15.3±0.27 13.66±4.09 12.53±3.98 14.62±4.61
Sugar-nicotine ratio CV 24.04 33.89 21.59 6.33 17.09
Nitrogen-nicotine ratio M ± SD 1.33±4.98 1.15±1.45 0.68±3.79 1.07±1.01 1.07±0.78
Nitrogen-nicotine ratio CV 0.03 0.08 0.04 0.01 0.09
Potassium-chlorine ratio M ± SD 5.53±2.58 10.01±4.8 8.33±1.79 7.56±3.69 9.97±4.36
Potassium-chlorine ratio CV 3.26 23.08 50.78 14.56 13.19

Correlation coefficients between chemical indexes

Index 1 2 3 4 5 6 7 8 9 10
1. Total sugar 1
2. Reducing sugar 0.707 1
3. Total nitrogen −0.168 −0.312 1
4. Nicotine −0.499 −0.452 0.343 1
5. Potassium 0.367 −0.01 0.395 −0.292 1
6. Chlorine −0.157 −0.079 0.298 −0.073 0.13 1
7. Starch −0.101 −0.038 −0.242 0.11 −0.28 −0.054 1
8. Sugar-nicotine ratio 0.779 0.641 −0.272 −0.859 0.334 0.029 −0.128 1
9. Nitrogen-nicotine ratio 0.377 0.239 0.306 −0.789 0.533 0.213 −0.265 0.747 1
10. Potassium-chlorine ratio 0.273 −0.027 0.063 −0.135 0.418 −0.666 −0.088 0.16 0.142 1

The correlation coefficient matrix of each index of different natural properties

Index Thickness Blade density Single leaf mass Balanced moisture content Pull Elongation Fill value Stem ratio
Thickness 1 0.682** 0.494** −0.023 0.062 −0.054 −0.068 −0.441**
Blade density 0.685 1 0.524** 0.144* 0.031 −0.165* −0.102 −0.488**
Single leaf mass 0.506** 0.502** 1 0.032 −0.041 −0.169* −0.206* −0.461**
Balanced moisture content −0.024 0.166* 0.044 1 0.053 0.144 −0.123 −0.131
Pull 0.078 0.042 −0.042 0.051 1 0.453** 0.035 −0.085
Elongation −0.074 −0.161* −0.162* 0.134 0.463** 1 0.086 −0.021
Fill value −0.082 −0.116 −0.202* −0.116 0.025 0.098 1 0.194*
Stem ratio −0.439** −0.489** −0.449** −0.123 −0.113 −0.024 0.207* 1

The eigenvalues and contribution rate of each main component

Principal component Eigenvalue Contribution rate Cumulative contribution rate
1 2.608 32.60% 32.60%
2 1.572 19.65% 52.25%
3 1.175 14.69% 66.94%
4 0.894 11.18% 78.12%
5 0.579 7.24% 85.36%
6 0.416 5.20% 90.56%
7 0.41 5.13% 95.69%
8 0.346 4.31% 100%
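
The contribution rates in this table follow directly from the eigenvalues: each rate is the eigenvalue divided by the sum of all eigenvalues (8.000 here, one unit per standardized trait). The sketch below recomputes the rates and the cumulative rates, and picks the smallest number of components whose cumulative rate reaches a threshold. The 85 % threshold is an assumption chosen because it is a common convention and is consistent with the five components reported in the eigenvector table; the source does not state the selection rule it used. The helper names are illustrative.

```python
# Eigenvalues copied from the table above.
EIGENVALUES = [2.608, 1.572, 1.175, 0.894, 0.579, 0.416, 0.41, 0.346]


def contribution_rates(eigenvalues):
    """Return (rates, cumulative) in percent for a list of eigenvalues."""
    total = sum(eigenvalues)
    rates = [100.0 * ev / total for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for rate in rates:
        running += rate
        cumulative.append(running)
    return rates, cumulative


def components_needed(cumulative, threshold):
    """Smallest number of components whose cumulative rate reaches threshold."""
    for count, value in enumerate(cumulative, start=1):
        if value >= threshold:
            return count
    return len(cumulative)


rates, cumulative = contribution_rates(EIGENVALUES)
n_keep = components_needed(cumulative, threshold=85.0)  # assumed 85 % cutoff

if __name__ == "__main__":
    for idx, (rate, cum) in enumerate(zip(rates, cumulative), start=1):
        print(f"PC{idx}: {rate:.2f}% (cumulative {cum:.2f}%)")
    print("components kept:", n_keep)
```

With the assumed 85 % cutoff the function selects five components. The recomputed rates can differ from the tabulated ones in the second decimal place because the table accumulates already-rounded values.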

The harmful content of mainstream smoke

Sample Carbon monoxide Hydrocyanic acid Benzo[a]pyrene Crotonaldehyde Phenol Ammonia NNK Hazard index
Conventional Tobacco 12.2 95 4.21 18.65 3.08 3.9 7.05 6.65
Ours 10.5 16 5 2.1 3.27 3.2 9.45 5.21