Open access

Application of Chinese medicine evidence classification algorithm in the identification and treatment of Parkinson’s disease

  
05 Jun 2025

Introduction

Chinese medicine is the traditional medicine of China, with a history of thousands of years grounded in the theory and practical experience of the Han Chinese people in ancient China. It is a comprehensive science whose goal is to study the laws governing the transition between human health and disease, covering health care, rehabilitation, diagnosis, and treatment [1]. Within it, “evidence” (also rendered as “syndrome” in this paper) is a specialized term: a general name for a series of interrelated symptoms obtained through the four diagnostic methods of inspection, listening and smelling, inquiry, and palpation [2-4]. There are six major categories, namely the eight principles, qi and blood, the six meridians, Wei qi and blood, San Jiao, and viscera, as well as several subcategories [5-8]. In the clinical application of traditional Chinese medicine (TCM) syndrome classification, it is important to diagnose patients with different disease conditions accurately, i.e., to perform comprehensive syndrome classification.

Classification algorithms are an important technique in machine learning whose main goal is to assign the samples in a dataset to predefined categories [9]. In real-world applications they are widely used for text categorization, image categorization, spam filtering, recommender systems, and so on [10-13]. The essence of a classification algorithm is to construct a classifier model from training samples and then use the model to classify new, unknown samples. According to their feature representations and classification ideas, classification algorithms can be divided into several families, such as decision trees, naive Bayes, K-nearest neighbors, logistic regression, support vector machines, neural networks, and ensemble learning methods such as random forests and gradient boosting trees [14-17]. TCM symptoms are complex and varied, and insufficient expertise or complicated patient presentations can lead to errors of judgment and, in turn, to medical errors. Classification algorithms can therefore assist medical practitioners in the classification process.

Bayesian classification is an important classification algorithm in data mining, and using it to identify the TCM evidence of Parkinson’s disease is of practical significance for the identification and treatment of the disease. Based on the Bayesian principle, this paper carries out the correlation analysis of the TCM evidence classification algorithm, captures the key issues of evidence identification, and introduces the topic model framework into the algorithm to improve its efficiency. Finally, the ROC curve of the improved TCM evidence classification model is compared with those of the DenseNet121 and DAMNet models to evaluate its performance. This paper offers a useful point of comparison for TCM evidence classification algorithms in the identification and treatment of Parkinson’s disease, and it can also serve as preliminary work and a foundation for the subsequent clinical use of such algorithms.

Overview

Parkinson’s disease is a neurodegenerative disease that is often characterized by slowed movement and stiffness [18]. Chinese medicine, as a traditional Chinese medical art, has begun to be applied in its treatment, for example through Chinese herbs [19] and acupuncture [20]. TCM evidence differentiation analyzes the patient’s systemic symptoms and carefully determines the disease in order to personalize the treatment [21]. According to the classification of TCM syndromes, Parkinson’s disease belongs to the types of qi and blood deficiency, liver and kidney yin deficiency, wind and yang internal movement, qi deficiency and blood stasis, and phlegm-heat and wind movement [22]. Moreover, these syndromes present different symptoms in the early, middle and late stages. For this reason, the literature [23] combined TCM symptoms stage by stage, which was the first clinical treatment combining TCM symptoms and Parkinson’s disease. Similarly, literature [24] used the classification of TCM syndromes for pattern recognition of Parkinson’s disease and analyzed the nature and location of the disease using hierarchical clustering, finding that these analyses provided evidence for its pathogenesis. Literature [25] also reported that hierarchical clustering can accurately distinguish the TCM evidence types of Parkinson’s disease. Literature [26] used a convolutional neural network to design an intelligent syndrome differentiation model that contains all the classes of TCM syndromes and is able to distinguish conditions that it does not cover. Literature [27] used a graph convolutional network with a residual structure to combine patient state elements with symptoms in order to classify the evidence, and it outperformed algorithms such as support vector machines, random forests and convolutional neural networks. Given the complexity of Parkinson’s disease, such algorithms can be combined.

Bayesian approach
Knowledge about Bayesian theory
Conditional probability and the multiplication theorem

Conditional probability is an important concept in probability theory. Suppose $X$ and $Y$ are two events. The probability that event $Y$ occurs given that event $X$ has occurred is called the conditional probability of $Y$ given $X$ (also known as the posterior probability), denoted $P(Y|X)$. In contrast, $P(X)$ is known as the a priori probability. Conditional probability is defined as follows:

Definition 1: Let $X$ and $Y$ be two events with $P(X)>0$. Then

$$P(Y|X)=\frac{P(XY)}{P(X)} \tag{1}$$

is the conditional probability of event $Y$ given that event $X$ occurs.

From Definition 1 of conditional probability, the multiplication theorem can be introduced.

Theorem 1 (Multiplication Theorem): Given any two events $X$, $Y$ with $P(X)>0$ and $P(Y)>0$,

$$P(XY)=P(Y|X)P(X) \tag{2}$$

The multiplication theorem applies not only to two events but also to multiple events. Assuming that $X_1,X_2,\ldots,X_n$ are $n$ events, $n\geq 2$, and $P(X_1X_2\cdots X_{n-1})>0$, then

$$P(X_1X_2\cdots X_n)=P(X_n|X_1X_2\cdots X_{n-1})\,P(X_{n-1}|X_1X_2\cdots X_{n-2})\cdots P(X_2|X_1)\,P(X_1) \tag{3}$$

Full probability formulas and Bayes’ theorem

Theorem 2: Let the sample space of experiment $E$ be $S$, let $X$ be an event of $E$, and let $Y_1,Y_2,\ldots,Y_n$ be a partition of $S$ with $P(Y_i)>0$, $i=1,2,\ldots,n$. Then for any event $X$,

$$P(X)=P(X|Y_1)P(Y_1)+P(X|Y_2)P(Y_2)+\cdots+P(X|Y_n)P(Y_n) \tag{4}$$

Equation (4) is called the full probability formula.

With the conditional probability formula and the full probability formula in place, the next step is to introduce Bayes’ theorem.

Theorem 3 (Bayes’ Theorem): Let the sample space of experiment $E$ be $S$, let $X$ be an event of $E$, and let $Y_1,Y_2,\ldots,Y_n$ be a partition of $S$ with $P(X)>0$ and $P(Y_i)>0$ $(i=1,2,\ldots,n)$. Then

$$P(Y_i|X)=\frac{P(X|Y_i)P(Y_i)}{\sum_{j=1}^{n}P(X|Y_j)P(Y_j)},\quad i=1,2,\ldots,n \tag{5}$$

Equation (5) is called the Bayesian formula, i.e., Bayes’ theorem.
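As a small numerical check of Eqs. (4) and (5), the following sketch computes the full probability and the posteriors for a partition of three events. The priors and likelihoods are hypothetical numbers chosen only for illustration, and the function names are not from the paper.

```python
from fractions import Fraction


def total_probability(priors, likelihoods):
    """P(X) = sum_i P(X|Y_i) P(Y_i) over a partition Y_1..Y_n (Eq. 4)."""
    return sum(l * p for p, l in zip(priors, likelihoods))


def bayes_posteriors(priors, likelihoods):
    """P(Y_i|X) for every i, following Eq. (5)."""
    evidence = total_probability(priors, likelihoods)
    return [l * p / evidence for p, l in zip(priors, likelihoods)]


# Hypothetical values: P(Y_1..Y_3) and P(X|Y_1..Y_3)
priors = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
likelihoods = [Fraction(1, 10), Fraction(1, 2), Fraction(4, 5)]

evidence = total_probability(priors, likelihoods)
posteriors = bayes_posteriors(priors, likelihoods)
print(evidence, posteriors)
```

Exact fractions are used so that the posteriors can be seen to sum to 1, as they must for a partition.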

Maximum a posteriori and maximum likelihood hypotheses

Let $P(h)$ be the a priori probability of hypothesis $h$, i.e., the initial probability of $h$ before any training data are observed; if no such prior knowledge is available, each hypothesis can be assigned the same a priori probability. Let $P(D)$ be the a priori probability of the sample data $D$, i.e., the probability of $D$ when it is not known which hypothesis holds. Let $P(D|h)$ denote the probability of $D$ given that hypothesis $h$ holds. The probability of $h$ given the training data $D$, which is the posterior probability of $h$, is calculated from Bayes’ formula as follows:

$$P(h|D)=\frac{P(D|h)P(h)}{P(D)} \tag{6}$$

Let $H$ be the set of candidate hypotheses, i.e., $h\in H$. The maximum a posteriori (MAP) hypothesis is the hypothesis $h$ that is most probable given the set of instances $D$ observed by the learner. It is abbreviated as $h_{MAP}$ and formulated as:

$$h_{MAP}=\arg\max_{h\in H}P(h|D) \tag{7}$$

Using Bayes’ theorem, this becomes:

$$h_{MAP}=\arg\max_{h\in H}\frac{P(D|h)P(h)}{P(D)}=\arg\max_{h\in H}P(D|h)P(h) \tag{8}$$

In Equation (8), $P(D)$ is dropped because it does not depend on hypothesis $h$; it acts only as a normalization factor. Applying Eq. (8) gives the most basic form of classification: the most likely category of a new instance is determined from the MAP hypothesis. Research on Bayesian classification models is therefore built on the MAP hypothesis.

In some special cases Eq. (8) can be simplified further. When the prior probabilities of all hypotheses in $H$ are equal (i.e., $P(h_i)=P(h_j)$ for any $h_i,h_j\in H$), $P(h)$ can be dropped and only $P(D|h)$ needs to be considered. $P(D|h)$ is referred to as the likelihood of the data $D$ given the hypothesis $h$, and the hypothesis that maximizes it is called the maximum likelihood (ML) hypothesis, abbreviated as $h_{ML}$:

$$h_{ML}=\arg\max_{h\in H}P(D|h) \tag{9}$$

In practical data mining, the data $D$ are the training samples of a particular objective function, while $H$ is the space of candidate objective functions.
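The difference between Eqs. (8) and (9) can be illustrated with a minimal sketch. The hypothesis names and probabilities below are hypothetical; the point is only that the MAP and ML choices differ when the priors are unequal and coincide when they are uniform.

```python
def map_hypothesis(priors, likelihoods):
    """h_MAP = argmax_h P(D|h) P(h); P(D) is omitted because it is the same for all h."""
    return max(priors, key=lambda h: likelihoods[h] * priors[h])


def ml_hypothesis(likelihoods):
    """h_ML = argmax_h P(D|h)."""
    return max(likelihoods, key=likelihoods.get)


# Hypothetical priors P(h) and likelihoods P(D|h) for three candidate hypotheses
priors = {"h1": 0.7, "h2": 0.2, "h3": 0.1}
likelihoods = {"h1": 0.2, "h2": 0.5, "h3": 0.9}

h_map = map_hypothesis(priors, likelihoods)   # prior-weighted choice
h_ml = ml_hypothesis(likelihoods)             # likelihood-only choice

# With equal priors the MAP hypothesis reduces to the ML hypothesis
uniform = {h: 1 / 3 for h in priors}
h_map_uniform = map_hypothesis(uniform, likelihoods)
```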

Improved naive Bayes classification algorithm

The Averaged One-Dependence Estimators (AODE) algorithm is an improved naive Bayes classifier (NBC) algorithm with better classification performance. Based on the data in the training set, it uses statistics to select parent attribute values from the attribute values of the test instance, thereby dividing the attributes into parent attributes and child attributes. It then assumes that the attribute values of an instance are independent of each other given the class label and the value of the parent attribute, which greatly relaxes the conditional independence assumption of the NBC algorithm and improves classification performance. The resulting structure of the AODE algorithm is shown in Figure 1.

Figure 1.

AODE algorithm structure diagram

In summary, the AODE algorithm uses the idea of restricted one-dependence estimation and aggregates the results of multiple classifiers to improve classification accuracy and reduce the complexity of classifier construction. The basic principles of the AODE algorithm are described in detail below.

Given a test instance $X=(a_1,\ldots,a_n)$ with $X\in D$, and since the attribute value $a_j\in X$, the following formula is obtained:

$$P(c_k,X)=P(c_k,a_j,X)=P(c_k,a_j)\,P(X|c_k,a_j) \tag{10}$$

It follows that:

$$P(c_k|X)=\frac{P(c_k,a_j)\,P(X|c_k,a_j)}{P(X)}\propto P(c_k,a_j)\,P(X|c_k,a_j) \tag{11}$$

The AODE algorithm treats $a_j$ as the value of the parent attribute of the test instance. Based on Eq. (11) and the assumptions of the AODE algorithm, we obtain:

$$P(X|c_k,a_j)\approx\prod_{i=1}^{n}P(a_i|c_k,a_j) \tag{12}$$

In Eq. (12), $a_i,a_j\in X$. Combining the above, the classifier constructed by the AODE algorithm for a test instance $X$ is:

$$C(X)=\arg\max_{c_k\in C}\sum_{j:\,1\leq j\leq n,\;F(a_j)\geq g}P(c_k,a_j)\prod_{i=1}^{n}P(a_i|c_k,a_j) \tag{13}$$

In Eq. (13), $F(a_j)$ denotes the number of training instances in which attribute $A_j$ has the value $a_j$, and $g$ is a frequency threshold (typically 30).

A computational advantage of AODE over the TAN algorithm or the SP-TAN algorithm is that it can be used directly for incremental learning. Updating an AODE classifier with a new training instance only requires incrementing the relevant entries in the frequency tables of joint attribute values and class labels.
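The classifier of Eq. (13) and the incremental update described above can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the class name `AODESketch`, the Laplace smoothing of the probability estimates, and the toy data are all assumptions, and the threshold is lowered to g = 1 only because the toy set is tiny.

```python
from collections import Counter


class AODESketch:
    """Illustrative AODE over discrete attributes (Eq. 13), not the authors' code."""

    def __init__(self, classes, g=30):
        self.classes = list(classes)
        self.g = g                      # frequency threshold g from Eq. (13)
        self.n = 0                      # number of training instances seen
        self.value_count = Counter()    # (j, a_j) -> F(a_j)
        self.parent_count = Counter()   # (c, j, a_j) -> count of class c with A_j = a_j
        self.pair_count = Counter()     # (c, j, a_j, i, a_i) -> joint count
        self.values = {}                # j -> set of values seen for attribute A_j

    def update(self, x, c):
        """Incremental learning: a new instance only increments frequency tables."""
        self.n += 1
        for j, aj in enumerate(x):
            self.values.setdefault(j, set()).add(aj)
            self.value_count[(j, aj)] += 1
            self.parent_count[(c, j, aj)] += 1
            for i, ai in enumerate(x):
                if i != j:
                    self.pair_count[(c, j, aj, i, ai)] += 1

    def score(self, x, c):
        """Sum over qualifying parents a_j of P(c, a_j) * prod_i P(a_i | c, a_j)."""
        total = 0.0
        for j, aj in enumerate(x):
            if self.value_count[(j, aj)] < self.g:
                continue  # parent value too rare to be used
            n_cj = self.parent_count[(c, j, aj)]
            v_j = len(self.values.get(j, ())) or 1
            # Laplace-smoothed estimate of P(c, a_j); the smoothing is an assumption
            p = (n_cj + 1) / (self.n + len(self.classes) * v_j)
            for i, ai in enumerate(x):
                if i == j:
                    continue  # P(a_j | c, a_j) = 1, so this factor is skipped
                v_i = len(self.values.get(i, ())) or 1
                p *= (self.pair_count[(c, j, aj, i, ai)] + 1) / (n_cj + v_i)
            total += p
        return total

    def predict(self, x):
        # If no parent reaches g, all scores are 0 and the first class is returned;
        # a fuller version would fall back to naive Bayes in that case.
        return max(self.classes, key=lambda c: self.score(x, c))


# Hypothetical toy data: two binary attributes, two classes, g lowered to 1
model = AODESketch(classes=["A", "B"], g=1)
training = [((1, 1), "A")] * 3 + [((0, 0), "B")] * 3 + [((1, 0), "A")]
for x, c in training:
    model.update(x, c)
```

Because `update` touches only counters, adding one more case later does not require retraining on the earlier cases.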

Algorithm for categorizing Chinese medicine symptoms
Classification process of TCM evidence classification algorithm based on Bayesian approach

Figure 2 demonstrates the classification process of the TCM evidence classification algorithm based on the Bayesian approach:

Figure 2.

Classification process of TCM syndrome classification algorithm

The process goes through four stages: data preparation, feature extraction, classification training and classification testing. These are explained below in the context of the differential diagnosis of Parkinson’s symptoms in TCM.

Step-by-step explanation of the TCM evidence classification algorithm
Data preparation

Data are the basis of data mining, and they are also crucial to the differential diagnosis of Parkinson’s evidence. In this project, 200 clinical Parkinson’s cases were collected from one of the top 100 hospitals in the country, spanning the years 1999 to 2008, and 60 Parkinson’s TCM cases were extracted from various TCM books. Because the cases differ in period (ancient and modern) and author (many TCM schools) and follow no common standard, the collected cases contain much noisy, redundant, sparse, and incomplete data. We therefore first excluded cases with incomplete records or serious irregularities, and then excluded cases whose main evidence was not Parkinson’s, leaving 215 valid cases. The cases were then standardized and described uniformly; 190 were selected as the training set and the remaining 25 as the test set, with both sets covering all five types of evidence.

Feature term extraction

Chinese medicine symptoms can be divided into three main categories according to their degree of influence on the disease: the characteristic symptom system, the primary symptom system, and the secondary symptom system. In the TCM symptom classification system, however, these differences can be ignored and all symptoms treated uniformly. A symptom may take several values on a single concept; fever, for example, may be low, moderate, or high. When constructing the feature vector space, low fever, moderate fever, and high fever are taken as three independent components rather than as three values of a single component. The advantage is that the category and degree of each symptom do not have to be determined, which matters because TCM has no unified standard in this regard and each school holds its own opinion. Ignoring the concept of degree and treating the grades as separate dimensions loses some precision, but it helps with the rapid judgment of TCM evidence.
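The treatment of graded symptoms as independent components can be sketched as follows. The three cases and their symptom terms are hypothetical; each distinct term, including each grade of fever, simply becomes its own 0/1 dimension.

```python
def build_vocabulary(cases):
    """Every distinct symptom term, including each grade, becomes one component."""
    return sorted({term for case in cases for term in case})


def encode(case, vocabulary):
    """Binary vector: 1 if the term was recorded for the case, else 0."""
    present = set(case)
    return [1 if term in present else 0 for term in vocabulary]


# Hypothetical medical cases described by symptom terms
cases = [
    ["low fever", "tremor"],
    ["high fever", "rigidity"],
    ["moderate fever", "tremor", "rigidity"],
]

vocabulary = build_vocabulary(cases)
vectors = [encode(case, vocabulary) for case in cases]
```

Note that the ordering low < moderate < high is not represented in this encoding, which is the loss of precision mentioned above.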

A short example

The conventions of this paper:

(1) Each sample of an unknown category is represented by an $n$-dimensional vector $X=(h_1,h_2,\ldots,h_n)$ describing the sample’s $n$ measurements on the $n$ symptomatic features $H_1,H_2,\ldots,H_n$ respectively, where $h_i=1$ if symptom $H_i$ is present in the sample and $h_i=0$ if it is not;

(2) The categories are represented by the variable $C$, which takes the value $c$. For $m$ categories, the domain of $c$ is $\{1,2,\ldots,m\}$. If $c=i$, the sample belongs to category $C_i$, and $P(C=i)$ is usually abbreviated to $P(C_i)$;

(3) A sample of a known category is denoted by an $(n+1)$-dimensional feature vector $S=(h_1,h_2,\ldots,h_n,c)$, which extends the vector $X$ of a sample of unknown category; the meanings of $h_i$ and $c$ are the same as in (1) and (2);

(4) A sample set consisting of samples of known categories is called a training sample set. A training sample set $T$ of size $s$ is denoted $T=\{S_1,S_2,\ldots,S_s\}$, where the meaning of $S_i$ is the same as defined in (3).
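Under these conventions, a basic naive Bayes classifier applying the MAP rule to binary symptom vectors can be sketched as below. The training set `T`, the function names, and the use of Laplace smoothing are assumptions made for illustration; this is the plain NBC baseline, not the improved algorithm.

```python
import math


def train_nb(T, m):
    """T is a list of samples S = (h_1, ..., h_n, c) with c in 1..m."""
    n = len(T[0]) - 1
    class_count = [0] * (m + 1)                  # index 0 is unused
    present = [[0] * n for _ in range(m + 1)]    # present[c][j]: samples of C_c with h_j = 1
    for S in T:
        *h, c = S
        class_count[c] += 1
        for j, hj in enumerate(h):
            present[c][j] += hj
    return class_count, present, len(T)


def classify_nb(X, model):
    """Return argmax_c P(C_c) * prod_j P(h_j | C_c), computed in log space."""
    class_count, present, s = model
    m = len(class_count) - 1
    best, best_score = None, -math.inf
    for c in range(1, m + 1):
        score = math.log((class_count[c] + 1) / (s + m))      # smoothed P(C_c)
        for j, hj in enumerate(X):
            p1 = (present[c][j] + 1) / (class_count[c] + 2)   # smoothed P(h_j = 1 | C_c)
            score += math.log(p1 if hj == 1 else 1 - p1)
        if score > best_score:
            best, best_score = c, score
    return best


# Hypothetical training set: n = 3 symptoms, m = 2 categories
T = [(1, 1, 0, 1), (1, 0, 0, 1), (0, 0, 1, 2), (0, 1, 1, 2)]
model = train_nb(T, m=2)
```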

Application of Chinese medicine evidence classification algorithm in Parkinson’s disease diagnosis and treatment
Introduction to Parkinson’s Disease

Parkinson’s disease (PD) is a progressive neurodegenerative movement disorder characterized by the accumulation of abnormal α-synuclein in Lewy bodies and the loss of dopaminergic neurons in the substantia nigra. The exact mechanism driving neurodegeneration during the progression of PD is unknown, and there is no known complete cure. Its clinical motor symptoms are tremor, rigidity, bradykinesia, and postural instability. Accurate diagnosis of PD is a major challenge, especially in its early stages. In addition, distinguishing PD from other neurodegenerative disorders such as multiple system atrophy (MSA) can be difficult because the early symptoms are similar. MSA is an α-synucleinopathy characterized by typical oligodendroglial cytoplasmic inclusions. It presents with autonomic failure, Parkinsonian symptoms and/or cerebellar features, and compared to PD it involves more severe brain damage and faster progression. Improving the accuracy of PD diagnosis and the ability to differentiate it from similarly symptomatic disorders therefore remains a major goal of clinical practice.

Improvement of TCM evidence classification algorithm
Problem modeling

In this section, we model the process of diagnosing Parkinson’s disease in Chinese medicine as a TCM syndrome classification problem, from the perspective of objectivization research. The aim is to automate the inference of the syndrome type from the characteristics of the scales using the TCM syndrome classification algorithm, so as to provide an auxiliary reference for doctors and promote the standardization of the diagnosis of Parkinson’s disease in Chinese medicine.

TCM has established five major categories of Parkinson’s disease: phlegm-heat-driven wind, blood stasis-driven wind, qi and blood deficiency, liver and kidney deficiency, and yin and yang deficiency, each of which can occur as a primary or a secondary syndrome. Parkinson’s patients usually have either a primary syndrome alone or both a primary and a secondary syndrome. If each type of evidence a patient has is treated as a label, determining the patient’s evidence types is a multi-label classification problem.

In this chapter, 91 symptoms obtained from the inspection, listening and smelling, inquiry, palpation, and tongue items of the TCM scale are used as data features. A feature is set to “1” if the symptom is present and “0” if it is absent. Labels are selected in a similar way: the 10 syndrome types that result from the primary and secondary classifications serve as labels, and a label is set to “1” if the patient has that syndrome type and “0” if not. That is, in the transformed multi-label data, the features represent the symptoms and the labels represent the syndrome types.
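The encoding just described can be sketched as follows. The ordering of the 10 labels, the helper names, and the symptom indices in the example are assumptions for illustration; the paper does not specify how the scale items are indexed.

```python
# The five syndrome categories named in this section
SYNDROMES = [
    "phlegm-heat-driven wind",
    "blood stasis-driven wind",
    "qi and blood deficiency",
    "liver and kidney deficiency",
    "yin and yang deficiency",
]
# 10 labels: each syndrome as a main or a secondary syndrome (ordering is assumed)
LABELS = [f"{s} ({role})" for s in SYNDROMES for role in ("main", "secondary")]


def encode_features(observed, n_symptoms=91):
    """observed: indices of the scale items marked present (hypothetical indexing)."""
    x = [0] * n_symptoms
    for idx in observed:
        x[idx] = 1
    return x


def encode_labels(main, secondary=None):
    """A patient has one main syndrome and optionally one secondary syndrome."""
    y = [0] * len(LABELS)
    y[LABELS.index(f"{main} (main)")] = 1
    if secondary is not None:
        y[LABELS.index(f"{secondary} (secondary)")] = 1
    return y


# Hypothetical patient record
x = encode_features([0, 5, 90])
y = encode_labels("liver and kidney deficiency", "phlegm-heat-driven wind")
```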

Each TCM scale is represented as one instance of TCM evidence classification: the scale is treated as an unknown sample, and the corresponding evidence types are its labels. In this way, the process of TCM diagnosis of Parkinson’s disease, in which the evidence type is inferred from the TCM scale, is modeled as a TCM evidence classification problem in which the sample is predicted to belong to one of the five evidence types, with reference to a given set of Parkinson’s samples of known category.

The transformed dataset for the TCM scales has several characteristics. The Parkinson’s evidence type frequency plot in Figure 3 shows the label imbalance problem. In addition, the evidences represented by the labels are distinguished into primary and secondary, which makes the inter-label relationships complex. That is, the transformed Parkinson’s dataset poses the following challenges:

Complex inter-label relationships

Label imbalance

Figure 3.

Frequency of Parkinson’s syndrome type

To address the above problems, the next section attempts to improve the TCM evidence classification algorithm using the topic modeling framework.

Experimental data

The Parkinson’s dataset (parkinson) in this experiment was converted from a real Parkinson’s disease Chinese medicine scale provided by a brain hospital. Table 1 demonstrates the co-occurrence frequency of the labels in the Parkinson’s dataset.

Table 1.

Label co-occurrence frequency table of Parkinson’s data set

| Secondary \ Main | Phlegm heat moving wind | Blood stasis and wind | Deficiency of qi and blood | Liver and kidney deficiency | Yin-yang deficiency |
| --- | --- | --- | --- | --- | --- |
| Phlegm heat moving wind | - | 2 | 1 | 13 | 5 |
| Blood stasis and wind | 3 | - | 1 | 6 | 3 |
| Deficiency of qi and blood | 2 | 3 | - | 9 | 13 |
| Liver and kidney deficiency | 7 | 2 | 5 | - | 2 |
| Yin-yang deficiency | 1 | 7 | 16 | 2 | - |

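A co-occurrence table of this kind can be obtained by counting (main, secondary) label pairs over the patient records. The sketch below assumes a hypothetical list of such pairs; it is not the hospital data.

```python
from collections import Counter


def cooccurrence(records):
    """records: iterable of (main, secondary) pairs; secondary may be None."""
    return Counter((main, sec) for main, sec in records if sec is not None)


# Hypothetical patient records, each with a main and an optional secondary syndrome
records = [
    ("liver and kidney deficiency", "phlegm heat moving wind"),
    ("liver and kidney deficiency", "phlegm heat moving wind"),
    ("deficiency of qi and blood", "yin-yang deficiency"),
    ("yin-yang deficiency", None),
]

table = cooccurrence(records)
```

Each key of `table` corresponds to one cell of the table (column = main, row = secondary), and pairs that never occur simply count as 0.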
Experimental results and analysis

Table 2 lists the top words of the topics obtained in the experiment, with the number of topics set to 3 and the labels generated by BTM-ML. In Topic 0, liver and kidney deficiency (main) and phlegm heat moving wind (secondary) occupy the first two places. The label co-occurrence frequencies in Table 1 show that these two labels co-occur 13 times, which is among the highest co-occurrence counts; yin-yang deficiency (main) and deficiency of qi and blood (secondary) also co-occur 13 times, and liver and kidney deficiency (main) and deficiency of qi and blood (secondary) co-occur 9 times. The top labels in Topic 0 are therefore highly correlated, which shows that the topic effectively extracts part of the relationship between the labels.

Comparing the top words of Topic 1 and Topic 2 with the label co-occurrence frequency table shows that yin-yang deficiency (secondary) and deficiency of qi and blood (main) co-occur 16 times, liver and kidney deficiency (secondary) and phlegm heat moving wind (main) co-occur 7 times, and yin-yang deficiency (main) and deficiency of qi and blood (secondary) co-occur 13 times, which suggests that Topic 1 and Topic 2 also represent the relationships among the labels effectively.

In summary, the relationships between labels can be effectively extracted and represented using the topic modeling approach. Moreover, the liver and kidney deficiency syndrome appears among the top words of Topic 0, Topic 1 and Topic 2 alike (ranking first in Topics 0 and 2), indicating that almost all the other syndrome labels are related to it and that it plays a crucial role in the TCM view of Parkinson’s disease. This is in line with the common TCM view that Parkinson’s disease is rooted in liver and kidney deficiency and in qi and blood deficiency.

Table 2.

Top word

| Topic 0 | Probability | Topic 1 | Probability | Topic 2 | Probability |
| --- | --- | --- | --- | --- | --- |
| Liver and kidney deficiency - main | 0.3298 | Yin-yang deficiency - secondary | 0.4500 | Liver and kidney deficiency - secondary | 0.3318 |
| Phlegm heat moving wind - secondary | 0.2311 | Deficiency of qi and blood - main | 0.3000 | Phlegm heat moving wind - main | 0.1981 |
| Deficiency of qi and blood - secondary | 0.1918 | Liver and kidney deficiency - main | 0.1500 | Yin-yang deficiency - main | 0.1289 |
| Yin-yang deficiency - main | 0.1321 | Phlegm heat moving wind - main | 0.1250 | Deficiency of qi and blood - secondary | 0.1452 |

Table 3 shows the results of the classical algorithm and the algorithms modified by LDA-ML, BTM-ML and WNTM-ML on the Parkinson’s dataset.

Table 3.

Experimental results of Parkinson’s data set

| Algorithm | Micro-F | Macro-F | Example-F |
| --- | --- | --- | --- |
| LDA-ML(BR) | 0.4645 | 0.2456 | 0.4171 |
| BTM-ML(BR) | 0.4378 | 0.2087 | 0.4459 |
| WNTM-ML(BR) | 0.4209 | 0.2179 | 0.4981 |
| BR | 0.4172 | 0.1789 | 0.4003 |
| LDA-ML(LP) | 0.4098 | 0.2708 | 0.4410 |
| BTM-ML(LP) | 0.4431 | 0.2567 | 0.4507 |
| WNTM-ML(LP) | 0.4309 | 0.2208 | 0.4878 |
| LP | 0.4001 | 0.2153 | 0.4027 |
| LDA-ML(ECC) | 0.4112 | 0.2975 | 0.4268 |
| BTM-ML(ECC) | 0.4035 | 0.2114 | 0.5624 |
| WNTM-ML(ECC) | 0.4892 | 0.2084 | 0.4290 |
| ECC | 0.4018 | 0.2041 | 0.3927 |
| LDA-ML(MLkNN) | 0.3290 | 0.1532 | 0.1390 |
| BTM-ML(MLkNN) | 0.2893 | 0.0973 | 0.3082 |
| WNTM-ML(MLkNN) | 0.3322 | 0.2567 | 0.1240 |
| MLkNN | 0.2130 | 0.0897 | 0.1211 |
| LDA-ML(CLR) | 0.4921 | 0.2330 | 0.4134 |
| BTM-ML(CLR) | 0.4199 | 0.2652 | 0.4903 |
| WNTM-ML(CLR) | 0.4933 | 0.2974 | 0.4872 |
| CLR | 0.4119 | 0.1976 | 0.4135 |
| Win/loss | 5/0 | 5/0 | 5/0 |

As can be seen from Table 3, the algorithms improved by the topic modeling framework outperform the original algorithms on the Micro-F, Macro-F and Example-F evaluation metrics, with a win/loss record of 5/0 on each metric.
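The paper does not spell out how the three metrics are computed. The sketch below assumes the standard multi-label definitions: Micro-F pools the counts over all labels, Macro-F averages the per-label F1, and Example-F averages the per-sample F1. The label matrices are hypothetical.

```python
def f_measure(tp, fp, fn):
    """F1 from counts; taken as 0 when there is nothing to find or predict."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def counts(true_bits, pred_bits):
    """True positives, false positives, false negatives for two 0/1 sequences."""
    tp = sum(1 for t, p in zip(true_bits, pred_bits) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(true_bits, pred_bits) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(true_bits, pred_bits) if t == 1 and p == 0)
    return tp, fp, fn


def multilabel_f_scores(Y_true, Y_pred):
    """Return (Micro-F, Macro-F, Example-F) for 0/1 label matrices."""
    n_labels = len(Y_true[0])
    per_label = [
        counts([y[k] for y in Y_true], [y[k] for y in Y_pred])
        for k in range(n_labels)
    ]
    per_example = [counts(t, p) for t, p in zip(Y_true, Y_pred)]
    micro = f_measure(*(sum(c[i] for c in per_label) for i in range(3)))
    macro = sum(f_measure(*c) for c in per_label) / n_labels
    example = sum(f_measure(*c) for c in per_example) / len(Y_true)
    return micro, macro, example


# Hypothetical ground truth and predictions: 3 samples, 3 labels
Y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
Y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 1]]
micro, macro, example = multilabel_f_scores(Y_true, Y_pred)
```

Macro-F weights rare labels as heavily as frequent ones, so under label imbalance it is typically the lowest of the three, which matches the pattern in Table 3.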

Effectiveness of the improved Chinese medicine evidence classification algorithm in Parkinson’s disease evidence-based treatment

Figure 4 shows the ROC curves of the three models: DenseNet121, DAMNet and the improved TCM evidence classification algorithm (TCM). The area under the ROC curve of the improved TCM evidence classification model reaches 0.99, which is higher than that of the other two classification models. With the smallest parameter count, 7.8 M, the improved TCM classification algorithm achieves an accuracy of 96.73%, a precision of 97.45%, a sensitivity of 96.27%, and an F1 score of 96.21%.

Figure 4.

ROC curves of the three models

Thus, it can be verified that the improved TCM evidence classification algorithm is efficient and accurate in diagnosing Parkinson’s disease.
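For reference, the reported quantities can be computed as in the sketch below, assuming the usual binary definitions. The confusion counts and scores are hypothetical and do not reproduce the paper's figures; the area under the ROC curve is computed here through its rank interpretation (the fraction of positive/negative pairs ordered correctly, with ties counted as one half).

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision, sensitivity (recall) and F1 from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, f1


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical confusion counts and classifier scores
accuracy, precision, sensitivity, f1 = binary_metrics(tp=40, fp=10, tn=45, fn=5)
area = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```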

Conclusion

Based on the Bayesian principle, this paper carries out the correlation analysis of TCM evidence, captures the key problem of evidence recognition, analyzes the efficiency of the TCM evidence classification algorithm using Parkinson’s disease as the starting point, and proposes directions in which it can be optimized and improved.

Compared with the original algorithms, the improved algorithms score higher on the Micro-F, Macro-F and Example-F evaluation indexes, which indicates that the algorithms perform better on the Parkinson’s disease dataset once the topic model framework is applied.

Comparing the ROC curves of the improved TCM evidence classification model with those of the DenseNet121 and DAMNet models, the improved algorithm achieves 96.73% accuracy, 97.45% precision, 96.27% sensitivity, and a 96.21% F1 score with the smallest parameter count of 7.8 M, which indicates that the improved TCM evidence classification algorithm is more efficient and accurate in diagnosing Parkinson’s disease.

Language:
English
Publication frequency:
1 time per year
Journal subjects:
Biological sciences, Life sciences, other, Mathematics, Applied mathematics, General mathematics, Physics, Physics, other