Open Access

A Hybrid Deep Learning Algorithm Based Prediction Model for Sustainable Healthcare System

26 Jun 2025

Introduction

The rapid growth of the world's population calls for a well-sustained healthcare system that helps people stay healthy throughout their lives. As the population increases, the incidence of various diseases can also increase. This creates a crucial task for the healthcare industry: maintaining a large volume of medical data and predicting disease accurately at an early stage. Since manual prediction of disease takes a long time, there is a need for an automated healthcare system.

A healthcare system is built primarily to meet people's health needs and provides services such as maintaining, restoring, and monitoring the health records of a single person or a group of people. This system comprises the care given by clinics and physicians. It attempts to manage health factors while also working directly to improve people's health. The main significance of the healthcare system is that the value of life can be recognized by a single person or corporation as well as by the public or the state. Hence, designing a comprehensive healthcare system benefits individuals and corporations as well as the public and the state.

Healthcare systems offer various benefits, but mutual information sharing between patients and doctors in traditional healthcare systems remains difficult. Because patients tend to defer to their doctors, knowledge sharing between them is one-sided, which creates information asymmetry. This makes the healthcare system unsustainable. For a good relationship between doctor and patient, knowledge sharing should be mutual, so that patients can better understand their health and follow the doctor's instructions to recover and improve their health themselves [1]. For this purpose, a simple and well-organized healthcare model is needed. A predictive method could be the best option, since it makes disease prediction easier and supports better information sharing between the doctor and the patient [2,3].

The healthcare system is becoming digitally centric through the use of IoT and cloud computing techniques. The Internet of Things (IoT) plays an important role in the health system since it allows doctors to monitor patients' health daily, from anywhere and at any time, using IoT sensors. Cloud computing plays a specific role in storing and maintaining the massive amounts of data in the healthcare system [4]. Applying digital principles in healthcare systems reduces difficulties in service delivery and improves effectiveness, which in turn substantially improves the sustainability of healthcare systems.

With the rapid growth of artificial intelligence, the analysis of data in various application domains has become simple and efficient. Artificial intelligence uses various machine learning approaches to analyze and predict data [5,6]. In particular, learning-based approaches work much more efficiently than traditional models in the healthcare system.

The healthcare system requires a better understanding of complex medical data, which is hard to obtain with traditional approaches. To overcome this problem, deep learning techniques have been widely adopted in recent years for medical data analysis [7,8]. Based on this, a convolutional neural network (CNN) is used in this work as the model that classifies medical data. A hybrid CNN model is presented in order to attain better disease prediction and to decrease the manpower needed for healthcare data analysis. The main objective of this work is to improve prediction accuracy for a sustainable healthcare system, which should reduce human effort and increase diagnostic accuracy. The contributions of this research work are summarized as follows:

A disease prediction model for a sustainable healthcare system is presented, using a hybrid deep learning algorithm that combines a CNN with a random forest (RF) classifier.

An extensive experimental analysis on benchmark datasets is presented to demonstrate the performance of the proposed method.

A comparative analysis of the proposed model against other conventional prediction models is presented for better validation.

The rest of the paper is organized as follows: a broad survey of different healthcare systems and disease prediction models is presented in Section 2. The proposed prediction model is presented in Section 3, and its performance analysis is presented in Section 4. The final observations are given in Section 5.

Related Works

A broad survey was conducted to investigate the performance of conventional disease prediction models. The methodology, features, advantages, and disadvantages of each are considered, and the drawbacks are discussed to motivate this research. Machine learning (ML) and data mining (DM) methods are broadly used in the medical industry to train on and classify medical data. A CNN-LSTM method is considered in [9] for predicting heart disease in the healthcare system. A sustainable healthcare system has been constructed in [10] using the triple bottom line and stakeholder theory, and the communication between the healthcare systems is provided by means of fuzzy, DEMATEL, and ISM models. In [11], miRNA target prediction is obtained by combining supervised and unsupervised machine learning algorithms with an SVM classifier. Various learning approaches are compared in [12] for predicting the time-series blood glucose level in type 1 diabetes. IoT is a notable advance in health services: it helps doctors work smarter by using the data collected in the IoT environment.

Tobacco use causes chronic diseases such as cardiovascular disease, arthritis, cancer, and diabetes. In [13], a CNN and K-Nearest Neighbor are used to extract features and identify these chronic diseases.

Insufficient water intake over a prolonged period can cause kidney problems, which lead to kidney damage and, in turn, may require kidney transplantation. To avoid this, early detection of kidney disease is needed, which can be achieved using machine learning algorithms such as Logistic Regression (LR), Decision Tree, and K-Nearest Neighbor [14].

With the wide adoption of Electronic Health Records (EHR) in smart healthcare, a massive quantity of electronic medical data is collected. For interpretable prediction from those clinical data, an attention mechanism-based neural network is used [15]. The prediction of cardiovascular disease is an extremely difficult task in the healthcare system. The work in [10] compared many machine learning algorithms, including REP Tree, M5P Tree, Random Tree, Linear Regression, Naive Bayes, J48, and JRIP, and obtained the best results with the Random Tree approach. Many kinds of brain disease are now known, and their early detection saves lives. Various ML and DL approaches for predicting different types of brain diseases are discussed in [16]. A risk prediction system for different chronic diseases has been developed in [17] using the long short-term memory (LSTM) algorithm.

Class imbalance in datasets containing chronic kidney disease data has been addressed using the SMOTE algorithm [18]. Missing values in clinical data decrease the accuracy of the system. This is the main focus of [19], which tries to overcome the problem using a SMOTE-based LR-KNN system. Abnormal levels of microbes in humans can affect health, and this can be predicted using the random walk algorithm [20]. In [21], the presence or absence of arterial events is identified for the prediction of inflammatory bowel diseases using SVM and XGBoost algorithms. In [22], the Slime Mould Algorithm (SMA) is used to train the SVM for better classification of COVID-19 data. The classification of Parkinson's disease is achieved using multi-source ensemble learning and a CNN [23]. For predicting Alzheimer's disease, [24] uses a Generative Adversarial Network (GAN) for the classification of brain images.

From the literature review, it is noted that prediction accuracy depends on selecting a suitable learning algorithm and classifier. Various types of machine learning algorithms are used in most of the research. From the above survey, it is observed that LSTM and SVM algorithms perform better. In some cases CNNs performed better, but the improvement was modest. To improve the performance of the CNN-based prediction system, an RF algorithm is used along with the CNN, as discussed in the following section.

Proposed Work

Nowadays, machine learning approaches have been widely adopted in various fields because they perform better than other conventional models.

Hence, this paper uses a CNN for feature extraction from medical data and an RF algorithm for disease classification based on trained conditions. For instance, if the model has to predict the blood pressure level and the input from the patient's IoT sensor is 142 mmHg, the model should predict hypertension. To attain this, the model has to be trained with the corresponding conditions. The process begins with the collection of medical data: an IoT sensor acts as the input device and captures the sensed medical data from the patient. The collected data are normalized before being given to the CNN. In the CNN, the normalized data undergo training, which yields an optimized result. This result is fed to the RF classifier, which learns to categorize the input data and finally gives the predicted disease as output. The process flow of the proposed model is depicted in Figure 1.

Figure 1.

Process flow of the proposed model

The mathematical model for the proposed work begins with data normalization. Non-standardized input data take more time to learn from. Hence, before the medical data are given to the model for training and testing, they are normalized using the following expression

$$f(M) = x + y\,\frac{M - \min(M_i)}{\max(M_i) - \min(M_i)}$$

where $M_i$ denotes the input medical dataset of $M$, and $x$ and $y$ are the constants of the normalizer. The normalized data are then used for training and testing the model. A CNN provides more efficient feature extraction than other traditional models since it contains two special layers, the convolution layer and the pooling layer. The convolution layer extracts features from the input data through the convolution operation. Let the input medical data from the IoT be $M$, and let $G_i$ be the feature map of the $i$th layer (with $G_0 = M$). The convolution process is given by

$$G_i = f(G_{i-1} \otimes V_i + B_i)$$

where $V_i$ is the convolution weight vector, $\otimes$ is the convolution operator, $G_{i-1}$ is the feature map of the previous layer, and $B_i$ is the bias vector. After convolution, the output of the convolution layer is passed to the pooling layer, which performs max pooling in matrix form. Taking $G_i$ as the down-sampling layer, it is given by

$$G_i = \mathrm{subsampling}(G_{i-1})$$

After pooling, the new feature map expression of $G_0$ is obtained as

$$Q(i) = R(C = c_i \mid G_0; (V, B))$$

The new feature map is given as input to the fully connected (FC) layer of the CNN, which flattens it into a fixed-length vector. Instead of the softmax layer of a conventional CNN, this work uses an RF classifier to achieve better classification. The output from the FC layer of the CNN is fed as input to the RF classifier.
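
The normalization, convolution, and pooling steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: it assumes a 1-D sensor signal, a ReLU activation for $f$, the cross-correlation form of convolution that deep learning libraries commonly use, and hypothetical kernel values and readings.

```python
def min_max_normalize(values, x=0.0, y=1.0):
    """Rescale values following f(M) = x + y * (M - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant signal: avoid division by zero
        return [x for _ in values]
    return [x + y * (v - lo) / (hi - lo) for v in values]


def relu(z):
    # Assumed activation; the text does not name the function f.
    return z if z > 0.0 else 0.0


def conv1d(signal, kernel, bias=0.0, activation=relu):
    """Valid 1-D convolution layer: G_i = f(G_{i-1} (x) V_i + B_i)."""
    k = len(kernel)
    return [
        activation(sum(signal[t + j] * kernel[j] for j in range(k)) + bias)
        for t in range(len(signal) - k + 1)
    ]


def max_pool1d(feature_map, size=2):
    """Non-overlapping max pooling (the subsampling step)."""
    return [
        max(feature_map[t:t + size])
        for t in range(0, len(feature_map) - size + 1, size)
    ]


# Hypothetical blood pressure readings from a sensor.
readings = [118.0, 121.0, 142.0, 139.0, 125.0, 120.0]
g0 = min_max_normalize(readings)       # G_0: normalized input
g1 = conv1d(g0, kernel=[-1.0, 1.0])    # responds to rising values
features = max_pool1d(g1, size=2)      # flattened vector passed on to the RF
```

With the kernel chosen here, the pooled feature is largest where the signal rises sharply, which is the kind of local pattern a convolution layer is meant to pick up.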

RF is a classifier that builds many decision trees on different subsets of the input dataset and predicts the output by combining the trees' predictions. Increasing the number of trees, i.e., subsets, tends to improve the accuracy of the output. For instance, if a dataset containing multiple flower images is given to the RF classifier, the dataset is separated into various subsets, and each subset is given to its corresponding decision tree. At the training stage, every decision tree produces a predicted output; when new data arrive, the RF classifier takes the majority vote of the trees' outputs as the final decision. A decision tree classifier, by contrast, uses a single decision tree, so its prediction accuracy is lower than that of the RF classifier. Due to the usage of multiple decision trees, the RF algorithm takes less time to train and predicts accurate outputs for larger input datasets more efficiently than other traditional models. The predictive accuracy depends on the knowledge attained during the training stage, which indirectly indicates that using more decision trees for training provides a more accurate prediction. The RF algorithm predicts the output in three steps: first, the selection of the training dataset; second, the construction of the RF model; and third, the prediction of the result by majority voting.
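
The three steps can be sketched as follows. This is a simplified illustration under stated assumptions: each "tree" is a one-split stump on a randomly chosen feature rather than a fully grown decision tree, and the data, function names, and seed are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Step 1: draw len(rows) training rows with replacement."""
    return [rng.choice(rows) for _ in rows]


def fit_stump(rows, rng):
    """Step 2 (simplified): a one-split tree on a randomly chosen feature."""
    f = rng.randrange(len(rows[0][0]))
    threshold = sum(x[f] for x, _ in rows) / len(rows)
    left = [y for x, y in rows if x[f] <= threshold]
    right = [y for x, y in rows if x[f] > threshold]
    fallback = Counter(y for _, y in rows).most_common(1)[0][0]
    left_label = Counter(left).most_common(1)[0][0] if left else fallback
    right_label = Counter(right).most_common(1)[0][0] if right else fallback
    return lambda x: left_label if x[f] <= threshold else right_label


def fit_forest(rows, n_trees, seed=0):
    rng = random.Random(seed)
    return [fit_stump(bootstrap_sample(rows, rng), rng) for _ in range(n_trees)]


def predict(forest, x):
    """Step 3: majority vote over the trees' outputs."""
    votes = Counter(tree(x) for tree in forest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature samples: label 0 = normal, 1 = abnormal.
rows = [
    ([0.10, 0.20], 0), ([0.20, 0.10], 0), ([0.15, 0.15], 0),
    ([0.90, 0.80], 1), ([0.80, 0.90], 1), ([0.85, 0.85], 1),
]
forest = fit_forest(rows, n_trees=25)
low = predict(forest, [0.10, 0.10])
high = predict(forest, [0.90, 0.90])
```

An individual stump trained on an unlucky bootstrap sample can be wrong, but the majority vote across 25 of them is stable, which is the property the paragraph above relies on.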

The RF approach is used to produce a similarity matrix by measuring the similarity of data instances. For the given features M from the output of the CNN, $S$ trees are constructed as the RF, and the similarities are held in an $M \times N$ matrix $Q = \{Q_{ij}\}$, where $i = 1, 2, 3, \ldots, M$ and $j = 1, 2, 3, \ldots, N$. After the trees are grown, the dataset is passed through each tree for the classification and prediction of data.

If instances $i$ and $j$ fall in the same leaf node, their similarity is increased by 1 (i.e., $Q_{ij}$ and $Q_{ji}$ increase by 1 simultaneously). This process is repeated for all $S$ trees of the model, and the respective matrix is obtained. The correlation matrix is obtained by dividing $Q_{ij}$ by the total number of trees ($S$). A matrix in which every element of the main diagonal is 1 is called the similarity matrix; it expresses the degree of similarity between the instances. Consider two samples $a$ and $b$ from the input dataset, and let the similarity between them be $\mathrm{pred}(a, b)$. The data of the RF approach are constructed using the concept of multidimensional scaling. Let $\mathrm{pred}(-, b)$ be the average of $\mathrm{pred}(a, b)$ over the first coordinate, $\mathrm{pred}(a, -)$ the average over the second coordinate, and $\mathrm{pred}(-, -)$ the average over both coordinates. From this, the matrix formed is given as

$$CV(a, b) = 0.5 \times \left(\mathrm{pred}(a, b) - \mathrm{pred}(a, -) - \mathrm{pred}(-, b) + \mathrm{pred}(-, -)\right)$$
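
A small sketch of the similarity matrix and the $CV$ matrix may help. It assumes the leaf index of every sample in every tree is already known; the `leaf_ids` layout and the example values are hypothetical.

```python
def proximity_matrix(leaf_ids):
    """leaf_ids[t][i] is the leaf index that sample i reaches in tree t."""
    n_trees = len(leaf_ids)
    n = len(leaf_ids[0])
    q = [[0.0] * n for _ in range(n)]
    for leaves in leaf_ids:
        for i in range(n):
            for j in range(n):
                if leaves[i] == leaves[j]:  # same leaf: similarity + 1
                    q[i][j] += 1.0
    # Dividing by the number of trees puts 1 on the main diagonal.
    return [[q[i][j] / n_trees for j in range(n)] for i in range(n)]


def double_center(pred):
    """CV(a,b) = 0.5 * (pred(a,b) - pred(a,-) - pred(-,b) + pred(-,-))."""
    n = len(pred)
    row_mean = [sum(pred[a]) / n for a in range(n)]                        # pred(a, -)
    col_mean = [sum(pred[a][b] for a in range(n)) / n for b in range(n)]   # pred(-, b)
    grand = sum(row_mean) / n                                              # pred(-, -)
    return [
        [0.5 * (pred[a][b] - row_mean[a] - col_mean[b] + grand) for b in range(n)]
        for a in range(n)
    ]


# Two trees, three samples: samples 0 and 1 share a leaf in the first tree,
# samples 1 and 2 share a leaf in the second.
leaf_ids = [[0, 0, 1], [0, 1, 1]]
pred = proximity_matrix(leaf_ids)
cv = double_center(pred)
```

Each row of the centered matrix sums to zero, which is what makes it suitable for the eigen-decomposition used in scaling.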

Let $\beta(j)$ be the eigenvalues of the matrix $CV(a, b)$ and $V_j(a)$ the corresponding eigenvectors. From this, the vector formed is given by

$$x(a) = \left(\sqrt{\beta(1)}\,V_1(a),\ \sqrt{\beta(2)}\,V_2(a),\ \ldots\right)$$

The squared distance between two samples equals $1 - \mathrm{pred}(a, b)$. The value $\sqrt{\beta(j)}\,V_j(a)$ is the value of the vector $x(a)$ on the $j$th scaling coordinate. The main aim of scaling is to calculate the vector $x(a)$ through the first few scaling coordinates. To achieve this, the RF algorithm extracts the several largest eigenvalues and their eigenvectors from the matrix $CV$. If the input feature data contain unusual samples, they are measured using the equation

$$U(a) = \sum_{d(b) = j} \mathrm{pred}^2(a, b)$$

where the sum runs over the samples $b$ whose category $d(b)$ is $j$. From this, the raw unusual-sample measurement of sample $a$ is

$$C = \frac{a_{\text{sample}}}{U(a)}$$

where $a_{\text{sample}}$ is the number of samples of medical data and $j$ is the category of the listed disease. For all the unusual samples, the measurement is standardized using the mean $\mu$ and standard deviation $\sigma$ as follows:

$$C(s) = \frac{C - \mu}{\sigma}$$
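
The unusual-sample measurement can be sketched as below. This is an illustration under assumptions: the sum in $U(a)$ includes sample $a$ itself, as the formula is written, the standardization is done per category with the population standard deviation, and the similarity values are hypothetical.

```python
from statistics import mean, pstdev


def raw_unusual_scores(pred, labels):
    """C = n_samples / U(a), with U(a) summed over samples in a's category."""
    n = len(labels)
    scores = []
    for a in range(n):
        u = sum(pred[a][b] ** 2 for b in range(n) if labels[b] == labels[a])
        scores.append(n / u)
    return scores


def standardize_by_category(scores, labels):
    """C(s) = (C - mu) / sigma, computed within each category."""
    out = [0.0] * len(scores)
    for category in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == category]
        group = [scores[i] for i in idx]
        mu, sigma = mean(group), pstdev(group)
        for i in idx:
            out[i] = (scores[i] - mu) / sigma if sigma > 0 else 0.0
    return out


# Hypothetical similarities: samples 0-2 are close to each other,
# sample 3 is far from all of them. All four share one category.
pred = [
    [1.0, 0.9, 0.9, 0.1],
    [0.9, 1.0, 0.9, 0.1],
    [0.9, 0.9, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
labels = [1, 1, 1, 1]
raw = raw_unusual_scores(pred, labels)
unusual_degree = standardize_by_category(raw, labels)
```

A sample with low similarity to the rest of its category has a small $U(a)$ and therefore a large score, so sample 3 stands out here.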

The final unusual sample measurement is referred to as the unusual degree. It is obtained by normalizing the initial unusual sample measurement of the medical data of the listed disease. The RF-based feature classification of medical data is given by

$$Z(x) = \arg\max_{P} \sum_{j=1}^{s} w_j\, B(z_j(x) = P)$$

$$w_j = \frac{2\,CP_j}{2\,CP_j + IP_j + IN_j}$$

where $Z(x)$ denotes the joint classification model determined by the weighted RF algorithm, $w_j$ is the weight of the $j$th decision tree (i.e., sub-classifier), $z_j(x)$ is that tree's classification result, $CP_j$, $IP_j$, and $IN_j$ are its counts of correct positives, incorrect positives, and incorrect negatives, $P$ denotes the classification type (i.e., output variable), and $B(\cdot)$ denotes the decision function. The output variable $P$ has two options, positive and negative. If $P$ is positive, the weights of all sub-classifiers that predict it are summed as the rank of $Z(x)$ for the abnormal status of the patient. If $P$ is negative, the weights of all sub-classifiers that predict it are summed as the rank of $Z(x)$ for the normal status of the patient. The two ranks are compared, and the value of $P$ with the larger rank is the predicted classification result of the proposed model.
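
The weighted vote can be sketched as follows. The per-tree counts and predictions are hypothetical, and the function names are chosen for illustration only.

```python
def tree_weight(cp, ip, in_):
    """w_j = 2*CP_j / (2*CP_j + IP_j + IN_j); 0 if the tree has no counts."""
    denom = 2 * cp + ip + in_
    return 2 * cp / denom if denom else 0.0


def weighted_vote(tree_predictions, weights, classes=("positive", "negative")):
    """Sum each tree's weight into the rank of the class it predicts."""
    rank = {c: 0.0 for c in classes}
    for predicted, w in zip(tree_predictions, weights):
        rank[predicted] += w
    return max(classes, key=lambda c: rank[c]), rank


# Three sub-classifiers with (correct positive, incorrect positive,
# incorrect negative) counts taken from a hypothetical validation run.
counts = [(9, 1, 0), (5, 5, 5), (4, 6, 6)]
weights = [tree_weight(*c) for c in counts]
votes = ["positive", "negative", "negative"]

decision, rank = weighted_vote(votes, weights)
status = "Abnormal" if decision == "positive" else "Normal"
```

In this example two of the three trees vote negative, yet the single reliable tree outweighs them (about 0.947 against 0.9), which shows how the weighting differs from a plain majority vote.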

Pseudocode for the proposed hybrid deep learning algorithm
Input: Normalized medical data (A)
Output: Predicted result (P)
Start
Determine the convolution process by Gi = f(Gi−1 ⊗ Vi + Bi)
Obtain the feature extraction of input medical data by pooling layer as Q(i)
Reduce the dimension of feature map by FC layer in the form of fixed vector length
FC output Q(i) is given as input to RF classifier
Retrieve the training dataset S from the input dataset of RF classifier by bootstrap method
Obtain S sub classifiers (or decision trees) for S datasets
Calculate the values of TP, FP, FN, TN, CPR, CPV for all sub classifiers
Obtain the values of CAE, CSRE and R2 Measure
Obtain the weight of each sub classifier as wj
Feed the system with test dataset to estimate the performance
Feed the unclassified samples and classify them according to F-rank weighted RF
Determine the final classification result of sub classifier j as zj(x)
Calculate the rank of Z(x) from the output variable value P
If P is in the positive class, add one rank in positive class of Z(x)
Else add one rank in negative class of Z(x)
Compare two ranks and predict the majority vote as final result
If majority voting of Z(x) is in the positive class, then predict the result as Abnormal
Else predict the result as Normal
End

Results and Discussion

The experiment is conducted to validate the proposed CNN-RF prediction model using the Java Abstract Window Toolkit on an Intel i3 processor running at 2.20 GHz with 8 GB of memory. Proportion metrics such as CPR, IPR, CPV, accuracy, and F-rank, the quality indices CAE and CSRE, and the preferability index R2 measure are used to verify the performance of the proposed model. Two datasets are utilized to train the proposed model. The first dataset is collected from the Cardiology Department of the Chinese PLA General Hospital [15]. Its features include demographics, vital signs, lab tests, echocardiography, comorbidities, length of stay, and medications. A total of 105 features are obtained from each patient, and data from 736 patients are used. The second dataset is collected from the UCI machine learning repository [26]. It includes 76 features covering demographics, vital signs, cholesterol, echocardiography, and medications.

The performance of the proposed work is determined in terms of proportion metrics, quality metrics, and preferability metrics.

Proportion Metrics

This method uses proportion metrics such as Correct Positive Rate (CPR), Incorrect Positive Rate (IPR), Correct Prediction Value (CPV), accuracy, and F-rank, which are determined from the confusion matrix shown in Table 1. The columns of the matrix denote the predicted class, and the sum of a column gives the number of input samples predicted as that class. The rows of the matrix represent the actual class, and the sum of a row gives the number of samples in that class. The main aim of this work is to check whether the input medical data indicate disease or not, using binary classification by the RF classifier.

Table 1.

Confusion matrix

| Actual \ Predicted | Diseased | Non-Diseased |
|---|---|---|
| Diseased | TP | FN |
| Non-Diseased | FP | TN |

The medical data containing disease are set as the correct (positive) class, and the medical data containing no disease are set as the incorrect (negative) class. In Table 1, TP means that actual medical data containing disease are correctly predicted as a diseased patient, FN means that actual medical data containing disease are incorrectly predicted as a non-diseased patient, FP means that actual medical data containing no disease are incorrectly predicted as a diseased patient, and TN means that actual medical data containing no disease are correctly predicted as a non-diseased patient. The CPR, or recall, is given by the following equation

$$\mathrm{CPR} = \mathrm{Recall} = \frac{TP}{TP + FN}$$

The IPR, or false-positive rate (FPR), is given by the following equation

$$\mathrm{IPR} = \mathrm{FPR} = \frac{FP}{FP + TN}$$

The correct prediction value (CPV), or precision, is given by the following equation

$$\mathrm{CPV} = \mathrm{Precision} = \frac{TP}{TP + FP}$$

The accuracy of the proposed model is given by the following equation

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Although the RF method has many advantages, accurate prediction on unbalanced medical data is harder to obtain. To overcome this, this work introduces a weighted F-rank into the RF model, which provides efficient performance for disease prediction by allocating a distinct weight to each decision tree. The F-rank is obtained by combining the CPR and the CPV. A larger F1-score indicates better classification performance. The F1-score for the proposed model is given by

$$F1\text{-score} = \frac{2TP}{2TP + FP + FN}$$
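
The proportion metrics follow directly from the four cells of the confusion matrix. The sketch below uses hypothetical counts, not values from the paper's experiments.

```python
def proportion_metrics(tp, fn, fp, tn):
    """Compute the proportion metrics from confusion-matrix counts."""
    return {
        "recall": tp / (tp + fn),                      # CPR
        "fpr": fp / (fp + tn),                         # IPR
        "precision": tp / (tp + fp),                   # CPV
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts for 100 diseased and 100 non-diseased patients.
metrics = proportion_metrics(tp=90, fn=10, fp=5, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With unbalanced data, accuracy alone can look high even when few diseased patients are detected, which is why the F1-score, built only from TP, FP, and FN, is used for the tree weights.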

The final decision result is given by the following equations

$$w_j = \frac{2TP_j}{2TP_j + FP_j + FN_j}$$

$$Z(x) = \arg\max_{P} \sum_{j=1}^{s} w_j\, B(z_j(x) = P)$$

where $Z(x)$ denotes the joint classification model determined by the weighted RF algorithm, $w_j$ is the weight of the $j$th decision tree (i.e., sub-classifier), $P$ denotes the classification type (i.e., output variable), and $B(\cdot)$ denotes the decision function. The output variable $P$ has two options, positive and negative. If $P$ is positive, the weights of all sub-classifiers that predict it are summed as the rank of $Z(x)$ for the abnormal status of the patient.

On the other hand, if $P$ is negative, the weights of all sub-classifiers that predict it are summed as the rank of $Z(x)$ for the normal status of the patient. The two ranks are compared, and the value of $P$ with the larger rank is the predicted classification result of the proposed model.

Quality Metrics

To assess the quality of the proposed model, the Comparative Absolute Error (CAE), the Comparative Square Root Error (CSRE), and the R2 measure are computed.

The Comparative Absolute Error is given by the following equation

$$\mathrm{CAE} = \sum_{i=1}^{s} \left| P_i - \hat{P}_i \right|$$

The Comparative Square Root Error is given by the following equation

$$\mathrm{CSRE} = \sqrt{\frac{1}{s} \sum_{i=1}^{s} \left( P_i - \hat{P}_i \right)^2}$$

where $s$ is the total number of data samples, $P_i$ is the original value, and $\hat{P}_i$ is the predicted value.
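
Both error measures can be computed in a few lines. This sketch assumes the CSRE takes a square root of the mean squared error, as its name suggests; the actual and predicted values are hypothetical.

```python
import math


def cae(actual, predicted):
    """Sum of absolute differences between original and predicted values."""
    return sum(abs(p - q) for p, q in zip(actual, predicted))


def csre(actual, predicted):
    """Square root of the mean squared difference."""
    s = len(actual)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(actual, predicted)) / s)


# Hypothetical labels (1 = diseased) and predicted scores.
actual = [1.0, 0.0, 1.0, 1.0]
predicted = [0.9, 0.2, 0.8, 1.0]

print(cae(actual, predicted))
print(csre(actual, predicted))
```

For these values the absolute errors are 0.1, 0.2, 0.2, and 0, so the CAE is about 0.5 and the CSRE is about 0.15.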

Model Preferability Metrics

The preferability of the model is obtained by calculating the R2 measure. The R2 measure is a statistical measure of how much of the variation in the dependent variable is explained by the independent variable.

$$R^2\ \mathrm{Measure} = 1 - \frac{RAS}{TAS}$$

where RAS is the residual addition (sum) of squares, i.e., the sum of the squared errors between the original values $P$ and the predicted values $\hat{P}$, and TAS is the total addition (sum) of squares, i.e., the sum of the squared differences between the original values $P$ and the average of all $P$. For a model that fits the data, the R2 measure lies between 0 and 1, and the preferability of the model is decided by this value. If the R2 measure is close or equal to one, the model is considered most preferable for the input medical data. If the R2 measure is negative, the model is not preferable for the input medical data. Table 2 depicts the performance metrics of the proposed model for dataset 1 and dataset 2.

Performance metrics of proposed model

S.No Performance metrics Dataset 1 Dataset 2
1. Recall 0.987 0.986
2. False positive rate (FPR) 0.02 0.03
3. Precision 0.979 0.984
4. F1-score 0.983 0.912
5. CAE 0.04 0.05
6. CSRE 0.02 0.018
7. R2 measure 0.98 0.975
8. Data Training Time 7.8 6.4
9. Accuracy 0.968 0.978
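
The R2 measure reported in Table 2 is defined as one minus the ratio of RAS to TAS. A minimal sketch of that calculation, with hypothetical values rather than the paper's data, is:

```python
def r2_measure(actual, predicted):
    """R2 = 1 - RAS / TAS for paired original and predicted values."""
    mean_actual = sum(actual) / len(actual)
    ras = sum((p - q) ** 2 for p, q in zip(actual, predicted))  # residual sum
    tas = sum((p - mean_actual) ** 2 for p in actual)           # total sum
    return 1.0 - ras / tas


# Hypothetical labels (1 = diseased) and predicted scores.
actual = [1.0, 0.0, 1.0, 1.0]
predicted = [0.9, 0.2, 0.8, 1.0]

good_fit = r2_measure(actual, predicted)
mean_only = r2_measure(actual, [0.75] * 4)  # always predicting the average
```

Here RAS is 0.09 and TAS is 0.75, so the first value is about 0.88. Predicting the average for every sample gives 0, and predictions worse than the average give a negative value, which matches the interpretation given in the text.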

Furthermore, the performance of the proposed model has been compared with existing techniques evaluated in Chen et al.'s [15] research for dataset 1 and Sudarshan et al.'s [26] research for dataset 2. For dataset 1, techniques such as the stacked denoising autoencoder (SDAE), LR, MLP, MLP with attention mechanism (MLP-A), and multi-neural networks (MNNs) are used for comparison with the proposed model.

For dataset 2, techniques such as support vector machines (SVMs), LR, RF, swarm artificial neural network (S-ANN), and MNNs are used for comparison with the proposed model. Figures 2 and 3 depict the precision analysis of the proposed model and existing models for dataset 1 and dataset 2, respectively. The results show that the proposed model exhibits the highest precision, which indicates that its classification performance has increased due to efficient feature selection and processing using the RF classifier. Similarly, for dataset 2 the highest precision is attained by the proposed model, whereas the existing methods attain lower precision values. The average precision attained by the proposed model is 0.979 for dataset 1 and 0.984 for dataset 2, which is better than the existing techniques.

Figure 2.

Precision analysis for dataset 1

Figure 3.

Precision analysis for dataset 2

The recall of the proposed model and existing models for dataset 1 and dataset 2 is depicted in Figures 4 and 5, respectively. The results demonstrate that the maximum recall is obtained by the proposed model for both datasets. Though the performance of MNN is much better than that of the other existing techniques, it is lower than that of the proposed hybrid deep learning algorithm. The average recall attained by the proposed model is 0.987 for dataset 1 and 0.986 for dataset 2.

Figure 4.

Recall analysis for dataset 1

Figure 5.

Recall analysis for dataset 2

The F1-score analysis for the proposed model and existing models is presented in Figures 6 and 7 for dataset 1 and dataset 2, respectively.

The F1-score is obtained from the recall and precision values. The results show that the maximum score is attained by the proposed model compared to existing techniques. The average F1-score attained by the proposed model is 0.983 for dataset 1 and 0.912 for dataset 2. The score attained by the MNN for dataset 1 is 0.961, which is about 2% lower than the proposed model, and for dataset 2 the attained F1-score is 0.90, which is about 1% lower than the proposed model.
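
As a worked illustration of how the F1-score follows from precision and recall, the sketch below uses the standard harmonic-mean form, which is algebraically the same as $2TP / (2TP + FP + FN)$, and applies it to the dataset 1 precision and recall reported in this section.

```python
def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Reported values for dataset 1: precision 0.979, recall 0.987.
f1_dataset1 = f1_from_precision_recall(0.979, 0.987)
print(round(f1_dataset1, 3))
```

Rounded to three decimals this gives 0.983, the F1-score listed for dataset 1.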

The accuracy of the proposed model and existing models is comparatively analyzed and depicted in Figures 8 and 9 for dataset 1 and dataset 2, respectively. The results show that the proposed model exhibits the maximum accuracy for both datasets, whereas the accuracy of the existing models is lower. The maximum accuracy attained by the proposed model is 0.968 for dataset 1 and 0.978 for dataset 2. Feature selection using deep convolution layers followed by classification using RF increases the prediction accuracy of the proposed model, whereas the existing techniques lag in performance due to less suitable feature selection and classification processes.

Figure 6.

F1-score analysis for dataset 1

Figure 7.

F1-score analysis for dataset 2

Figure 8.

Accuracy analysis for dataset 1

Figure 9.

Accuracy analysis for dataset 2

Table 3 depicts the performance comparative analysis of the proposed model and existing models in terms of accuracy, recall, and precision.

Performance comparative analysis

S.No Method Accuracy Precision Recall
1 Stacked denoising auto-encoder (SDAE) 0.623 0.670 0.782
2 Logistic regression (LR) 0.655 0.700 0.792
3 MLP 0.651 0.692 0.799
4 MLP with attention mechanism 0.667 0.710 0.795
5 SVM 0.87 0.85 0.85
6 Logistic regression (LR) 0.86 0.84 0.85
7 Random forest (RF) 0.89 0.88 0.88
8 Swarm-ANN 0.957 0.952 0.952
9 MNN 0.966 0.962 0.97
10 Proposed CNN-RF 0.973 0.982 0.987

The values for the proposed model in the table are the averages of its results on dataset 1 and dataset 2. The results show that the proposed model performs better than the existing methodologies. Thus, the proposed model can be utilized in the medical domain as a sustainable healthcare data analysis system.
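
The averaging can be checked directly from the per-dataset values in Table 2, assuming a simple arithmetic mean over the two datasets rounded to three decimals:

```latex
\begin{align*}
\text{Accuracy}  &= \tfrac{1}{2}(0.968 + 0.978) = 0.973 \\
\text{Precision} &= \tfrac{1}{2}(0.979 + 0.984) = 0.9815 \approx 0.982 \\
\text{Recall}    &= \tfrac{1}{2}(0.987 + 0.986) = 0.9865 \approx 0.987
\end{align*}
```

These match the accuracy, precision, and recall listed for the proposed CNN-RF in Table 3.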

Conclusion

A hybrid deep learning architecture for a sustainable healthcare data analysis system is presented in this research work using CNNs and an RF algorithm. The proposed architecture uses the features extracted by the convolution layers and classifies the data with an RF classifier instead of the fully connected neural network model found in the conventional CNN architecture. This change in the architecture enhances the classification performance of the data analysis system compared to the conventional CNN model. Standard healthcare datasets are used for experimentation, and the results are verified through performance metrics such as accuracy, recall, precision, F1-score, and other network parameters. To demonstrate the performance gain, existing methodologies such as SDAE, LR, MLP, MLP-A, SVM, RF, S-ANN, and MNNs are compared with the proposed hybrid model. Experimental results show that the proposed model performs better than the existing techniques. Although the performance of the proposed model is better, a small number of errors or false predictions still reduces its accuracy, which is considered a minor limitation of the proposed model. Furthermore, this research work can be extended using multiple deep learning networks to improve the accuracy.