
Performance evaluation of college laboratories based on fusion of decision tree and BP neural network



Introduction

College laboratories undertake the teaching task of cultivating students' practical and scientific innovation abilities. The aim of performance evaluation is to assess the outcomes of the laboratories, identify both the strengths and weaknesses of their management and provide administrators with suggestions for continuous improvement. Performance evaluation is important work because it strengthens the construction and management of the laboratories. It is therefore necessary to adopt a scientific evaluation method in the performance evaluation process.

Most traditional evaluation methods rely on experts' opinions, so they are likely to be influenced by subjective factors. To overcome the shortcomings of the traditional methods and to improve effectiveness and accuracy, scholars have attempted to introduce soft computing techniques into performance evaluation [1, 2]. These methods mainly focus on mathematical algorithms that quantify the task of performance evaluation. Generally speaking, they are more scientific and standardised than traditional methods. However, the existing methods are not systematic when subjective factors that are hard to quantify must be considered.

A decision tree is an inductive learning technique with high speed and high accuracy in classification prediction. The BP neural network is an adaptive system that can effectively reduce the influence of subjective factors and enhance the objectivity of evaluation results. The decision tree has the advantages of a clear structure and simple classification rules, which can be used to overcome shortcomings of the BP neural network such as slow convergence and poor interpretability. In this respect, a method combining the decision tree and the neural network is well suited to performance evaluation of college laboratories. However, there are few research studies on this subject.

Motivated by the above considerations, a performance evaluation method based on the fusion of the decision tree and the BP neural network is proposed in this paper. The decision tree is used to select performance evaluation indexes with high weight, and the BP neural network is adopted to reduce the impact of non-core factors on the predicted classification.

The paper is structured as follows. Section 2 describes related work on the subject. Brief details of the decision tree and the BP neural network are provided in Section 3. The performance evaluation method based on the fusion of the decision tree and BP neural network is presented in Section 4. A short comparative analysis of three models is given in Section 5. Section 6 summarises the conclusions.

Related works

The concept of performance evaluation was originally proposed by the American management scientist Aubrey Daniels in 1970 [3]. Since then, scholars have proposed various methods based on different soft computing techniques, aimed at improving the effectiveness and accuracy of performance evaluation. Widely used assessment methods include fuzzy logic [4], fuzzy comprehensive evaluation (FCE) [5], the analytic hierarchy process (AHP) [6, 7], entropy weight fuzzy models [8, 9], data envelopment analysis (DEA) [10, 11], the balanced scorecard [12, 13], the decision tree [14] and the BP network [15].

Guo et al. [16] constructed a multi-judgement FCE model to deal with situations in which experts have more than one choice in the evaluation. Roberti et al. [17] used AHP to quantify conservation compatibility in order to find and compare optimal retrofits for historic buildings. Jia et al. [18] used the AHP model to calculate index weights and established a performance evaluation model for relevant personnel, which solved the problems that qualitative performance evaluation had to deal with. Jing et al. [19] constructed an FCE model of laboratory safety in colleges and universities based on AHP. Wu et al. [20] established a safety evaluation index system determined by expert consultation and AHP, and the evaluation was carried out by the FCE method. Shunling et al. [21] selected FCE based on the entropy weight method to evaluate instrument utilisation. Li et al. [22] proposed a comprehensive evaluation method for laboratories based on an entropy weight fuzzy matter-element model. Someh et al. [23] proposed an evaluation method to measure the efficiency and ranking of medical diagnostic laboratories by applying a network DEA. Weipeng et al. [24] established a data index system of laboratory input and output, using DEA to make a comprehensive evaluation. Lu et al. [25] constructed a performance evaluation model based on a scorecard with entropy restraint.

The approaches mentioned above were verified by experimental data, and they performed better than traditional evaluation methods. However, they are all restrictive: they often yield conflicting conclusions about efficiency because of unsuitable assumptions [26]. There is still room for improvement in reducing the influence of subjective factors and decreasing uncertain outcomes.

The decision tree technique is an inductive learning technique that can infer a classification from a set of random data samples. Decision tree algorithms such as CART [27] and ID3 [28] use predicted classifications to label the leaves while constructing the tree. Further research has indicated that decision tree models train quickly. As a result, the decision tree is often applied to constructing evaluation models; for instance, Budiman et al. [29] studied student academic evaluation using the C4.5 decision tree.

With powerful storage and self-adaptive learning capability, the BP neural network can process nonlinear data. The BP neural network works by incrementally changing the weights in a network of elements called neurons [30]. Nevertheless, BP neural network training is sensitive to noisy data, although it usually achieves good performance in classifying novel examples. The BP neural network has been applied in various fields, such as evaluation of campus safety, laboratory management, teaching quality and laboratory service. Zhang, Shi, Li and Lu [31, 32, 33, 34] set up evaluation models of college laboratories based on the BP neural network.

Both the decision tree and the BP neural network have their own shortcomings when constructing evaluation models. If only a decision tree is used, the error grows as the depth of the tree increases. If only the BP neural network is used, the choice of network structure depends mainly on subjective experience owing to the lack of theoretical guidance. To let the two algorithms complement each other, it is necessary to combine them into an effective and accurate performance evaluation method. Methodologies combining the two models have been applied in various areas, such as teacher performance evaluation, ship collision risk assessment [35], earthquake infrasonic wave detection [36] and marketing investigation [37]. However, none of these papers focuses on performance evaluation of college laboratories. To make the performance evaluation more effective and accurate, we propose a method that combines the decision tree and the neural network: the decision tree is used to reduce the dimension of the input layer, which increases the convergence speed and improves the prediction precision of the BP neural network.

Decision tree and BP neural network
The decision tree algorithm
Introduction to the decision tree

The decision tree consists of decision-making nodes, state nodes, probability branches and terminal nodes [38]. In general, the decision tree is constructed recursively from top to bottom. The procedure is divided into three steps: splitting nodes, choosing which nodes are terminal nodes and assigning a class label to each terminal node.

The decision tree is drawn like tree branches, with one branch per decision outcome. As an inductive learning algorithm, the decision tree can classify irregular data and present the result in the form of classification rules.

At present, the widely used decision tree generation algorithms are ID3, C4.5 and C5.0. The C4.5 algorithm is more efficient than the others because it achieves a higher recognition rate with fewer nodes. In this paper, the decision tree is generated by the C4.5 algorithm.

Information gain ratio

The pivotal part of building a decision tree is choosing the proper test attribute for splitting nodes. The information gain ratio is used as the classification criterion in the C4.5 decision tree algorithm.

Let S stand for the current training sample set, and suppose the samples in S fall into m classes, marked as C_i (i = 1, 2, ..., m). |S| is the number of samples in S, and r_i is the number of samples belonging to class C_i. The entropy of the class distribution in the training sample set S can then be expressed as follows:
$$I(r_1, r_2, \ldots, r_m) = -\sum_{i=1}^{m} p_i \log_2 p_i, \qquad p_i = \frac{r_i}{|S|} \tag{1}$$

If an attribute A has n different values (a_1, a_2, ..., a_n), the set S can be partitioned into n subsets by the values of A, marked as S_j (j = 1, 2, ..., n); the samples in S_j all share the value a_j of attribute A. Let s_ij denote the number of samples in subset S_j that belong to class C_i (i = 1, 2, ..., m). The uncertainty contributed by class C_i within S_j can be described by the expected information T_ij:
$$T_{ij} = -p_{ij} \log_2 p_{ij}, \qquad p_{ij} = \frac{s_{ij}}{|S_j|} \tag{2}$$

In formula (2), for each subset S_j: p_1j + p_2j + ⋯ + p_mj = 1. The entropy of the subset with the value a_j is marked as I(s_1j, s_2j, ..., s_mj):
$$I(s_{1j}, s_{2j}, \ldots, s_{mj}) = -\sum_{i=1}^{m} p_{ij} \log_2 p_{ij} \tag{3}$$

The information gain Gain(A) is the difference between the entropy before and after splitting on attribute A:
$$Gain(A) = I(r_1, r_2, \ldots, r_m) - E(A) \tag{4}$$

In formula (4), E(A) represents the expected information requirement after splitting S on attribute A, i.e. the weighted average entropy of the subsets:
$$E(A) = \sum_{j=1}^{n} w_j\, I(s_{1j}, s_{2j}, \ldots, s_{mj}), \qquad w_j = \frac{s_{1j} + s_{2j} + \cdots + s_{mj}}{|S|} \tag{5}$$

The gain ratio Ratio(A) is given as follows:
$$Ratio(A) = \frac{Gain(A)}{Split(A)} \tag{6}$$

In formula (6), Split(A) represents the split information, which can be calculated as follows:
$$Split(A) = -\sum_{j=1}^{n} p'_j \log_2 p'_j, \qquad p'_j = \frac{|S_j|}{|S|} \tag{7}$$

In formula (7), Split(A) is the entropy of the training samples with respect to the values of attribute A; it is the information generated by the split itself and penalises attributes that partition the data into many small subsets.
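As a concrete illustration, formulas (1)-(7) can be computed with a short Python sketch (function and variable names here are illustrative, not from the paper):

```python
import math
from collections import Counter

def entropy(labels):
    """Formulas (1)/(3): entropy of a class distribution."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Formulas (4)-(7): C4.5 gain ratio of one attribute.

    values -- the attribute value of each sample (a_1, ..., a_n)
    labels -- the class of each sample (C_1, ..., C_m)
    """
    total = len(labels)
    subsets = {}
    for v, c in zip(values, labels):
        subsets.setdefault(v, []).append(c)
    # E(A): weighted average entropy of the subsets, formula (5)
    expected = sum(len(s) / total * entropy(s) for s in subsets.values())
    gain = entropy(labels) - expected          # formula (4)
    # Split(A): entropy of the partition itself, formula (7)
    split = -sum((len(s) / total) * math.log2(len(s) / total)
                 for s in subsets.values())
    return gain / split if split > 0 else 0.0  # formula (6)
```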

Decision tree generation algorithms

In the process of constructing a decision tree, the training data are repeatedly divided until each subset contains only a single class. As a consequence, the tree becomes larger and more complex and may over-fit the noise in the training data. It is therefore necessary to prune the tree to decrease its complexity without reducing classification accuracy; this also makes the decision tree more effective and training much faster.

Basic pruning strategies are divided into two types: pre-pruning and post-pruning. Pre-pruning evaluates each node before splitting during the construction of the decision tree; if splitting fails to improve the generalisation capability of the tree, splitting is stopped and the node is labelled a leaf node. Post-pruning is done after the decision tree has been generated; if generalisation capability improves when a non-leaf node is replaced by a leaf node, the node is replaced. In general, the under-fitting risk of post-pruning is small, and the generalisation performance of post-pruned decision trees is usually better than that of pre-pruned ones. Reduced error pruning (REP) [39], one of the post-pruning methods, is adopted in this paper; a sketch follows below.
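The following is a minimal sketch of REP, assuming a simple dict-based tree representation rather than the paper's actual implementation; a subtree is replaced by a majority-class leaf whenever that does not increase the error on a held-out pruning set:

```python
from collections import Counter

def predict(node, x):
    # leaf nodes carry a class label; internal nodes test one attribute
    # (assumes every attribute value in the pruning data appears in the tree)
    while "label" not in node:
        node = node["branches"][x[node["attr"]]]
    return node["label"]

def errors(node, samples):
    return sum(predict(node, x) != y for x, y in samples)

def rep_prune(node, samples):
    """Reduced error pruning, bottom-up, against a held-out pruning set."""
    if "label" in node or not samples:
        return node
    # prune the children first, routing each pruning sample down its branch
    for v, child in node["branches"].items():
        subset = [(x, y) for x, y in samples if x[node["attr"]] == v]
        node["branches"][v] = rep_prune(child, subset)
    # replace this subtree by a majority-class leaf if that is no worse
    majority = Counter(y for _, y in samples).most_common(1)[0][0]
    leaf = {"label": majority}
    return leaf if errors(leaf, samples) <= errors(node, samples) else node
```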

The BP neural network
Introduction to the BP neural network

The BP neural network is a multi-layer feed-forward neural network trained by the error back-propagation algorithm. Its structure consists of one input layer, at least one hidden layer and one output layer. Neurons in adjacent layers are fully connected by weights, while there are no connections between neurons within the same layer. The computational process involves two stages: forward propagation and back propagation. In forward propagation, input information is carried from the input layer to the hidden layer and from the hidden layer to the output layer; the nonlinear processing is accomplished by applying an activation function to the summed inputs of each neuron. Back propagation starts if the desired output cannot be achieved at the output layer: the network continuously adjusts the connection weights and thresholds according to the error signal, so that the output error approaches the required precision [40].

The construction of the BP neural network

The construction of the BP neural network is a three-phase process: network initialisation, forward propagation and back propagation. The detailed steps of the algorithm are described as follows:

Step 1: Enter the learning samples (X_i, Y_i) (i = 1, 2, ..., n), where X_i and Y_i stand for the input vector and the output vector of the learning samples, respectively.

Step 2: Determine the number of neurons in each layer and establish the connection weight matrix between adjacent layers, $M^l = [m^l_{ij}]$, where $M^l$ is the weight matrix between layer l and layer (l + 1) (l = 1, 2, ..., L) and $m^l_{ij}$ is the weight connecting node i of layer l to node j of layer (l + 1).

Step 3: Calculate the output of each node, $\hat{Y}_j = f\left(\sum_i m^l_{ij} I_i + T_j\right)$, where $I_i$ and $T_j$ are the inputs to the node and its threshold value, respectively.

Step 4: Compute the mean square error over the output nodes:
$$MSE = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2 \tag{8}$$

Step 5: Stop calculating when the MSE meets the forecast error ε; otherwise, go to Step 6.

Step 6: Adjust the connection weight matrices from layer 1 to layer (L + 1) by the delta rule:
$$\delta = (Y - \hat{Y})\,f'(I), \qquad \Delta M = \eta\,\delta\,I, \qquad M \leftarrow M + \Delta M \tag{9}$$

Step 7: Return to Step 4 and stop when the MSE meets the desired value.
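The steps above correspond to the following minimal NumPy sketch for a network with a single hidden layer (a sketch under assumed dimensions; thresholds are omitted for brevity):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_bp(X, Y, hidden=7, eta=0.5, eps=1e-4, max_iter=800, seed=0):
    """Steps 1-7: forward propagation, MSE check, delta-rule weight updates."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))  # input -> hidden
    W2 = rng.normal(scale=0.5, size=(hidden, Y.shape[1]))  # hidden -> output
    for _ in range(max_iter):
        # forward propagation (Step 3)
        H = sigmoid(X @ W1)
        Y_hat = sigmoid(H @ W2)
        err = Y - Y_hat
        if float(np.mean(err ** 2)) < eps:   # Steps 4-5: stop when MSE < eps
            break
        # back propagation (Step 6); sigmoid derivative is y * (1 - y)
        d2 = err * Y_hat * (1 - Y_hat)
        d1 = (d2 @ W2.T) * H * (1 - H)
        W2 += eta * H.T @ d2
        W1 += eta * X.T @ d1
    return W1, W2
```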

Performance evaluation for college laboratories
Performance evaluation method

Generally speaking, the convergence rate of the neural network slows down when the number of input attributes is too large, which also increases the risk of overfitting. Therefore, the input attributes must be reduced before they are fed into the BP neural network. The C4.5 decision tree algorithm is used to choose the optimal attribute set with high weight on the basis of the information gain ratio, and those attributes are then used as input nodes when constructing the neural network. Once the network has been trained on a sufficient number of training samples, the prediction network is obtained. After testing and adjustment, the prediction network can be used to predict the result of the performance evaluation.

The performance evaluation method is shown in Figure 1.

Fig. 1

Procedure of performance evaluation for laboratories with the use of the decision tree and BP neural network.

The performance evaluation algorithm is described as follows:

Step 1: Determine the evaluation indexes.

Step 2: Pre-process the evaluation indexes using fuzzy mathematics so that all values fall within the range [0, 1].

Step 3: Generate a decision tree using the C4.5 algorithm.

Step 4: Select the evaluation indexes with high weight according to the decision tree.

Step 5: Determine the optimised BP neural network structure as described in Section 3.2.2, and then train the network. The indexes obtained in Step 4 are the input of the network, and the evaluation results from expert consultation are the output.

Step 6: Test the network and check whether it achieves the expected accuracy. Stop training when the expected error or the maximum number of iterations reaches the desired value.

After that, the results of performance evaluation will be output if evaluation indexes are input into the BP neural network.
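Putting the steps together, a minimal sketch of the fusion is given below; it reuses the gain_ratio and train_bp helpers sketched in Section 3, and the data-layout names are illustrative, not the paper's actual code:

```python
def select_indexes(index_columns, labels, top_k=11):
    """Steps 3-4: rank evaluation indexes by gain ratio and keep the top_k."""
    ranked = sorted(index_columns,
                    key=lambda name: gain_ratio(index_columns[name], labels),
                    reverse=True)
    return ranked[:top_k]

# Steps 5-6 (illustrative): the selected indexes become the network input
# and the expert evaluation results the target output, e.g.
#   selected = select_indexes(columns, grades)   # columns: name -> values
#   W1, W2 = train_bp(X[:, [col_pos[s] for s in selected]], Y)
# where col_pos is a hypothetical mapping from index name to column position.
```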

Evaluation system of college laboratories

A reasonable evaluation system is the prerequisite of a scientific evaluation method. After analysing the ISO 9000 family of standards and combining it with many years' experience in performance evaluation, we established an evaluation system consisting of six primary indexes and 24 secondary indexes. The evaluation system is described in Table 1.

Evaluation system of college laboratories.

Primary indexes | Secondary indexes
Construction

1. Area and the environment

2. Instruments and equipment

3. Operation and maintenance

4. System and management

Laboratory team building

5. Tutors of experiment

6. Laboratory team construction

7. Personnel structure

8. Appraisal mechanism

9. Training mechanism

Experimental teaching

10. Practice ability

11. Exam of experiment

12. Report of experiment

13. Comprehensive and designed experiments

Administration system

14. System and management

15. Management tool

16. Experiment teaching material

17. Service efficiency

Laboratory safety

18. Safety measures

19. Hazmat management

20. Experimental environment protection

21. Clean and tidy

Innovation and entrepreneurship

22. Personnel structure proportions

23. Innovative entrepreneurship

24. Experiment project for college student

Pre-process the data

The data we used came from our previous research, in which a great deal of performance evaluation data for college laboratories had been accumulated. Each evaluation index is expressed as a score between 0 and 100. There are two kinds of evaluation indexes, namely quantitative indexes and qualitative indexes. Quantitative index scores can be given directly according to the scoring criteria, while qualitative index scores are given by experts according to the specific conditions and performance of the laboratory. After the laboratories have been scored, the expert opinions on each index are aggregated by the arithmetic mean:
$$q_n = \frac{1}{T}\sum_{s=1}^{T} S_{ns} \tag{10}$$
where q_n is the aggregated score of index n, S_ns is the score of index n given by expert s, s is the expert serial number and T is the number of questionnaires.

Before being processed by the decision tree, the scores must be pre-processed so that they range from 0 to 1. The scores of the indexes are shown in Table 2; laboratory numbers are in the first column and the index scores in the remaining columns.

The scores of the indexes.

Laboratory number Index 1 Index 2 Index 3 Index 4 Index 5

1 72 84 76 72 75.6
2 77 78 85 85 81.2
3 86 94 75 86 84.3
4 87 71 82 82 81.4
5 82 63 88 76.5 80.2
6 71 83 89 80 80.6
7 85 94 92 88.5 89.6
8 81 60 68 74.5 71.6
9 79 55 74 66.5 70.2
10 91 76 90 89 87.3

For the sake of computational simplicity, fuzzy mathematics was used to pre-process the data so that the evaluation scores can be converted to a grade system. For example, scores of 85 and above are defined as 'excellent (A)', scores between 75 and 84 as 'good (B)', scores between 65 and 74 as 'general (C)' and scores below 65 as 'poor (D)'. As a result, the scores in Table 2 can be converted to the grade system shown in Table 3.

Grade system of evaluation indexes.

Laboratory number Index 1 Index 2 Index 3 Index 4 Index 5

1 C B B C B
2 B B A A B
3 A A B A B
4 A C B B B
5 B D A B B
6 C B A B B
7 A A A A A
8 B D C B C
9 B D C C C
10 A B A A A

However, the grades in Table 3 cannot reflect the continuity of the underlying scores. For example, a score of 74 is converted to 'C' because it is below 75, while a score of 75 is converted to 'B'; a one-point difference thus leads to different grades, which is misleading. Therefore, the original data must be fuzzified. Considering the numerical characteristics, we use trapezoidal membership functions to calculate the membership degrees.

A trapezoidal fuzzy function is a distribution function drawn from probability theory that makes the evaluation more scientific [41]. Let 'Excellent (A)', 'Good (B)', 'General (C)' and 'Poor (D)' be a set of fuzzy terms, each described by a trapezoidal membership function. The membership functions used in this paper are as follows:
$$A(x) = \begin{cases} 0, & 0 \le x \le 81 \\ (x-81)/8, & 81 < x \le 89 \\ 1, & 89 < x \le 100 \end{cases} \qquad
D(x) = \begin{cases} 1, & 0 \le x \le 61 \\ 1-(x-61)/8, & 61 < x \le 69 \\ 0, & 69 < x \le 100 \end{cases}$$
$$B(x) = \begin{cases} 0, & 0 \le x \le 71 \\ (x-71)/8, & 71 < x \le 79 \\ 1, & 79 < x \le 81 \\ 1-(x-81)/8, & 81 < x \le 89 \\ 0, & 89 < x \le 100 \end{cases} \qquad
C(x) = \begin{cases} 0, & 0 \le x \le 61 \\ (x-61)/8, & 61 < x \le 69 \\ 1, & 69 < x \le 71 \\ 1-(x-71)/8, & 71 < x \le 79 \\ 0, & 79 < x \le 100 \end{cases}$$
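These membership functions translate directly into code; the small sketch below is illustrative and is checked against the definitions above, not against the paper's software:

```python
def grade_memberships(score):
    """Trapezoidal membership degrees of a 0-100 score in grades A, B, C, D."""
    def rise(x, lo, hi):
        # rising edge: 0 below lo, linear on (lo, hi], 1 above hi
        return min(max((x - lo) / (hi - lo), 0.0), 1.0)
    a = rise(score, 81, 89)
    b = rise(score, 71, 79) - rise(score, 81, 89)
    c = rise(score, 61, 69) - rise(score, 71, 79)
    d = 1.0 - rise(score, 61, 69)
    return {"A": a, "B": b, "C": c, "D": d}

# e.g. grade_memberships(84) -> {'A': 0.375, 'B': 0.625, 'C': 0.0, 'D': 0.0},
# matching the first row of Table 4 for index 2 (score 84 in Table 2)
```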

The original scores in Table 2 were processed according to the trapezoidal membership functions; the results are shown in Table 4. Due to space constraints, only the membership values of the first four indexes are listed.

The data after fuzzy processed.

Index 1 Index 2 Index 3 Index 4
A B C D A B C D A B C D A B C D
0 0 1 0 0.375 0.625 0 0 0 0.625 0.375 0 0 0 1 0
0 0.875 0.125 0 0 0.875 0.125 0 0.625 0.375 0 0 0.625 0.375 0 0
0.75 0.25 0 0 1 0 0 0 0 0.625 0.375 0 0.75 0.25 0 0
0.625 0.375 0 0 0 0 1 0 0.125 0.875 0 0 0.125 0.875 0 0
0 0 0.875 0.125 0 0 0.75 0.25 0.75 0.25 0 0 0 0.625 0.375 0
0 0 1 0 0.75 0.25 0 0 1 0 0 0 0.375 0.625 0 0
Selecting the indexes with high weight

The decision tree was constructed over the evaluation indexes shown in Table 1. According to the performance evaluation system, the data samples comprise the 24 secondary indexes after the pre-processing described above. The information gain ratio of every evaluation index was calculated by the C4.5 decision tree algorithm, and 11 indexes with high weight were obtained from the results of the calculation. The information gain ratios are shown in Table 5.

Evaluation indexes with high weight and their information gain ratios.

Number Indexes Information gain ratio
1 Area and environment 33.51%
2 Instruments and equipment 28.46%
3 Operation and maintenance 27.63%
4 System and management 25.32%
5 Practice ability 21.25%
6 Service efficiency 20.87%
7 Personnel structure 19.32%
8 Comprehensive and designed experiments 19.87%
9 Experiment project for college student 18.56%
10 Hazmat management 17.62%
11 Innovative entrepreneurship 16.93%
Construct the optimised BP neural network

The indexes selected by the decision tree serve as the input of the neural network, and the corresponding evaluation results from experts serve as the output. The BP neural network was constructed using Python and the TensorFlow tools.

The sigmoid function was chosen as the network activation function in this paper:
$$f(x) = \frac{1}{1 + e^{-x}}, \qquad f(x) \in (0, 1) \tag{11}$$

Delta learning rules were adopted to speed up convergence, reduce the error rate and avoid local minima. The weight update derived from the error signal is calculated as follows:
$$\Delta w_{ij}(m) = \eta\,\big(y_i(m) - P_j(m)\big)\,Q_i(m) \tag{12}$$

In formula (12), Δw_ij(m) represents the change in the connection weight between neuron i and neuron j when the input vector is x_m; η and y_i are the learning rate and the expected output of neuron i, respectively; and Q_i and P_j represent the activation values of neuron i and neuron j. Experimental results show that the error reaches 0.0001 with η = 0.5 after the neural network has been trained 800 times.

Theoretically, a BP neural network with a single hidden layer can approximate any rational function. Increasing the number of hidden layers makes the network more complex, which increases the training time of the neural network and reduces training efficiency.

It has been proved theoretically by Hecht-Nielsen that any continuous function on a closed interval can be approximated by a BP network with one hidden layer, so a three-layer BP network can perform any mapping from M dimensions to N dimensions [42].

Studies have shown that there is no mature theory to accurately determine the number of neurons in the hidden layer; it is therefore generally determined by empirical formulas. A commonly used empirical formula is:
$$C = \sqrt{m + n} + a, \qquad a \in [1, 10] \tag{13}$$

Here, C is the number of neurons in the hidden layer, m is the number of neurons in the output layer and n is the number of neurons in the input layer. Experiments show that the overall performance of the neural network is highest when m = 1, n = 11 and a = 4; that is, there are 7 neurons in the hidden layer. The structure of the BP neural network is shown in Figure 2.

Fig. 2

The structure of the BP neural network.
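The paper reports using Python and TensorFlow without listing code; a plausible minimal sketch of the described 11-7-1 sigmoid network in Keras follows (the optimiser setting is an assumption based on the reported η = 0.5):

```python
import tensorflow as tf

# 11 selected indexes in, 7 sigmoid hidden neurons, 1 evaluation score out
model = tf.keras.Sequential([
    tf.keras.layers.Dense(7, activation="sigmoid", input_shape=(11,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.5),
              loss="mse")
# model.fit(X_train, y_train, epochs=800) on the pre-processed index scores
```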

Verification and comparison
Test the prediction accuracy

To test the prediction accuracy of the trained network, a series of experiments were carried out. The training sample set was first discretised based on information entropy, and a randomisation algorithm was used to shuffle the sample set. Then 75% of the samples were selected as training samples and the remaining 25% as test samples.
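The split can be reproduced with, for example, scikit-learn (shown as one possibility; the paper does not name its tooling for this step):

```python
from sklearn.model_selection import train_test_split

# shuffle the sample set, then hold out 25% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, shuffle=True, random_state=42)
```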

The comparison of real values and predicted values is shown in Table 6. The simulation results are very close to the real values, which shows that the method is feasible and effective in performance evaluation of college laboratories.

Comparison of real values and predicted values.

Laboratory No. 1 2 3 5 6 7
Real values 82 76 96 65 85 92
Predicted values 82.3 76.2 95.8 62.3 85.1 91.8
Comparison of three models

Three experiments were carried out, each using the same data samples to ensure fairness of the results. The first experiment adopted a performance evaluation model using a single decision tree, the second adopted a model using a single BP neural network, and the third adopted the method proposed in this paper. Each experiment was repeated 50 times, and the average value was taken as the experimental result. The comparison of experimental results is shown in Table 7. The average accuracy of the method combining the decision tree and BP neural network is 86.28%. The data in the table indicate that the model introduced in this article is superior to the other two models, which verifies the accuracy and rationality of the evaluation model.

Comparison of experimental results.

Sequence | Number of training samples | Decision tree accuracy (%) | BP neural network accuracy (%) | Decision tree + BP neural network accuracy (%)
1 | 300 | 76.4 | 73.8 | 81.6
2 | 400 | 78.2 | 72.5 | 82.5
3 | 500 | 81.6 | 77.6 | 84.3
4 | 600 | 82.1 | 79.3 | 86.1
5 | 700 | 81.6 | 80.4 | 88.5
6 | 800 | 82.1 | 81.5 | 86.2
7 | 900 | 83.4 | 82.9 | 89.4
8 | 1000 | 84.5 | 83.2 | 91.6
Average value | | 81.24 | 78.9 | 86.28

The neural network error curve comparison is shown in Figure 3. The figure shows that the initial training error of the proposed method is lower than that of the other two methods and that it reaches the minimum error first, which indicates that the classification accuracy of the model is improved.

Fig. 3

Neural network error curve comparison.

Conclusion

Finding a scientific qualitative and quantitative evaluation method is a challenging task in college laboratory performance evaluation. Although the decision tree technique is fast and accurate in classification prediction and the BP neural network offers highly nonlinear mapping ability and self-learning ability, both have shortcomings when used alone to construct evaluation models. For this reason, an evaluation model combining the decision tree with the BP neural network is proposed in this article. The model overcomes the shortcomings of each separate model, reduces the disturbance of human factors and improves the accuracy of the evaluation.

The research in this paper demonstrates some obvious advantages, but it can be improved in the following aspects:

There is still room for improving the decision tree and BP neural network algorithm adopted in this paper, such as the improvement of the split index of decision tree and the improvement of convergence speed of the BP neural network.

Some evaluation indexes are difficult to quantify during the processes of evaluation, such as laboratory rules and regulations. These evaluation indexes are not selected, which may affect the results of evaluation to some extent. Therefore, the evaluation index system needs to be optimised in future works.
