Journal Details
License
Format
Journal
eISSN
2444-8656
First Published
01 Jan 2016
Publication timeframe
2 times per year
Languages
English
Open Access

The Technical Research on the Assessment of Network Security Situation Based on D-S Evidence Theory

Published Online: 15 Jul 2022
Volume & Issue: AHEAD OF PRINT
Page range: -
Received: 16 Feb 2022
Accepted: 18 Apr 2022
Introduction

With the rapid development of computer technology and the continuing expansion of the Internet, network security problems are becoming increasingly prominent, and the number of security vulnerabilities and security incidents is rising rapidly. How to keep the network operating normally has therefore become a critical problem. Network security situation awareness technology [1] addresses this problem by making full use of existing network security technologies: it coordinates and manages various security equipment, perceives the overall network security situation, identifies attacks by integrating all available information, and selects an appropriate security defense mechanism.

Existing research on overall network security situation assessment methods falls mainly into the following three categories:

Methods based on probability statistics, typically Bayesian inference, the hidden Markov model and the analytic hierarchy process;

Methods based on logical reasoning, typically D-S evidence theory and fuzzy logic;

Methods based on neural networks, typically the BP neural network and the radial basis function network [2].

Fang Yan [3] proposed a quantitative evaluation method for network security based on Bayesian networks. She combined Bayesian networks with attack graphs to assess the overall and local network situation, and the assessment result is reasonably objective and accurate. Chen Hong and her team [4] put forward a security situation assessment system that combines multi-source data on the basis of D-S evidence theory. The system considers the information of both nodes and links, and integrates the multi-source data with D-S evidence theory to obtain the overall security situation of the network.

Zhang Yong [5] proposed a method that improves the hidden Markov model. The model divides the situation awareness process into three levels: thematic level, element level and overall level, and each level adopts a different feature fusion method. The hidden Markov model is applied at the thematic level and the Markov game model at the element level. The improved method is more objective and efficient. However, the assessment at the overall level remains subjective, since the ratios are allocated manually.

C. Wang [6] proposed a situation assessment model based on information fusion that uses an improved D-S evidence theory. For threat assessment, D-S evidence theory and a dynamic Bayesian network were used to integrate multi-source sensor data. Wang Yongwei and his team [7] proposed a situation assessment model based on an improved D-S evidence theory, which includes four stages: rule measurement, evidence modification, rule integration and situation decision. Evidence is integrated by calculating the degree of dissimilarity among pieces of evidence, which completes the feature fusion and the calculation of the security situation value. Compared with the traditional D-S evidence theory, this method avoids the paradox problem that arises when conflicting evidence is fused, and its results are more objective. Tang Yongli and his team [8] proposed using a Back Propagation (BP) neural network optimized by a genetic algorithm to obtain the basic probability assignment (BPA) of D-S evidence theory.

The research status at home and abroad shows that most existing results rely on a single method and on a large share of subjective factors, which reduces the accuracy of situation assessment. Building on traditional network security technology, this paper sets up a three-layer evaluation index system and puts forward an overall network security situation evaluation model based on the hidden Markov model, the PageRank algorithm and D-S evidence theory, in which different fusion algorithms evaluate the situation at different fusion stages. Compared with a single fusion algorithm, this method is more accurate and comprehensive. Its validity and accuracy are verified by comparison experiments.

Technical framework and assessment indicators
Multi-level assessment framework for the overall situation

In order to assess the network security situation as fully as possible, this paper establishes a multilevel assessment framework, which is shown in Figure 1. Because network security situation assessment draws on many indicators, the data extracted from the various data sources are taken as input, and the situation is assessed first at the factor level and then at the overall level.

Figure 1

the technical framework of overall situation assessment

The original data layer includes three data categories. The first consists of system logs, security device logs, alarm logs and application logs, from which threat events can be extracted. The second consists of vulnerability scanning results, from which the vulnerability information of nodes can be extracted. The third is the service information of the nodes, from which all assets contained in the nodes can be counted.

The extraction layer of situation indicator data: it extracts the indicator data needed by the factor layer from the original data layer.

The factor layer: first, it uses the hidden Markov model to integrate the multi-source heterogeneous network security data, so as to obtain the threat situation weight, vulnerability situation weight and asset situation weight of each node. Then, it uses the PageRank algorithm to assign a weight to each node. The threat, vulnerability and asset situation weights of the whole network are then obtained by weighting.

The overall layer: it uses D-S evidence theory to integrate the three situation indicators, namely network threat, vulnerability and assets, at the decision-making level to obtain the overall network security situation.

Assessment indicators

The assessment indicators of network security situation should cover the internal components of the network and the relationships among them. Network data can generally be divided into two types: information reflecting the actual operation status of the network system, and information reflecting the security of the network system. On this basis, this paper constructs three first level indicators, namely threat, vulnerability and assets.

Threat is an indicator that reflects the attacks the network suffers. It can be obtained from the logs or alarms of various security devices, such as intrusion detection systems, firewalls and web firewalls.

Vulnerability is an indicator that reflects the vulnerabilities present in network equipment systems and network services, as well as their impact.

Assets is an indicator that reflects the asset information of each device on the network and its importance level.

Each first level indicator is represented by a number of second level indicators. The index system of network security situation is listed in Table 1.

Table 1

The index system of network security situation

| First level indicators | Second level indicators |
|---|---|
| Threat | The types and the number of attack events |
| | The frequency of attack events |
| | The danger level of attack events |
| Vulnerability | Total number of vulnerabilities |
| | The number of vulnerabilities on different levels |
| | The probability of vulnerabilities being successfully exploited |
| Assets | The number of services on target nodes |
| | The value level of each service on target nodes |

Overall network situation assessment
The assessment of node situation weight based on hidden Markov model

On the basis of a comprehensive analysis of the indicators used in situation assessment, the Hidden Markov Model (HMM) [9] is adopted as the assessment method for the first level indicators. It is used to assess the three situation elements of network nodes, namely threat, vulnerability and assets. Take the threat indicator as an example. First, a hidden Markov model is set up for the threat indicator of each node. Second, the parameters of the model are estimated with the Baum-Welch algorithm [10]. Then, the threat situation weight of each node is evaluated with the Viterbi algorithm [11]. The vulnerability and asset indicators are calculated in the same way.

The situation weights of threat, vulnerability and assets on each node can be regarded as the hidden states of the hidden Markov model, and their change over time can be regarded as a Markov chain. The network attack information, vulnerability information and asset information of the nodes, which can be obtained through various detection methods, form the observation sequence of the model. The problem can therefore be described as inferring the hidden security state of a node from the observation sequence, which is why the hidden Markov model is suitable for this assessment.

The establishment of assessment model

First, to evaluate the threat situation weight of each node, the hidden Markov model is set up as follows:

(1) State space $S = \{1, 2, 3, 4, 5\}$. It describes the threat status of a node and measures the severity of the attack on the node. The larger the value, the more dangerous the node.

(2) Observation state space $O = \{1, 2, \ldots, L\}$, where $L$ is the total number of observation states. For the second level indicators required to evaluate the threat situation weight, three data sources are assumed: IDS alarm data, Web attack data and abnormal access data of firewalls, each of which has two risk levels, mild and severe. Six second level assessment indicators can therefore be extracted from these three data sources: the number of mild IDS alarms, the number of severe IDS alarms, the number of mild Web attacks, the number of severe Web attacks, the number of mild abnormal firewall accesses, and the number of severe abnormal firewall accesses, marked as $r = \{r_1, r_2, \ldots, r_6\}$, where $r_i$ ($1 \le i \le 6$) is the statistical count of second level indicator $i$. To reduce the dimension of the observation state space $O$, a fuzzy statistical method is adopted: each $r_i$ takes one of the three statistical values listed in Table 2, each corresponding to a range of alarm counts. The ranges can be adjusted according to the actual situation.

With the six second level assessment indicators above and three value ranges for each indicator, the observation state space $O$ has $3^6 = 729$ states, that is, $O = \{1, 2, \ldots, 3^6\}$.

(3) Observation sequence $Y = \{y_1, y_2, \ldots, y_T\}$, with $y_1, y_2, \ldots, y_T \in O$. It is the sequence formed by the IDS alarms and system log data collected by the system from the initial time $t = 1$ to the current time $T$.

(4) Initial probability distribution $\pi = \{\pi_1, \pi_2, \pi_3, \pi_4, \pi_5\}$. It is the probability distribution of the node's threat situation over the five states at the initial moment $t = 1$.

(5) Hidden state transition matrix $A = [a_{ij}]_{5 \times 5}$, where $a_{ij} = P(X_{t+1} = j \mid X_t = i)$, $i, j \in S$, is the probability that the node's threat situation weight is $j$ at time $t + 1$ given that it is $i$ at time $t$.

(6) Observation state matrix $B = [b_{jk}]_{5 \times L}$, where $b_{jk} = P(Y_t = k \mid X_t = j)$, $j \in S$, $k \in O$, and $L$ is the number of observation states. It is the probability that observation state $k$ is generated at time $t$ when the node's threat situation weight is $j$.

Table 2

The correspondence between the statistical value of each second level indicator and the number of threat alarms

| The statistical value of each second level indicator | The number of threat alarms |
|---|---|
| Less | [0, 20) |
| Medium | [20, 50) |
| More | [50, +∞) |

This completes the model of the node's threat weight, which is denoted as $\lambda = \{\pi, A, B\}$. The hidden state matrix $A$ and the observation state matrix $B$ are generally unknown. To eliminate the influence of subjective factors, the model must be trained so that these two parameters can be estimated from data.
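The mapping from raw alarm counts to an observation state can be sketched as follows. This is a minimal illustration that uses the thresholds of Table 2; the base-3 positional encoding of the levels into a single index and the function names `level` and `observation_state` are assumptions made here for illustration, since the paper does not specify how the levels are numbered.

```python
from bisect import bisect_right

# Thresholds from Table 2: Less [0, 20), Medium [20, 50), More [50, +inf)
THRESHOLDS = (20, 50)


def level(count):
    """Map an alarm count to 0 (Less), 1 (Medium) or 2 (More)."""
    return bisect_right(THRESHOLDS, count)


def observation_state(counts):
    """Encode the indicator counts as one state in {1, ..., 3**len(counts)}.

    Assumption: the levels are read as the digits of a base-3 number,
    with the first indicator as the most significant digit.
    """
    index = 0
    for count in counts:
        index = index * 3 + level(count)
    return index + 1  # observation states are numbered from 1
```

With six indicators the function returns values from 1 to 729, and with the three indicators used in the experiment it returns values from 1 to 27.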

The model training

Generally, the Baum-Welch algorithm is used for training. During training, the observation sequence $Y = \{y_1, y_2, \ldots, y_T\}$ grows longer as time goes by, so training from time $t = 1$ every time becomes increasingly expensive. This paper therefore adopts a sliding window algorithm [12], in which the window length is set as $N$ and only the observation data inside the current window are used in each training run. Training on one window yields a result $\lambda_i$, and training on the next window yields a result $\lambda_{i+1}$. The two are then combined in a certain proportion: $\lambda_{i+1}^* = w_1 \cdot \lambda_i + w_2 \cdot \lambda_{i+1}$. This is one complete training pass of the hidden Markov model. The method effectively reduces the training complexity and improves training efficiency. The algorithm flow is shown in Figure 2.

Figure 2

HMM training through sliding window algorithm
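The window-by-window combination $\lambda_{i+1}^* = w_1 \cdot \lambda_i + w_2 \cdot \lambda_{i+1}$ can be sketched as below. The paper does not state how $w_1$ and $w_2$ are chosen or whether windows overlap; the sketch assumes $w_1 + w_2 = 1$ (so that every probability row stays normalized) and non-overlapping windows, and the names `merge_models` and `sliding_windows` are hypothetical.

```python
import numpy as np


def merge_models(prev, new, w1=0.5, w2=0.5):
    """Blend two HMM parameter sets: lambda* = w1 * prev + w2 * new.

    Each model is a dict with keys "pi", "A" and "B".
    Assumption: w1 + w2 == 1, which keeps every row a probability distribution.
    """
    if abs(w1 + w2 - 1.0) > 1e-12:
        raise ValueError("w1 + w2 must equal 1")
    return {
        key: w1 * np.asarray(prev[key], dtype=float)
        + w2 * np.asarray(new[key], dtype=float)
        for key in ("pi", "A", "B")
    }


def sliding_windows(observations, n):
    """Yield consecutive windows of length n (assumed non-overlapping)."""
    for start in range(0, len(observations) - n + 1, n):
        yield observations[start:start + n]
```

In use, each window would be passed to a Baum-Welch trainer and the result merged into the running model, so the cost of one update depends on $N$ rather than on the full sequence length $T$.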

Once the parameters of the hidden Markov model have been obtained, the hidden state sequence can be solved. This is the decoding problem of the hidden Markov model, and it is solved with the Viterbi algorithm. Given the model parameters and an observation sequence, the Viterbi algorithm searches for the most probable hidden state sequence. It uses dynamic programming: the best partial paths are computed step by step, and the best path of security situation change is then recovered by backtracking from the last time step to the first. Please refer to [13] for the specific process of the algorithm.
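A compact version of the Viterbi decoding step might look like the following. It is a generic textbook sketch in log space, not the specific procedure of [13]; states and observations are indexed from 0 here, whereas the paper numbers them from 1.

```python
import numpy as np


def viterbi(obs, A, B, pi):
    """Return the most probable hidden state path for the observations.

    obs: sequence of observation indices (0-based)
    A:   (n, n) transition matrix, B: (n, L) observation matrix, pi: (n,)
    """
    A, B, pi = (np.asarray(x, dtype=float) for x in (A, B, pi))
    T, n = len(obs), A.shape[0]
    # Work in log space; zero probabilities become -inf without a warning.
    with np.errstate(divide="ignore"):
        log_a, log_b, log_pi = np.log(A), np.log(B), np.log(pi)
    delta = np.empty((T, n))           # best log score of a path ending in each state
    psi = np.zeros((T, n), dtype=int)  # best predecessor of each state
    delta[0] = log_pi + log_b[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_a  # scores[i, j]: come from i, go to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_b[:, obs[t]]
    # Backtrack from the best final state to the first time step.
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = psi[t + 1, path[t + 1]]
    return path.tolist()
```

Adding 1 to each entry of the returned path gives a state in $S = \{1, \ldots, 5\}$ for every time step when the model has five hidden states.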

Evaluation algorithm of node situation weight

The evaluation algorithm using hidden Markov model to evaluate node's threat, vulnerability and assets situation weights is shown as follows:

Input: the observation sequence, which consists of the attack threat information, vulnerability information and asset information extracted by the extraction layer of situation indicator data.

Output: the threat, vulnerability and asset situation weight values of each network node.

Algorithm steps:

(1) Build the threat situation model of each node based on the hidden Markov model;

(2) Train the hidden Markov model with the Baum-Welch algorithm over sliding windows;

(3) Find the optimal hidden state sequence of the hidden Markov model with the Viterbi algorithm, and then calculate the threat situation weight of the node;

(4) Apply the same method as in steps (2) and (3) to obtain the vulnerability and asset situation weight values of the nodes.

This completes the training and evaluation algorithms of the hidden Markov model. The overall evaluation process of the node's threat, vulnerability and asset situation weights is shown in Figure 3.

Figure 3

The overall process of situation weight assessment

The purpose of the situation assessment method based on the hidden Markov model is to evaluate the threat, vulnerability and asset situation weight values of each node. Taking attack threat information, vulnerability information and asset information as data sources, the Baum-Welch algorithm over sliding windows is used to train the model, and the Viterbi algorithm is used to evaluate the situation weights of the nodes.

The assessment technology of network situation weights based on PageRank algorithm

In this section, building on the node-level threat, vulnerability and asset situation weights evaluated by the hidden Markov model, the PageRank algorithm is introduced into multi-node network security situation assessment in order to calculate node weights in a complex network environment [14, 15]. With this algorithm, the weight of each node can be calculated, and the threat, vulnerability and asset situation weights of the whole network can then be obtained by weighting.

The connection relations of the nodes can be obtained from the network topology information, which can be abstracted into a directed graph. Assume the total number of nodes is $N$, and that a node $h_i$ has $j$ relations, namely $l(1, i), l(2, i), \ldots, l(j, i)$. The weight $W(h_i)$ of the network node is then calculated as:

$$W(h_i) = \alpha \sum_{h_j \in M_{h_i}} \frac{W(h_j)}{Out(h_j)} + \frac{1 - \alpha}{N}$$

In this formula, $M_{h_i}$ is the set of nodes that link to node $h_i$, $Out(h_j)$ is the number of out-links of node $h_j$, and $\alpha$ is the damping coefficient, which can be determined through experiments. The other quantities can be obtained from the network topology information.

This formula indicates that: (1) the more connections the network node $h_i$ has, the more access services $h_i$ provides, and the higher its importance; (2) the higher the importance of a node $h_j$ connected to the network node $h_i$, the higher the importance of $h_i$.

Finally, by integrating the situation elements and weights of each node, the threat, vulnerability and asset situation elements of the whole network are obtained. Taking the threat situation as an example, the calculation formula is:

$$NSSA_{threat} = \sum_{i=1}^{N} w(h_i) \cdot SA_{threat}(h_i)$$

In the formula, $NSSA_{threat}$ is the threat situation element of the whole network, $h_i$ is the $i$-th network node, $SA_{threat}(h_i)$ is the threat situation element of node $h_i$, $w(h_i)$ is the weight of node $h_i$, and $N$ is the total number of nodes in the network. The vulnerability and asset situation elements of the network are calculated in the same way.
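The two formulas of this section can be sketched together as follows. The code is a minimal power-iteration version written for illustration: the adjacency convention (`adj[j][i] == 1` when $h_j$ links to $h_i$), the uniform redistribution of weight from nodes without out-links, and the stopping tolerance are all assumptions, so its output need not coincide exactly with weights computed under other conventions.

```python
import numpy as np


def pagerank(adj, alpha=0.85, tol=1e-10, max_iter=1000):
    """Node weights W(h_i) by power iteration.

    adj[j][i] == 1 means node h_j has an out-link to node h_i (assumed convention).
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    out_deg = adj.sum(axis=1)
    w = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        # W(h_j) / Out(h_j) for nodes that have out-links, 0 otherwise.
        share = np.divide(w, out_deg, out=np.zeros(n), where=out_deg > 0)
        # Assumption: weight held by nodes without out-links is spread uniformly.
        dangling = w[out_deg == 0].sum()
        new_w = alpha * (adj.T @ share + dangling / n) + (1.0 - alpha) / n
        if np.abs(new_w - w).sum() < tol:
            return new_w
        w = new_w
    return w


def network_situation(weights, node_situations):
    """NSSA = sum_i w(h_i) * SA(h_i) for one situation element."""
    return float(np.dot(weights, node_situations))
```

For example, `network_situation(pagerank(adj), sa_threat)` gives the network-level threat element for one time period, where `sa_threat` holds the per-node values decoded by the hidden Markov model.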

Network's overall situation assessment technology based on D-S evidence theory

Building on the network-level threat, vulnerability and asset situation elements evaluated by the hidden Markov model and the PageRank algorithm, an overall evaluation method based on D-S evidence theory is proposed. This method uses the situation values of threat, vulnerability and assets to evaluate the overall network security situation. First, the D-S evidence theory model is built according to the whole network situation. Second, the Basic Probability Assignment (BPA) required by D-S evidence theory is constructed for threat, vulnerability and assets. Finally, Dempster's composition rule is used to integrate the three groups of evidence.

D-S evidence theory has the advantage of handling uncertainty at the decision level. Based on the belief function and the plausibility function, it gives the probability of the network security situation being in each state rather than a single situation value. By constructing three evidence groups for threat, vulnerability and assets, the first-level evaluation results of the hidden Markov model are synthesized as evidence to calculate the probability distribution of the security situation over the states, which helps network security administrators make decisions.

The process of overall network situation assessment based on D-S evidence theory is as follows.

Input: the output results of the threat, vulnerability and asset situation elements in the network based on the hidden Markov model and PageRank algorithm.

Output: the overall network security situation.

Algorithm steps:

(1) Construct a hypothesis space $P = \{1, 2, 3, 4, 5\}$ to represent five network security situations. The greater the situation value, the more dangerous the network security situation.

(2) Based on the assessment results of the network situation elements (threat, vulnerability and assets) and on the five situations of the hypothesis space $P$, construct three evidence bodies according to expert experience, namely the threat evidence body $m_t$, the vulnerability evidence body $m_v$ and the asset evidence body $m_a$, and determine the basic probability distributions of these three evidence bodies over the security states.

(3) Integrate the three evidence bodies with Dempster's composition rule, and calculate the new probability distribution $m$ of the security situations under the joint effect of the three evidence bodies.

Dempster's composition rule for $m_t$, $m_v$ and $m_a$ is as follows:

$$(m_t \oplus m_v \oplus m_a)(A) = \frac{1}{K} \sum_{A_t \cap A_v \cap A_a = A} m_t(A_t) \cdot m_v(A_v) \cdot m_a(A_a)$$

in which

$$K = \sum_{A_t \cap A_v \cap A_a \ne \emptyset} m_t(A_t) \cdot m_v(A_v) \cdot m_a(A_a)$$

(4) Determine the trust interval $[Bel_i, Pl_i]$, $i = 1, 2, 3, 4, 5$, of each security situation and obtain the probability of the network security situation being in each state, where $Bel$ is the lower bound and $Pl$ is the upper bound of the trust interval.

(5) Make decisions based on the results of step (4).

With the situation assessment method based on D-S evidence theory, the overall security situation of the network can be evaluated. Taking the fusion results of the hidden Markov model and the PageRank algorithm as data sources, the final results are obtained by constructing the basic probability distributions and synthesizing the evidence with Dempster's composition rule. The method is suited to fusion at the decision-making level, which makes the assessment results more accurate and objective.
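Steps (3) and (4) can be sketched in a few lines. The code below is a generic illustration of Dempster's rule and of the belief and plausibility bounds; it represents each evidence body as a dictionary from `frozenset` focal elements to masses, which is a representation chosen here and not taken from the paper. The normalizing constant `k` follows the definition of $K$ given above (the total mass of non-empty intersections).

```python
from itertools import product


def combine(m1, m2):
    """Combine two evidence bodies with Dempster's rule.

    m1, m2: dict mapping frozenset (focal element) -> mass.
    """
    fused, k = {}, 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:  # only non-empty intersections contribute
            fused[inter] = fused.get(inter, 0.0) + x * y
            k += x * y
    if k == 0.0:
        raise ValueError("total conflict: the evidence cannot be combined")
    return {s: v / k for s, v in fused.items()}


def belief(m, hypothesis):
    """Bel(A): total mass of focal elements contained in A (lower bound)."""
    return sum(v for s, v in m.items() if s <= hypothesis)


def plausibility(m, hypothesis):
    """Pl(A): total mass of focal elements intersecting A (upper bound)."""
    return sum(v for s, v in m.items() if s & hypothesis)
```

Because the rule is associative, three evidence bodies can be fused pairwise, for example `combine(combine(m_t, m_v), m_a)`, and the trust interval of situation value `i` is then `[belief(m, frozenset({i})), plausibility(m, frozenset({i}))]`.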

Simulation experiment
Experimental data

To test the effectiveness of the above method, the DARPA 2000 dataset is used as experimental data, since it is a comprehensive attack test dataset that is widely recognized and used as a standard in network security research.

This dataset includes the LLDOS 2.0 attack scenario which contains the following attack steps.

(1) Scan the target network to search for live hosts.

(2) Detect whether any live host has the sadmind vulnerability.

(3) Break into the host 172.16.115.20 and gain root access by exploiting the sadmind vulnerability.

(4) Install the Trojan horse on the host 172.16.115.20.

(5) Break into another host, 172.16.112.50, using the host 172.16.115.20 as a springboard.

(6) Install the Trojan horse on the host 172.16.112.50.

(7) The attacker controls host 172.16.115.20 and host 172.16.112.50 to launch a DDoS attack on host 131.84.1.31.

However, the data set does not provide some information about the target network, such as the network topology, vulnerabilities and services. Therefore, on the basis of the experimental data provided in [16], this article reconstructs the key network topology from the alarm data set, as shown in Figure 4. Table 3 and Table 4 give the vulnerability information and service information of the network.

Figure 4

The key network topology structure in DARPA 2000

Table 3

The vulnerability structure of the key nodes in DARPA 2000

Vulnerability information mill locke pascal hume robin www.af.mil
ICMP Incorrectly Configured ×
SunPRC Incorrectly Configured × × ×
Sadmind Buffer overflow × × ×
RCP Incorrectly Configured × × ×
HINFO Query Incorrectly Configured × × × × ×
SYN Flood × × × × ×

Table 4

The service information of the key nodes in DARPA 2000

Service information mill locke pascal hume robin www.af.mil
HTTP × ×
FTP × × ×
TELNET × × ×
DNS × × × × ×
SMTP × × × × ×
POP3 × × × × ×
The assessment results based on the hidden Markov model

Take the threat situation of node mill (172.16.115.20) as an example of the evaluation process of the hidden Markov model. First, a hidden Markov model is set up according to the method described in section 3.3.1. Since the DARPA 2000 data set contains only IDS alarm data, three second level evaluation indicators are used here: the number of mild IDS alarms, the number of moderate IDS alarms and the number of severe IDS alarms, whose value ranges are shown in Table 2, so there are $3^3 = 27$ observation states. The data are then observed at equal time intervals and divided into 7 steps according to different time intervals. The observation sequence $Y_t$ of this node is:

$$Y_t = \{5, 8, 16, 19, 24, 27, 27, 24, 22, 11, 8, 11, 19, 24, 27\}$$

Based on expert experience, the initial values of the parameters $\lambda = (A, B, \pi)$ are set as:

$$A = \begin{pmatrix} 0.2 & 0.4 & 0.2 & 0.1 & 0.1 \\ 0.1 & 0.2 & 0.4 & 0.2 & 0.1 \\ 0.2 & 0 & 0.2 & 0.4 & 0.2 \\ 0.4 & 0 & 0 & 0.2 & 0.4 \\ 0.6 & 0 & 0 & 0 & 0.4 \end{pmatrix}$$

$$B = \begin{pmatrix} 0.1 & \ldots & 0.1 \\ \vdots & \ddots & \vdots \\ 0.1 & \ldots & 0.1 \end{pmatrix}_{5 \times 27}$$

$$\pi = \{0.05, 0.05, 0.2, 0.5, 0.2\}$$

Starting from these expert-set values, the Baum-Welch algorithm is used to optimize the model parameters. Table 5 shows how the log likelihood of the model parameters for host “mill” changes during the iterations of the Baum-Welch algorithm.

Table 5

The changes of the log likelihood values for the host

| Iteration times | 0 | 20 | 40 | 60 | 80 | 100 |
|---|---|---|---|---|---|---|
| LL(log) | −257 | −64.1 | −50.3 | −46.1 | −43.7 | −43.4 |
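The way such a log likelihood curve arises can be illustrated with a small Baum-Welch loop. The sketch below is a generic single-sequence implementation with scaled forward and backward passes; it is not the authors' code, it omits the sliding window, and the names `em_step` and `train` are hypothetical. Observations are 0-based indices, and every observed symbol is assumed to have non-zero probability under the initial model.

```python
import numpy as np


def em_step(obs, A, B, pi):
    """One Baum-Welch update.

    Returns the re-estimated (A, B, pi) and the log likelihood of the
    observations under the parameters passed in.
    """
    obs = np.asarray(obs)
    T, n = len(obs), A.shape[0]
    alpha, beta, c = np.zeros((T, n)), np.ones((T, n)), np.zeros(T)
    # Scaled forward pass: c[t] is the normalizer at step t.
    alpha[0] = pi * B[:, obs[0]]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
        c[t] = alpha[t].sum()
        alpha[t] /= c[t]
    # Scaled backward pass, reusing the same normalizers.
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1]) / c[t + 1]
    gamma = alpha * beta  # gamma[t, i] = P(X_t = i | observations)
    # xi[t, i, j] = P(X_t = i, X_{t+1} = j | observations)
    xi = (alpha[:-1, :, None] * A[None]
          * (B[:, obs[1:]].T * beta[1:])[:, None, :] / c[1:, None, None])
    new_A = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    new_B = np.zeros_like(B)
    for k in range(B.shape[1]):
        new_B[:, k] = gamma[obs == k].sum(axis=0)
    new_B /= gamma.sum(axis=0)[:, None]
    return new_A, new_B, gamma[0], float(np.log(c).sum())


def train(obs, A, B, pi, iterations=100):
    """Run several EM updates and record the log likelihood before each one."""
    A, B, pi = (np.asarray(x, dtype=float) for x in (A, B, pi))
    log_likelihoods = []
    for _ in range(iterations):
        A, B, pi, ll = em_step(obs, A, B, pi)
        log_likelihoods.append(ll)
    return A, B, pi, log_likelihoods
```

Each EM update cannot decrease the log likelihood, so the recorded values rise and then flatten, which is the pattern a table of log likelihood against iteration count is expected to show.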

As the number of iterations increases, the log likelihood of the model parameters $\lambda = \{A, B, \pi\}$ gradually stabilizes, which means that the training of the model parameters has also stabilized. The Viterbi algorithm is then used to evaluate the node's threat situation element. The results are shown in Figure 5, where the blue line is the trend of the observed data and the red line is the trend of the threat situation element.

Figure 5

Threat situation diagram for the node “mill”

Figure 5 shows the trend of the threat situation element of the host node “mill”, from which the change of the host's threat situation during this period can be seen directly: (1) network attacks are concentrated mainly in periods 4–9 and 14–15, during which network security is threatened and needs to be watched closely; in the other periods, especially 1–3 and 10–13, the security situation stays at a low level, which indicates that the network is relatively safe in these two periods. (2) A comparison of the two line graphs shows that the number of network attacks and the trend of the node's threat situation are almost the same, indicating that network attacks have a direct impact on the threat situation element of host nodes.

The vulnerability and asset situations of the host node “mill” can be obtained by the same method, but they differ from the threat situation graph. By their nature they fluctuate only slightly, so their trend graphs are relatively stable, as shown in Figure 6.

Figure 6

The graph of the vulnerability situation and the asset situation of node “mill”

The assessment results based on PageRank algorithm

Taking the evaluation on the threat situation in the network as an example, the network topology shown in Figure 4 is changed into the accessing relationship shown in Figure 7 for simplicity.

Figure 7

Accessing relationship graph of a sub network

The accessing relationship of the sub network can be represented by an adjacency matrix:

$$M = \begin{pmatrix} 0 & 1 & 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

The damping coefficient is set as $\alpha = 0.85$, and the weights of the 6 nodes in the network, which are shown in Table 6, are obtained from the adjacency matrix with the PageRank algorithm. Using these node weights and the threat situation of each node, the threat situation element of the overall network can be calculated. The results are shown in Figure 8.

Table 6

The weight information of network nodes

| Network Node | Weight |
|---|---|
| mill | 0.233 |
| locke | 0.123 |
| pascal | 0.123 |
| hume | 0.123 |
| robin | 0.123 |
| www.af.mil | 0.275 |

Figure 8

The threat situation element in the whole network

Figure 8 shows that: (1) the attacker breaks into the host “mill” (172.16.115.20) in periods 3–6 and into the host “pascal” (172.16.112.50) in periods 7–10. The situation value is higher in the former case than in the latter, because host “mill” has a larger weight and a higher importance. (2) The situation value of the network reaches its highest level in periods 14–15, because the attacker controls the hosts “mill” and “pascal” during this time and uses them to launch a DDoS attack on the host www.af.mil, which meets the expectation. The network situations of vulnerability and assets can be calculated in the same way, as shown in Figure 9.

Figure 9

The situation elements of vulnerability and assets in the overall network

Figure 9 shows that the network's vulnerability and asset situations fluctuate within a small range, unlike the threat situation. In other words, during this period the vulnerability situation and asset situation of the network are relatively stable and carry no major risk.

The assessment results based on D-S evidence theory

Taking the 15th period as an example, the experimental steps of evaluating the overall network security situation based on D-S evidence theory are as follows. (1) According to expert experience, the basic probability distributions of the three elements, namely threat $m_t$, vulnerability $m_v$ and assets $m_a$, are shown in Tables 7, 8 and 9.

Table 7

The basic probability distribution of threat

| Situation value | m(1) | m(2) | m(3) | m(4) | m(5) |
|---|---|---|---|---|---|
| 1 | 0.65 | 0.15 | 0.10 | 0.06 | 0.04 |
| 2 | 0.25 | 0.35 | 0.25 | 0.10 | 0.05 |
| 3 | 0.10 | 0.23 | 0.34 | 0.23 | 0.10 |
| 4 | 0.05 | 0.10 | 0.25 | 0.35 | 0.25 |
| 5 | 0.04 | 0.06 | 0.10 | 0.15 | 0.65 |

Table 8

The basic probability distribution of vulnerability

| Situation value | m(1) | m(2) | m(3) | m(4) | m(5) |
|---|---|---|---|---|---|
| 1 | 0.50 | 0.22 | 0.13 | 0.10 | 0.05 |
| 2 | 0.30 | 0.30 | 0.25 | 0.10 | 0.05 |
| 3 | 0.10 | 0.25 | 0.30 | 0.25 | 0.10 |
| 4 | 0.05 | 0.10 | 0.25 | 0.30 | 0.30 |
| 5 | 0.05 | 0.10 | 0.13 | 0.22 | 0.50 |

Table 9

The basic probability distribution of assets

| Situation value | m(1) | m(2) | m(3) | m(4) | m(5) |
|---|---|---|---|---|---|
| 1 | 0.70 | 0.20 | 0.05 | 0.03 | 0.02 |
| 2 | 0.30 | 0.35 | 0.20 | 0.10 | 0.05 |
| 3 | 0.15 | 0.25 | 0.30 | 0.20 | 0.10 |
| 4 | 0.05 | 0.10 | 0.20 | 0.35 | 0.30 |
| 5 | 0.02 | 0.03 | 0.05 | 0.20 | 0.70 |

(2) Construct evidence bodies

According to the basic probability distributions of threat, vulnerability and assets in step (1), the probability distributions of the three evidence bodies are as follows:

$m_t = (0.05, 0.1, 0.13, 0.22, 0.5)$

$m_v = (0.1, 0.2, 0.4, 0.2, 0.1)$

$m_a = (0.05, 0.1, 0.25, 0.35, 0.25)$

(3) Synthesize the evidence

Dempster's combination rule is used to fuse the three evidence bodies m_t, m_v and m_a. First, m_t and m_v are combined:

m = m_t ⊕ m_v = (0.07, 0.12, 0.16, 0.42, 0.23)

Dempster's combination rule is then applied again to fuse this intermediate result m with m_a:

m = m ⊕ m_a = (0.00, 0.03, 0.17, 0.56, 0.24)
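The combination step can be sketched in Python as follows. This is a minimal illustration of the standard Dempster rule, not the authors' implementation. It assumes that every evidence body assigns mass only to the five single situation values; the names `from_vector` and `dempster_combine` are hypothetical. The text does not say whether mass is also assigned to compound sets or whether a modified rule is used, so the output is not expected to equal the fused vectors reported above.

```python
from itertools import product


def from_vector(values):
    """Map a probability vector over situation values 1..n to singleton focal elements."""
    return {frozenset({i}): p for i, p in enumerate(values, start=1) if p > 0}


def dempster_combine(m1, m2):
    """Dempster's rule: multiply masses, keep non-empty intersections, renormalise."""
    fused = {}
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:  # empty intersections are conflict and are discarded
            fused[common] = fused.get(common, 0.0) + x * y
    total = sum(fused.values())  # equals 1 - K (K = conflict) for normalised inputs
    if total == 0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {s: v / total for s, v in fused.items()}


# Evidence bodies from step (2); singleton-only focal elements are an assumption.
m_t = from_vector([0.05, 0.10, 0.13, 0.22, 0.50])
m_v = from_vector([0.10, 0.20, 0.40, 0.20, 0.10])
m_a = from_vector([0.05, 0.10, 0.25, 0.35, 0.25])

fused = dempster_combine(dempster_combine(m_t, m_v), m_a)
for focal, mass in sorted(fused.items(), key=lambda kv: min(kv[0])):
    print(sorted(focal), round(mass, 3))
```

Under these assumptions the largest fused mass falls on situation value 4 (about 0.357), which agrees with the ranking in the text even though the individual masses differ from the reported vectors.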

(4) Determine the trust intervals.

The trust intervals of the fused result m, obtained by integrating m_t, m_v and m_a, are determined for each network situation value from 1 to 5. When the situation value is 1, the trust measure B and the likelihood measure L are as follows:

B(1) = 0

L(1) = 0

When the situation value is 2, the trust measure and likelihood measure are shown as follows:

B(2) = 0.02

L(2) = 0.04

When the situation value is 3, the trust measure and likelihood measure are shown as follows:

B(3) = 0.13

L(3) = 0.20

When the situation value is 4, the trust measure and likelihood measure are shown as follows:

B(4) = 0.55

L(4) = 0.63

When the situation value is 5, the trust measure and likelihood measure are shown as follows:

B(5) = 0.21

L(5) = 0.28

In summary, the trust interval is [0, 0] when the network situation value is 1, [0.02, 0.04] when it is 2, [0.13, 0.20] when it is 3, [0.55, 0.63] when it is 4, and [0.21, 0.28] when it is 5.
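The trust measure and likelihood measure correspond to the belief and plausibility functions of D-S evidence theory: the belief of a hypothesis sums the mass of all focal elements contained in it, and the plausibility sums the mass of all focal elements that intersect it. The sketch below illustrates both definitions. The mass function in it is hypothetical: it was constructed only so that the resulting intervals match the ones listed above, and it is not taken from the paper.

```python
def belief(mass, hypothesis):
    """Bel(A): total mass of focal elements that are subsets of A."""
    return sum(v for s, v in mass.items() if s <= hypothesis)


def plausibility(mass, hypothesis):
    """Pl(A): total mass of focal elements that intersect A."""
    return sum(v for s, v in mass.items() if s & hypothesis)


# Hypothetical fused mass function (an assumption, not the authors' data).
# Mass on compound sets is what makes the interval [Bel, Pl] wider than a point.
mass = {
    frozenset({2}): 0.02,
    frozenset({3}): 0.13,
    frozenset({4}): 0.55,
    frozenset({5}): 0.21,
    frozenset({2, 4}): 0.02,
    frozenset({3, 5}): 0.01,
    frozenset({3, 4, 5}): 0.06,
}

intervals = {
    k: (round(belief(mass, frozenset({k})), 2),
        round(plausibility(mass, frozenset({k})), 2))
    for k in range(1, 6)
}
print(intervals)
```

For a single situation value k, the belief is just the mass on {k}, while the plausibility also counts every compound set that contains k.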

The results show that situation value 4 receives the highest support for the overall network security situation, followed by situation value 5. The network is therefore in a dangerous state, and remedial measures are needed in time. The overall network situation value in the 15th period is calculated as 4.14. The trend of the overall network situation over periods 1–15 is shown in Figure 10.
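The text does not state how the value 4.14 is obtained. One calculation that reproduces it from the trust intervals of step (4), offered here only as an assumption, weights each situation value by the midpoint of its trust interval:

```python
# Trust intervals [B(k), L(k)] from step (4), keyed by situation value k.
intervals = {
    1: (0.00, 0.00),
    2: (0.02, 0.04),
    3: (0.13, 0.20),
    4: (0.55, 0.63),
    5: (0.21, 0.28),
}

# Assumed aggregation: sum of k times the midpoint of the k-th interval.
situation_value = sum(k * (low + high) / 2 for k, (low, high) in intervals.items())
print(round(situation_value, 2))  # 4.14
```

The midpoints sum to 1.03 rather than 1, so this weighting is not a normalised expectation. It is shown because it matches the reported figure, not because the source confirms it.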

Figure 10

The trend graph of overall network situation

Figure 10 shows that, after the three situation indicators (threat, vulnerability and assets) are combined, the overall network situation value lies between 3.1 and 4.6 and maintains an upward trend. This closely follows the trend of the network threat situation, because the network vulnerability and asset situations are stable in the selected time period and vary only within a small range. The result accords with the expected outcome of the experiment, which supports the effectiveness of the method.

A comparison with other evaluation techniques

To verify the accuracy of the proposed method, this section compares it with the evaluation techniques in papers [4] and [16], using the DARPA 2000 data set as experimental data. Paper [4] evaluates the situation with the analytic hierarchy method, while paper [16] takes threat, vulnerability and service as evaluation indicators and evaluates the situation based on D-S evidence theory. The trends of the overall network situation calculated by the three methods are shown in Figure 11.

Figure 11

The comparison graph of the experimental results of three methods

The experimental results in Figure 11 indicate the following.

(1) The method in paper [16] mainly considers the network's threat indicator and pays little attention to the vulnerability and asset indicators, so it reflects the network's threat situation accurately. However, network risk assessment involves three elements, namely threat, vulnerability and assets, so this method mainly reflects the threat situation rather than the overall situation of the network. The method proposed in this paper takes all three factors into consideration at the same time, which makes it more accurate and comprehensive.

(2) The variation of the method in paper [4] is uncertain, so only one reasonable result is shown in the graph. When this method integrates the first- and second-level indicators, each indicator must be assigned a specific fusion weight. Because the integration standards of different people differ considerably, the method is easily affected by human factors and is therefore harder to adapt to a complex network environment. In contrast, the method in this paper eliminates human factors as far as possible and, through self-learning, comprehensively considers factors such as threat, vulnerability, assets and network topology. Its ability to adapt to different network environments is therefore stronger.

Conclusion

This paper describes the process and methods of overall network security situation assessment, based on a comprehensive assessment algorithm that combines the hidden Markov model, the PageRank algorithm and D-S evidence theory and applies a different fusion algorithm at each level. First, a technical framework for overall situation assessment is established, which is divided into four layers, from bottom to top: the data source layer, the situation-indicator data extraction layer, the element layer and the overall layer. Second, three first-level indicators (threat, vulnerability and assets) and several second-level indicators are given. Then, the hidden Markov model integrates the second-level indicators to evaluate the threat, vulnerability and asset situation elements of each node in the network. The PageRank algorithm integrates these node-level elements to evaluate the threat, vulnerability and asset situation elements of the whole network. Finally, D-S evidence theory integrates the threat, vulnerability and asset elements at the decision level to evaluate the security situation of the whole network. The simulation experiment shows that the proposed assessment method is more accurate, effective and suitable than a method based on a single model.

References

[1] Gong Jian, Zang Xiaodong, Su Qi, et al. Survey of Network Security Situation Awareness [J]. Journal of Software, 2017, 28(4): 1010–1026.

[2] Leau Y B, Manickam S, Chong Y W. Network Security Situation Assessment: A Review and Discussion [M]// Information Science and Applications. Springer Berlin Heidelberg, 2015: 407–414. DOI: 10.1007/978-3-662-46578-3_48

[3] Fang Yan, Yin Xiaochuan, Li Jingzhi. Research of quantitative network security assessment based on Bayesian-attack graphs [J]. Application Research of Computers, 2013, 30(9): 2763–2766.

[4] Chen Hong, Wang Fei, et al. Network security situation assessment model fusing multi-source data [J]. Computer Engineering and Applications, 2015, 51(17): 96–101.

[5] Zhang Yong. Research and system implementation of network security situation awareness model [D]. University of Science and Technology of China, 2010.

[6] Wang C, Zhang Y. Network Security Situation Evaluation Based on Modified D-S Evidence Theory [J]. Wuhan University Journal of Natural Sciences, 2014, 19(5): 409–416. DOI: 10.1007/s11859-014-1033-1

[7] WANG Yongwei, LIU Yunan, ZHAO Rongcai, SI Cheng, et al. Situation assessment method based on improved evidence theory [J]. Journal of Computer Applications, 2014, 34(2): 491–495.

[8] Tang Yongli, Li Weijie, Yu Jinxia, et al. Network security situational assessment method based on improved D-S evidence theory [J]. Journal of Nanjing University of Science and Technology, 2015, 34(4): 405–411.

[9] Greene A M, Holsclaw T, Robertson A W, et al. A Bayesian Multivariate Nonhomogeneous Markov Model [M]// Machine Learning and Data Mining Approaches to Climate Science. Springer International Publishing, 2015: 61–69. DOI: 10.1007/978-3-319-17220-0_6

[10] Tu S. Derivation of Baum-Welch algorithm for hidden Markov models [J]. 2015.

[11] WANG Yuhui, ZENG Zehua, SHEN Jiahui FU, et al. APT Logic-based Causality Analysis of Terrorist Incidents [J]. Netinfo Security, 2017(9).

[12] Cao Lili. On-line fault diagnosis and multi-step fault prediction for TE Process based on HMM [D]. Huazhong University of Science and Technology, 2015.

[13] Xi Hong-sheng. Introduction to stochastic processes [M]. University of Science and Technology of China Press, 2009.

[14] Cheng X, Lang S. Research on network security situation assessment and prediction [C]// Computational and Information Sciences (ICCIS), 2012 Fourth International Conference on. IEEE, 2012: 864–867. DOI: 10.1109/ICCIS.2012.249

[15] Sarma A D, Molla A R, Pandurangan G, et al. Fast distributed pagerank computation [J]. Theoretical Computer Science, 2015, 561: 113–121. DOI: 10.1016/j.tcs.2014.04.003

[16] Wei Yong, Lian Yifeng, Feng Dengguo. A Network Security Situational Awareness Model Based on Information Fusion [J]. Journal of Computer Research and Development, 2009, 46(3): 353–362.
