An Analysis of Correlation and Comparisons Between Centrality Measures in Network Models

Complex networks serve as valuable tools across various scientific fields for representing the characteristics of real-world systems. A network consists of nodes, which represent individuals within a population, and edges, which represent their interactions. Over the past few decades, researchers have extensively studied different types of real-world networks, including biological networks, world-wide-web networks, and social networks (Albert, Jeong, & Barabási, 1999; Amaral et al., 2000; Faloutsos, Faloutsos, & Faloutsos, 1999; Guelzim et al., 2002; Hâncean et al., 2021; Hâncean et al., 2022; Newman, 2001; Olfati-Saber, 2007; Voelkl & Kasper, 2009; Zahedian et al., 2022). Network analysis employs various metrics to understand both local and global properties. Locally, properties like node degree, clustering coefficient, and centrality describe the roles and connections at the node level. Globally, measures like efficiency, density, and average shortest path length characterize overall network structure and information flow. Together, local and global network metrics provide insights into the organization and dynamics of complex networks (Newman, 2018). Among these measures, centrality is widely used to explore networks by identifying crucial nodes and determining their relative importance (Oldham et al., 2019; Rodrigues, 2019). However, centrality has no single definition, and multiple methods exist to measure the importance of nodes. Each measure is based on a different concept, and several approaches have been employed to identify influential nodes. How the different centrality measures behave when determining importance is a fundamental question in the literature.

According to their definitions and conceptualizations, these measures can be grouped into four main categories: Degree Centralities, which assess the importance of nodes based on their visibility to neighboring nodes; Path Centralities, which evaluate the significance of nodes in facilitating communication between other nodes; Proximity Centralities, which determine the importance of a node based on its distance from other nodes; and Spectral Centralities, which assess a node’s centrality by considering its participation in the network and connections to other significant nodes. It is worth noting that centrality measures can exhibit different behaviors depending on the network’s structure. A node identified as topologically important by one centrality measure is not necessarily important according to other measures. For example, a node with few connections is not important in terms of degree, but it can serve as a bridge between different parts of the network, making it highly central in terms of betweenness. Moreover, computing these measures can be time-consuming, and certain measures have a high computational complexity, particularly for large networks. Centrality measures that depend on the number of edges along paths through the network tend to be more expensive than measures computed directly from the adjacency matrix, because finding paths requires traversing the full network structure rather than just reading matrix entries. For example, calculating a path centrality is more involved than calculating degree centrality, since it depends on paths between all pairs of nodes rather than just on immediate neighbors. In general, centrality metrics that incorporate global network information, such as communication paths, come with higher computational costs than simpler, more localized node statistics. Therefore, studying the relationship between centrality measures in various network structures, and identifying scenarios where one measure can be substituted for another with similar results, is important and requires more attention.

In this paper, we examine the correlations between centralities and the impact of different network model structures on these correlations, aiming to enhance our understanding of centrality’s potential role in studying and analyzing networks. Analyzing the relationship between centrality measures can enable us to employ a less computationally demanding measure in place of a more expensive one in certain network configurations without compromising the accuracy of node ranking or importance assessment. For instance, in specific scenarios we might substitute degree centrality for betweenness centrality and obtain the same node ranking and importance evaluation while considerably reducing computational overhead. While each centrality measure possesses a conceptual underpinning and is associated with distinct social behaviors, our study delves into specific network structures where the ranking and importance of nodes as determined by one measure align with those of another (Li et al., 2015). In this study, we employ the Pearson correlation coefficient and Spearman’s rank correlation coefficient to analyze the correlations between these measures. For some networks, knowing the centrality value of each node is more crucial than its rank, and the Pearson coefficient considers the centrality values. In other situations, the ranking order of nodes carries greater importance, and the Spearman correlation uses node rankings to calculate the correlation. We select four common centrality measures, namely degree, betweenness, eigenvector, and closeness, from the range of existing centrality measures. We chose these measures because they are well established in network science and are among the most widely used in the literature for quantifying the importance of nodes within networks.
The computational complexity of these centrality measures varies, with degree centrality having the lowest complexity and betweenness the highest. Closeness centrality can be calculated faster than eigenvector centrality (Grando, Granville, & Lamb, 2018). For the purpose of this study, we focus on three network models: the Erdös-Rényi model for random networks, the Barabási-Albert model for scale-free networks, and the Watts-Strogatz model for small-world networks. Creating these network models requires specific parameter settings. For instance, to create random networks according to the Erdös-Rényi model, we need to specify the number of nodes and the probability of a connection between each pair of nodes. We first investigate how these parameters affect the correlation between centrality measures. Furthermore, we examine the impact of global network properties, such as density, clustering, degree assortativity, global efficiency, majorization gap, and spectral gap, on these correlations. Our findings shed light on the relationship between network parameters and centrality measure correlations, providing valuable insights for further analysis.

Several researchers have investigated the relationships between different centrality measures (He, Meghanathan, et al., 2016; Lee, 2006; Li et al., 2015; Oldham et al., 2019; Rodrigues, 2019; Ronqui & Travieso, 2015; Valente et al., 2008). For instance, He and Meghanathan examined the Pearson correlations between eigenvector centrality and other measures in two network models, as well as in real-world networks (He, Meghanathan, et al., 2016). Their analysis shows that degree centrality and eigenvector centrality are highly correlated regardless of the network type, and that eigenvector centrality and betweenness are also highly correlated in random and real-world networks. Li et al. investigated the relationship between various centrality measures across different types of real-world networks (Li et al., 2015). However, the relationship between network structure and centrality measures remains an open area of research. While past studies have examined real-world networks, there has been less focus on how systematic variations in model network structure impact centrality correlations. By tuning model parameters, we can generate networks with different structural properties and isolate the effects of structure on centrality relationships. In this work, we analyze a range of model networks to provide new insights into how structure influences the associations between different centrality metrics. By analyzing the correlation patterns affected by network structure, we can gain a deeper understanding of how network characteristics influence the similarity of different centrality measures.
This knowledge can be utilized to identify networks where a particular centrality measure can effectively substitute another, providing equivalent node rankings. Our study aims to provide valuable insights into the intricate relationships between centrality measures in complex networks, shedding light on the network characteristics that strengthen these interrelations. By explicitly manipulating network structure, we can simulate the impact of various network properties on centrality measure correlations. This controlled experimentation allows us to identify the network features that lead to stronger correlations, a crucial step in understanding how centrality measures capture network structure and influence real-world phenomena. Our findings have significant implications for practical centrality applications in diverse networks, enabling the identification and exploitation of key nodes that play pivotal roles in maintaining network connectivity and functionality.

The paper is organized as follows. In Section 2, we provide an introduction to the centrality measures discussed in this paper and describe them in detail. Section 3 focuses on analyzing the influence of network structure on the correlations between centrality measures for the three network models under consideration. In Section 4, we delve into the examination of how the global properties of the network impact these correlations. Finally, Section 5 presents a summary of the findings and provides concluding remarks.

Centrality Measures

Centrality is a valuable concept that helps us assess the significance and relative importance of nodes within a network. Rather than having a fixed definition, the concept of centrality varies depending on the specific context in which it is applied (Rodrigues, 2019). In this paper, we select the most commonly employed centrality measures from the wide array available in the literature and investigate their correlations. Rather than relying on theoretical or conceptual considerations, we take an empirically driven approach and choose one prominent, frequently used measure to represent each centrality group. This ensures that we include measures that have demonstrated effectiveness in previous research, which makes the resulting correlations more meaningful. The following provides a brief explanation of the centralities examined in this study.

Degree centrality (DE):

The degree centrality of a node is defined as the number of connections it has with other nodes in the network. In terms of the adjacency matrix, the degree of node i, denoted d_i, can be calculated by summing the corresponding row or column of the matrix. Mathematically, it can be expressed as:

$d_{i} = \sum_{j = 1}^{N} A_{ij}$ (1)

where A represents the adjacency matrix of the network and N is the total number of nodes in the network. To normalize this measure, we can divide the degree by the maximum possible degree, which is N-1. Degree centrality is a fundamental concept in network analysis, and it has numerous applications across various fields (Kapoor, Sharma, & Srivastava, 2013; Liu et al., 2016). For example, Liu et al. (2016) propose a method to identify influential spreaders in networks by measuring node spreading ability based on degree and spreading capacity, regulated by a tuning weight parameter associated with network properties.
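As a minimal sketch of the row-sum definition above, the following Python function computes the normalized degree centrality from an adjacency matrix given as a list of lists. The function name and the 4-node example matrix are illustrative assumptions, not part of the study.

```python
def degree_centrality(adjacency):
    """Normalized degree d_i / (N - 1) for an undirected, unweighted graph."""
    n = len(adjacency)
    # The degree of node i is the sum of row i of the adjacency matrix.
    degrees = [sum(row) for row in adjacency]
    # Divide by the maximum possible degree, N - 1.
    return [d / (n - 1) for d in degrees]


# Hypothetical 4-node example: node 0 is linked to every other node.
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
centrality = degree_centrality(A)
```

Because only row sums are needed, the cost grows with the number of matrix entries, which is why degree is the cheapest of the four measures.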

Betweenness centrality (BE):

Betweenness centrality is a measure introduced by Anthonisse in 1971 (Freeman, 1977). It assesses the importance of a node in a network based on its role in the shortest paths between all pairs of nodes. According to this measure, the centrality of a particular node is obtained by taking, for each pair of other nodes, the fraction of shortest paths between them that pass through that node, and summing these fractions over all pairs. Mathematically, it can be expressed as:

$BE_{i} = \frac{1}{(N - 1)(N - 2)} \sum_{v \neq u = 1}^{N} \frac{G_{vu}(i)}{G_{vu}}$ (2)

Here, G_vu(i) represents the number of shortest paths connecting nodes u and v that pass through node i, while G_vu denotes the total number of shortest paths connecting nodes u and v. Nodes with higher betweenness centrality play a critical role in connecting different subnetworks and have a significant influence on network communication. Betweenness centrality has applications in various fields. In transportation, it helps identify critical roadways or junctions for efficient traffic management (Liu et al., 2019; Puzis et al., 2013; Wang & Cullinane, 2016). In social networks, it reveals individuals who act as bridges between disconnected groups, facilitating information flow (Kourtellis et al., 2013; Lee et al., 2021; Leydesdorff, 2007). In supply chain logistics, it aids in optimizing routes by pinpointing crucial transit points (Borgatti & Li, 2009; Wichmann & Kaufmann, 2016). Additionally, in disease spread modeling, betweenness can identify individuals likely to influence the propagation of infections (Wei et al., 2022).
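The betweenness definition above can be illustrated with a direct, unoptimized sketch. It counts shortest paths with breadth-first search and sums the fractions over ordered pairs (u, v) of nodes other than i, which is an assumption about how the sum is meant to be read. The function names and the path graph are hypothetical; for large networks a faster method such as Brandes' algorithm would normally be preferred.

```python
from collections import deque


def bfs_counts(adj, source):
    """Return (dist, sigma): hop distances from source and shortest-path counts."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:            # first time w is reached
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:   # u precedes w on a shortest path
                sigma[w] += sigma[u]
    return dist, sigma


def betweenness(adj):
    """Normalized betweenness; cost grows roughly as N^3, so small graphs only."""
    nodes = list(adj)
    n = len(nodes)
    info = {s: bfs_counts(adj, s) for s in nodes}
    scores = {}
    for i in nodes:
        dist_i, sigma_i = info[i]
        total = 0.0
        for u in nodes:
            dist_u, sigma_u = info[u]
            if u == i or i not in dist_u:
                continue
            for v in nodes:
                if v == u or v == i or v not in dist_u or v not in dist_i:
                    continue
                # i lies on a shortest u-v path iff the distances add up.
                if dist_u[i] + dist_i[v] == dist_u[v]:
                    total += sigma_u[i] * sigma_i[v] / sigma_u[v]
        scores[i] = total / ((n - 1) * (n - 2))
    return scores


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
be = betweenness(path)
```

In this path graph the two inner nodes carry four of the six ordered pairs each, while the end nodes carry none.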

Eigenvector centrality (EI):

Eigenvector centrality measures the importance of a node based on the importance of its neighboring nodes. Nodes connected to other highly important nodes are considered to have a high eigenvector centrality. Mathematically, the eigenvector centrality of node i is defined as the ith element of the eigenvector corresponding to the largest eigenvalue of the network’s adjacency matrix (Bonacich, 1972). It can be obtained by solving the following equation:

$A X = \lambda_{max} X$ (3)

Here, X(i) represents the eigenvector centrality of node i, A is the adjacency matrix of the network, and λ_max denotes the largest eigenvalue of the adjacency matrix. The eigenvector centrality captures the influence of a node’s connections, allowing us to identify nodes that have a significant impact within the network. This centrality metric is crucial in social network analysis for identifying influential individuals who are connected to other influential nodes (Li et al., 2016; Maharani, Gozali, et al., 2014). In recommendation systems, it helps suggest relevant items by identifying users who are connected to others with similar preferences (Castillejo, Almeida, & López-de-Ipiña, 2012; Davoudi & Chatterjee, 2016). In biological networks, eigenvector centrality can highlight key proteins or genes with significant regulatory influence (Liseron-Monfils & Ware, 2015).
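One common way to solve the eigenvector equation above is power iteration. The sketch below is an illustration under stated assumptions rather than the procedure used in the study: it iterates with A + I instead of A (same eigenvectors, but no oscillation on bipartite graphs), assumes a connected undirected graph, and normalizes the result to unit Euclidean length. The function name and the star graph are hypothetical.

```python
import math


def eigenvector_centrality(adjacency, iterations=1000, tol=1e-12):
    """Leading eigenvector of A by power iteration on the shifted matrix A + I."""
    n = len(adjacency)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # y = (A + I) x : each node adds its own score to its neighbors' scores.
        y = [x[i] + sum(adjacency[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        y = [v / norm for v in y]
        if max(abs(a - b) for a, b in zip(x, y)) < tol:
            return y
        x = y
    return x


# Hypothetical example: a star with center 0 and three leaves.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
ei = eigenvector_centrality(star)
```

For this star the largest eigenvalue is the square root of 3, so the center scores the square root of 3 times as much as each leaf.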

Closeness centrality (CL):

In some networks, it is crucial to identify nodes that have efficient and quick access to other nodes (Freeman et al., 2002). Closeness centrality is a measure that quantifies how closely connected a node is to other nodes in the network. It is defined as:

$CL_{i} = \frac{N - 1}{\sum_{j \neq i = 1}^{N} L_{ij}}$ (4)

Here, L_ij represents the length of the shortest path between nodes i and j, and N is the total number of nodes in the network. Closeness centrality identifies nodes that can be reached quickly from other nodes, reflecting their proximity and potential for efficient communication. Nodes with higher closeness centrality are considered to be more central and have faster access to other nodes within the network. In social networks, closeness centrality helps identify individuals who can disseminate information or influence others most efficiently (Das, Samanta, & Pal, 2018). In transportation and infrastructure planning, it aids in locating critical nodes for efficient connectivity (Tsiotas & Polyzos, 2015; Wang & Fu, 2017). Additionally, in disease spread modeling, closeness centrality assists in pinpointing potential super-spreaders who are highly connected and can facilitate the rapid dissemination of diseases (Wei et al., 2022).
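The closeness definition above can be sketched with one breadth-first search per node. The sketch assumes an unweighted, connected graph stored as an adjacency list; the function name and the path graph are illustrative only, and the handling of disconnected graphs (raising an error) is an assumption, since the text does not discuss that case.

```python
from collections import deque


def closeness_centrality(adj):
    """(N - 1) divided by the sum of shortest-path lengths from each node."""
    n = len(adj)
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < n:
            # Assumption: the formula is applied to connected graphs only.
            raise ValueError("graph must be connected")
        scores[source] = (n - 1) / sum(dist.values())
    return scores


# Hypothetical example: the path graph 0 - 1 - 2 - 3.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cl = closeness_centrality(path)
```

The inner nodes of the path are closer on average to everyone else and therefore score higher than the end nodes.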

The Effect of Network Model Structure on Correlations Between Centrality Measures

This section explores the impact of network model structure on the correlation between centrality measures, using both Pearson and Spearman coefficients. Our focus extends beyond individual nodes, examining the overall similarity in node importance across different centrality measures. This analysis aligns with the concept of socio-centric networks, which encompass the entire structure of relationships within a group, revealing collective patterns and network dynamics. In contrast, egocentric networks, also known as personal networks or egonets, zoom in on the connections of a single individual (the ego) and their immediate contacts (alters), offering insights into their social landscape and position within the wider network.

Pearson correlation and Spearman correlation are both statistical measures used to assess the relationship between two variables. The Pearson correlation coefficient quantifies the linear relationship between two continuous variables. In contrast, the Spearman correlation evaluates the strength and direction of a monotonic relationship between variables, making it suitable for both continuous and ordinal data; it is based on the ranks of the data rather than their actual values. There are two main approaches to determining node importance in a network. In some cases, only the relative ranking of nodes is needed, and the Spearman correlation can be used to compare the rankings produced by two centrality measures, since it considers only the order of the nodes. In other cases, the specific centrality values and the differences in node importance are more relevant, and the Pearson correlation is more appropriate because it considers the magnitude of the centrality values, not just their rank order.

The Pearson correlation coefficient is defined as follows:

$R_{xy} = \frac{\sum_{i = 1}^{N} (x_{i} - \bar{X}) (y_{i} - \bar{Y})}{N \sigma_{X} \sigma_{Y}}$ (5)

Here, X and Y represent two vectors with elements x_i and y_i, and N is the total number of elements in each vector. $\bar{X}$ and $\bar{Y}$ are the means of vectors X and Y respectively, while σ_X and σ_Y denote their standard deviations. Each centrality measure is assigned a centrality vector whose ith element is the centrality value of node i.
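A minimal sketch of the Pearson formula above, written term by term. It uses population standard deviations (division by N), which is what the factor N in the denominator implies. The function name and the two small vectors are hypothetical stand-ins for centrality vectors.

```python
import math


def pearson(x, y):
    """Pearson correlation with population standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance_sum = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return covariance_sum / (n * sd_x * sd_y)


# Hypothetical vectors: the first pair is exactly linear, the second is
# monotonic but curved, so its Pearson correlation falls below 1.
r_linear = pearson([1, 2, 3, 4], [2, 4, 6, 8])
r_curved = pearson([1, 2, 3, 4], [1, 4, 9, 16])
```

The second pair ranks the nodes identically, yet the Pearson value is below 1 because the relationship is not linear; this is the kind of gap between value-based and rank-based agreement that motivates using both coefficients.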

The Spearman correlation coefficient, on the other hand, assesses the correlation between rank variables based on the Pearson correlation. It can be calculated using the following formula, provided that all ranks are distinct integers:

$r_{s} = 1 - \frac{6 \sum_{i = 1}^{N} d_{i}^{2}}{N (N^{2} - 1)}$ (6)

In this equation, d_i = RA(X_i) − RA(Y_i) is the difference between the ranks of node i according to centrality measures X and Y, where RA(X_i) and RA(Y_i) denote those ranks. Both the Pearson correlation coefficient R_xy and the Spearman correlation coefficient r_s fall within the range of −1 to 1. For the Pearson coefficient, a value of 1 signifies a perfect positive linear correlation between the two measures, 0 indicates no linear correlation, and −1 indicates a perfect negative linear correlation. For the Spearman coefficient, the same values describe the monotonic relationship between the two rankings.
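The rank-difference formula can be sketched as follows. The sketch covers only the case stated in the text, where all values (and hence all ranks) are distinct; centrality vectors often contain ties, which would require average ranks and the Pearson formula applied to those ranks. The function names and example vectors are hypothetical.

```python
def ranks(values):
    """Rank values from 1 (smallest) to N; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman_distinct(x, y):
    """Spearman coefficient from squared rank differences (no ties allowed)."""
    if len(set(x)) != len(x) or len(set(y)) != len(y):
        raise ValueError("tied values need average ranks, not this formula")
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical centrality vectors for four nodes.
same_order = spearman_distinct([0.1, 0.4, 0.2, 0.9], [1, 16, 4, 81])
one_swap = spearman_distinct([1, 2, 3, 4], [1, 3, 2, 4])
reversed_order = spearman_distinct([1, 2, 3, 4], [8, 6, 4, 2])
```

Identical orderings give 1, a fully reversed ordering gives −1, and swapping two adjacent ranks out of four gives 0.8.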

In this paper, we consider three network models: the random network (Erdös-Rényi model), the scale-free network (Barabási-Albert model), and the small-world network (Watts-Strogatz model). In general, there are two types of networks: relational event networks, where the relationships and connections between nodes can change over time, and relational state networks, where the interactions are fixed and do not change over time. The networks we study are relational state networks: because we are interested in the importance of nodes according to given measures, the structure and the connections between nodes must be fixed (Butts, 2008; Butts et al., 2023). For each network model, there are several structural parameters that must be set. How might these parameters influence the correlation of centralities? To address this question, we keep all parameters constant except one and observe how the correlations evolve when that parameter is varied. To enhance precision, we generate 10,000 distinct network realizations for each fixed set of parameters and compute the average correlations. This averaged correlation is taken as representative of the correlation associated with that parameter value.
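The protocol described above (fix all parameters but one, generate many realizations, average the correlation) can be sketched as below. This is a simplified illustration, not the study's code: it uses only a random graph with connection probability p, only the degree and closeness measures, only the Pearson coefficient, and 20 realizations instead of 10,000 to keep it fast. Redrawing disconnected realizations is an additional assumption, and all function names are hypothetical.

```python
import math
import random
from collections import deque


def er_graph(n, p, rng):
    """Random graph: each pair of nodes is linked independently with probability p."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def closeness(adj):
    """Closeness per node (in node order), or None if the graph is disconnected."""
    n = len(adj)
    out = []
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        if len(dist) < n:
            return None
        out.append((n - 1) / sum(dist.values()))
    return out


def pearson(x, y):
    """Pearson correlation; None when one vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return None
    # Algebraically the same as cov / (N * sigma_x * sigma_y).
    return cov / math.sqrt(vx * vy)


def average_correlation(n, p, realizations, seed=0):
    """Mean degree-closeness Pearson correlation over independent realizations."""
    rng = random.Random(seed)
    values = []
    while len(values) < realizations:
        adj = er_graph(n, p, rng)
        cl = closeness(adj)
        if cl is None:
            continue  # assumption: disconnected realizations are redrawn
        degree = [len(adj[i]) for i in adj]
        r = pearson(degree, cl)
        if r is not None:
            values.append(r)
    return sum(values) / len(values)


# Vary p while holding the size fixed (small values chosen for speed).
sweep = {p: average_correlation(n=30, p=p, realizations=20) for p in (0.3, 0.5, 0.7)}
```

The same loop structure applies to any other generator, pair of measures, or correlation coefficient by swapping the corresponding functions.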

3.1. Random Network with Erdös-Rényi Model (ER Network)

To construct a random network using the ER model, two parameters, namely N and p, are required. Here, N represents the size of the network, and each node is connected to every other node with a probability denoted by p (Newman, 2018). The degree distribution in this model follows a Poisson distribution with an average degree of p(N-1).
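As a short worked illustration of the statement above: each node has N-1 potential neighbors, each linked independently with probability p, so the degree is binomially distributed, and the Poisson form is the usual approximation for large N and small p. The two numerical cases use N = 500 with connection probabilities 0.03 and 0.1 and are meant only as examples.

```latex
\begin{aligned}
P(k) &= \binom{N-1}{k}\, p^{k} (1-p)^{N-1-k}
      \;\approx\; \frac{\langle k \rangle^{k} e^{-\langle k \rangle}}{k!},
      \qquad \langle k \rangle = p(N-1), \\
N = 500,\ p = 0.03 &:\quad \langle k \rangle = 0.03 \times 499 = 14.97, \\
N = 500,\ p = 0.10 &:\quad \langle k \rangle = 0.10 \times 499 = 49.9 .
\end{aligned}
```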

In order to examine the influence of the connection probability (p) on correlations, we conducted several realizations of a random network with a fixed size of N = 500. We varied p incrementally, starting from 0.03 and increasing by 0.01 until reaching 0.1, and then further increasing it by 0.05 from 0.1 up to 0.95. The results are presented in FIG. 1. We observe a strong linear correlation among all centralities, as indicated by both correlation coefficients, and varying the connection probability has only a small effect on it. For smaller values of p, the correlations are relatively lower; as p increases and the network approaches complete connectivity, the correlations strengthen and approach a value of 1.

Correlations between centrality measures in ER network with size of 500 as a function of $p$. (a) Pearson correlation (b) Spearman correlation.

As the second parameter, we examine the impact of network size on correlations while maintaining a fixed connection probability of p=0.1. We consider a range of network sizes, where N varies from 100 to 500 with increments of 20. FIG. 2 depicts the results, showing that as the network size increases, all correlations also increase and approach a value of 1. Similarly to our previous findings, we can observe that the size of the network has a minor influence on the correlations. On average, all centralities exhibit a high level of correlation.

Correlations between centrality measures in ER network with connection probability p=0.1 as a function of size of network. (a) Pearson correlation (b) Spearman correlation.

As outlined above the average degree of the ER network is p(N-1). Hence, our results show that an increase in the average degree leads to a corresponding rise in correlations within the ER network. Based on both correlation metrics, betweenness has the lowest correlation with closeness and eigenvector in this network. Conversely, degree demonstrates the strongest correlation with the other measures. Notably, there exists a minimum correlation of 0.85 between betweenness and closeness. Nevertheless, irrespective of the network size or connection probability, ER networks consistently demonstrate high correlations among all centrality measures. Consequently, nodes are generally ranked in a similar order based on different measures. When a node is considered crucial according to one measure, it tends to be regarded as important according to other measures as well. For instance, a node with a high degree significantly contributes to the communication between other nodes, resulting in a substantially high betweenness value. Thus, it is feasible to rank nodes and determine their importance by employing a simple measure like degree, rather than relying on complex measures, while still obtaining equivalent outcomes.

3.2. Scale-free Network with Barabási-Albert Model (BA Network)

We will now examine the scale-free network, which is constructed using the Barabási-Albert model. The construction of this network requires two parameters: N and m. Initially, a complete network is formed with m₀ nodes. Subsequently, the remaining nodes are gradually added to the network, with each new node connecting to m < m₀ existing nodes based on the preferential attachment mechanism. In this mechanism, the probability of a new node connecting to an existing node is proportional to the degree of that node, so nodes with a higher degree are more likely to receive new connections. This process continues until the network reaches a size of N (Barabási & Albert, 1999). The degree distribution generated by the Barabási-Albert model follows a power-law distribution given by:

$P(k) = \beta k^{-\gamma}$ (7)

where 2 < γ < 3, and the average degree is 2m. As networks become larger, γ converges to 3. Scale-free networks are characterized by having a small number of nodes, known as hubs, that possess a significantly higher number of connections compared to the average node. These hubs are the most distinctive feature that sets scale-free networks apart from random networks.
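The growth procedure described above can be sketched as follows. The sketch keeps a list in which every node appears once per incident edge, so drawing uniformly from that list selects nodes in proportion to their degree. Drawing repeatedly until m distinct targets are found is an implementation assumption, as are the function name and the parameter values in the example.

```python
import random


def ba_graph(n, m, m0, seed=None):
    """Preferential-attachment growth from a complete seed graph of m0 nodes."""
    if not (1 <= m < m0 <= n):
        raise ValueError("need 1 <= m < m0 <= n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(m0)}
    endpoints = []  # each node appears once per incident edge
    for i in range(m0):
        for j in range(i + 1, m0):
            adj[i].add(j)
            adj[j].add(i)
            endpoints += [i, j]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:
            # Uniform draw from endpoints = draw proportional to degree.
            targets.add(rng.choice(endpoints))
        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
            endpoints += [t, new]
    return adj


# Hypothetical parameters: 200 nodes, m = 3, complete seed graph of 4 nodes.
g = ba_graph(n=200, m=3, m0=4, seed=1)
edge_count = sum(len(neighbors) for neighbors in g.values()) // 2
average_degree = 2 * edge_count / len(g)
```

The seed graph contributes 6 edges and each of the 196 added nodes contributes 3, so the average degree is 5.94, close to 2m = 6.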

Considering m as the first parameter, we analyze its impact on correlations through various realizations of networks with a constant size of N = 500, where m ranges from 1 to 10. The relationship between correlations and m (2m being the average degree) is illustrated in FIG. 3.

Correlations between centrality measures in BA network N=500 as a function of m (2m is average degree). (a) Pearson correlation (b) Spearman correlation.

It is evident that as m increases, all correlations rise and approach a value close to 1. However, this trend manifests differently for the two correlation criteria. In general, the Pearson correlation is higher than the Spearman correlation for all pairs of measures. Notably, the Pearson correlation between degree and closeness is the lowest, approximately 0.35. For the Spearman correlation, the four pairs with the lowest correlation among the six are degree-closeness, betweenness-closeness, betweenness-eigenvector, and degree-eigenvector. The relationship between eigenvector and closeness differs markedly between the Pearson and Spearman correlations. The Pearson correlation between them starts at around 0.65 for m=1 and increases as m grows. Their Spearman correlation, on the other hand, remains close to 1 for all values of m and is unaffected by the increase in average degree. Both correlation criteria indicate a strong correlation between degree and betweenness, regardless of the value of m. Additionally, the Pearson correlation between all pairs of centralities remains constant for m ≥ 6, while the Spearman correlations continue to increase.

Next, we investigate the impact of the size of the BA network. To accomplish this, we create networks with sizes ranging from 100 to 500, while maintaining an average degree of 2 (m=1). The effect of network size on correlations is depicted in FIG. 4. Notably, the results differ significantly between the two correlation criteria. In most cases, as the network size increases, the correlations decrease, and the measures rank nodes differently. A remarkably strong Spearman correlation close to 1 is observed for the degree-betweenness and eigenvector-closeness pairs, regardless of network size. However, the other pairs of centralities exhibit a weak correlation that diminishes as the network size grows. Regarding the Pearson correlation, degree is strongly correlated with betweenness, and this correlation is only slightly affected by the size of the network. The Pearson correlation between eigenvector and closeness decreases as the network grows, unlike the Spearman correlation. The correlations between eigenvector and the other measures behave consistently with one another, but they differ entirely from the correlations among closeness, degree, and betweenness.

Correlations between centrality measures in BA network m=1 as a function of N (network size). (a) Pearson correlation (b) Spearman correlation.

Overall, in BA networks there is a strong correlation between degree and betweenness, regardless of the network size and average degree. This implies that nodes with a high number of connections, as reflected in their degree, also play a critical role in communication, as reflected in their betweenness. As a result, both measures yield similar rankings and importance for nodes. This strong correlation is not surprising, considering that the betweenness of a network with a power-law degree distribution follows a power-law distribution as well. Therefore, a positive correlation between these measures is statistically expected in the BA network. Another noteworthy finding is the correlation between eigenvector and closeness. The results indicate that the rankings of nodes obtained using these two measures are roughly equivalent, regardless of network size and average degree. Thus, it is possible to use one measure instead of the other without significantly altering the results. However, the importance assigned to nodes by each measure is influenced by the network’s properties, and caution should be exercised when interpreting them. In most cases, it is not appropriate to claim that one measure is superior to another without considering the specific properties of the network.

3.3. Small-World Network with Watts-Strogatz Model (SW Network)

In general, SW networks exhibit a higher clustering coefficient than random networks. Nodes in SW networks tend to form clusters, while the shortest paths between nodes remain short, comparable to those in random networks. This means that most nodes are not direct neighbors of one another, but the neighbors of a given node are likely to be connected to each other, and most nodes can be reached within a few steps from any other node. Specifically, in a SW network, the distance (L) between randomly selected nodes grows proportionally to the logarithm of the number of nodes (L ∝ log N).

The first SW network model was introduced by Watts and Strogatz (Watts & Strogatz, 1998). This model constructs a SW network based on the desired number of nodes (N), the mean degree (k) (assumed to be an even integer), and the rewiring probability (p_WS). Initially, N nodes are arranged in a ring, with each node connected to its k nearest neighbors. Each edge is then rewired, with probability p_WS, to a randomly chosen node. It has been demonstrated that for values of p_WS between 0.01 and 0.1, a network exhibiting the SW property can be generated.
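The construction described above can be sketched as follows. The ring lattice links each node to its k/2 nearest neighbors on each side. The rewiring details (keeping one endpoint fixed, choosing the new endpoint uniformly, and forbidding self-loops and duplicate edges) follow the usual Watts-Strogatz procedure and should be read as assumptions, since the text does not spell them out. The function name and parameter values are hypothetical.

```python
import random


def ws_graph(n, k, p_ws, seed=None):
    """Ring lattice with k neighbors per node, then edge rewiring with probability p_ws."""
    if k % 2 != 0 or not 2 <= k < n:
        raise ValueError("k must be an even integer with 2 <= k < n")
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    # Ring lattice: link each node to its k/2 nearest neighbors on each side.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    # Visit every lattice edge (i, i + step) once and rewire it with probability p_ws.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if rng.random() < p_ws:
                candidates = [w for w in range(n) if w != i and w not in adj[i]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[i].discard(j)
                    adj[j].discard(i)
                    adj[i].add(w)
                    adj[w].add(i)
    return adj


# Hypothetical parameters.
lattice = ws_graph(n=20, k=4, p_ws=0.0)
rewired = ws_graph(n=50, k=6, p_ws=0.3, seed=1)
rewired_edges = sum(len(neighbors) for neighbors in rewired.values()) // 2
```

Rewiring moves edges without creating or destroying them, so the number of edges, and hence the mean degree k, is preserved for any value of p_WS.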

To examine the impact of the average degree (k) on correlations, we analyze networks with N=400 and p_WS =0.3, where the average degree ranges from 2 to 30. FIG. 5 displays the correlations as a function of the average degree. It is evident that as the average degree (k) increases and nodes become connected to more neighbors, the correlations also rise. Unlike the BA network, both correlation criteria are influenced by the average degree in a similar manner. However, the correlations for k=2 differ significantly from those for k>2, except for the Spearman correlation between betweenness-degree and eigenvector-closeness. For average degrees k ≥ 4, the correlations change gradually, and the increase in average degree does not have a significant impact on the correlations.

FIG. 5. Correlations between centrality measures in an SW network with N=400 as a function of k (average degree). The rewiring probability is constant, p_WS=0.3. (a) Pearson correlation; (b) Spearman correlation.
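The two correlation criteria compared throughout this section can be computed as in the sketch below. It assumes that each centrality measure is available as a list of per-node scores in the same node order; the function names and the example values are hypothetical. Spearman correlation is obtained here as the Pearson correlation of the ranks, with tied scores receiving the average of their ranks.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        # A constant vector (e.g. all degrees equal) has no defined correlation.
        return float("nan")
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """Ranks starting at 1; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores: same ordering of nodes, very different magnitudes.
scores_a = [1, 2, 3, 10]
scores_b = [1, 4, 9, 1000]
r_pearson = pearson(scores_a, scores_b)    # below 1: relation is not linear
r_spearman = spearman(scores_a, scores_b)  # close to 1: rankings agree
```

The example shows why the two criteria can disagree: Spearman only asks whether two measures rank the nodes in the same order, whereas Pearson also depends on how the importance values themselves are spread.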

The impact of network size on correlations is illustrated in FIG. 6 (k=10, p_WS=0.3). The correlations are influenced by the size of the network (N), with the exception of the Spearman correlation between degree and betweenness. As the network size grows, the rankings of nodes based on centrality measures also change. In terms of Spearman correlation, the highest correlation is observed between degree and betweenness. On the other hand, the Pearson correlation shows the highest correlation between betweenness and closeness. For Pearson correlation, the lowest correlations are associated with pairs such as eigenvector-degree, closeness-eigenvector, and betweenness-eigenvector for N=100. These correlations decrease to less than 0.2 as the network size increases. Consequently, a node that is considered critical based on its degree may not be as important when taking into account its eigenvector centrality or closeness centrality. When Spearman correlation is applied, degree exhibits a slight correlation with closeness, which decreases with network size.

FIG. 6. Correlations between centrality measures in an SW network with k=2 as a function of N (network size). The rewiring probability is constant, p_WS=0.3. (a) Pearson correlation; (b) Spearman correlation.

Lastly, we investigate the relationship between the correlation of centralities and the rewiring probability in the Watts-Strogatz model. To accomplish this, we keep the size and average degree constant (N=400, k=2) and vary the rewiring probability (p_WS) between 0.01 and 0.95. FIG. 7 illustrates the effect of different values of p_WS on correlations. The results reveal that the rewiring probability has a significant impact on all correlations. In general, as p_WS increases, all correlations also increase. Pearson and Spearman correlations respond similarly to changes in p_WS and do not differ significantly from each other. Consequently, whether correlations are calculated from node rankings or from importance values, the results will be approximately the same for a given SW network. All measures exhibit weak correlations for small values of p_WS, and the correlations strengthen as p_WS grows. This observation aligns with existing literature, which states that the Watts-Strogatz model generates an SW network for p_WS < 0.1 and approaches a random network as p_WS increases. Therefore, the strong correlation of centralities observed for large values of p_WS is consistent with the high correlations we found earlier for random (ER) networks. It is notable that all correlations increase steadily until p_WS reaches approximately 0.6, after which they remain constant at around 1.

FIG. 7. Correlations between centrality measures in an SW network with N=400 as a function of p_WS (rewiring probability). The average degree is constant at 10. (a) Pearson correlation; (b) Spearman correlation.

In general, the similarity between centrality measures and rankings in SW networks is strongly influenced by the structural parameters of the network. In comparison to the previously studied network models, closeness and betweenness exhibit the highest correlations. This implies that nodes that are more easily accessible to other nodes tend to appear more frequently on the shortest paths. An interesting observation is the weak correlation of degree and eigenvector centrality with the other two measures. This can be attributed to the degree distribution in SW networks, where a significant number of nodes have degrees that are equal to or close to the average degree, particularly for small values of k and p_WS. The correlation is then computed between a vector with many identical or nearly identical elements and another vector, and such correlations are statistically close to zero. However, as the rewiring probability (p_WS) increases and networks approach a random network structure, the degree distribution changes. In this case, a node with a higher degree also becomes more influential according to the other centrality measures.

The Effect of Global Properties of Network Models on the Correlation Between Centrality Measures

The aim of this section is to investigate the influence of global topological properties on the average correlations between centralities in network models. Networks can be characterized by various global properties; of these, we selected six that are commonly used to analyze networks and provide key insights: density, clustering coefficient, assortativity, global efficiency, majorization gap, and spectral gap. By identifying properties that can estimate correlations in networks with unknown structures, we can compare different network models based on a single characteristic. Additionally, we analyze how network construction parameters contribute to changes in centrality correlations. To accomplish this, we conduct 1,000 trials with random construction parameters for each network model. The network sizes range from 100 to 1,000, and average degrees vary between 2 and 30. For ER networks, the connectivity probabilities range from 0.01 to 0.09, while rewiring probabilities for SW networks span from 0.01 to 0.2. Pearson and Spearman correlations, as functions of the global properties of networks, are presented in FIG. 8 and FIG. 9, respectively. Each dot in the figures represents the average correlation between measures for a network with random parameters, calculated over the six pairs of centrality measures. The results demonstrate that certain properties significantly impact average correlations. Notably, Pearson and Spearman correlations exhibit similar trends in response to global properties. Below, we provide a concise explanation of each global topological property and its effect on average correlations. In addition, FIG. 10 illustrates how the parameters of network models influence the properties of networks.

Density:

The density of a network, represented as a fraction between 0 and 1, indicates the proportion of actual edges compared to all possible edges. For a network with N nodes and E edges, the density is calculated using the formula $\rho = \frac{2E}{N(N-1)}$ (8). It is generally observed that average correlations are not significantly affected by network density, particularly in ER and SW networks. However, as the density of a network increases, a higher correlation between centralities is observed. When examining the impact of network parameters on density, it is found that the size of an ER network and the rewiring probability of an SW network do not influence density. This is expected because, in ER networks, the number of edges and all possible edges change proportionally with network size. Similarly, in SW networks, there is no relationship between rewiring probability and the number of edges. Consequently, density alone is not the most reliable metric for estimating centrality correlations.
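The two invariances mentioned above follow directly from the density formula. The short derivation below is a sketch under two assumptions that the text does not spell out: the ER network is built with the G(N, p) construction, in which each pair of nodes is linked independently with probability p, and the SW network keeps the N·k/2 edges of its initial ring lattice because rewiring only moves edges.

```latex
\begin{align*}
\text{SW:}\quad & E = \frac{N k}{2}
  \;\Rightarrow\;
  \rho = \frac{2}{N(N-1)} \cdot \frac{N k}{2} = \frac{k}{N-1}
  \quad \text{(no dependence on } p_{WS}\text{)}, \\
\text{ER:}\quad & \mathbb{E}[E] = p \, \frac{N(N-1)}{2}
  \;\Rightarrow\;
  \mathbb{E}[\rho] = \frac{2}{N(N-1)} \cdot p \, \frac{N(N-1)}{2} = p
  \quad \text{(no dependence on } N\text{)}.
\end{align*}
```

For example, an SW network with N=400 and k=10 has density 10/399 ≈ 0.025 for every rewiring probability.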

Clustering Coefficient:

In graph theory, clustering coefficients measure the tendency of nodes to cluster together within a graph. For a given node, the coefficient is defined as the ratio of the edges connecting its neighbors to all possible edges that could exist between them. A global clustering coefficient provides an average measure of node clustering in a network. Higher clustering coefficients indicate a stronger inclination of nodes to form tightly interconnected clusters. Previous research has shown that networks possessing scale-free and random properties tend to have low clustering coefficients. Conversely, networks with higher clustering coefficients have demonstrated a greater degree of correlation. However, an intriguing observation arises in the case of the SW network. While an increase in the clustering coefficient corresponds to an increased correlation in the ER and BA networks, the SW network exhibits the opposite effect, whereby a higher clustering coefficient results in a weaker correlation. FIG. 10 illustrates the relationship between the clustering coefficient, average degree, network size, and correlation, revealing that as the average degree increases and the network size decreases, the clustering coefficient rises in agreement with correlation patterns. Surprisingly, in the SW network, increasing the rewiring probability diminishes the clustering coefficient, contrary to its impact on correlation.

In the previous section, we observed that the rewiring probability significantly influences all correlations, more so than the other two parameters. Consequently, this observation contradicts previous findings, leading to the conclusion that the clustering coefficient cannot be relied upon as a definitive indicator of centrality correlation.
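As an illustration of the definition above, the following sketch computes the clustering coefficient of each node and averages it over the network. Taking the mean of the local coefficients is one common convention for the global value and is assumed here; the text does not state which variant was used. Graphs are represented as dictionaries mapping each node to the set of its neighbors, and the function names are hypothetical.

```python
def local_clustering(adj, v):
    """Fraction of pairs of v's neighbours that are themselves connected."""
    neighbours = adj[v]
    degree = len(neighbours)
    if degree < 2:
        return 0.0  # assumed convention: no neighbour pairs, coefficient 0
    # Each edge among the neighbours is seen from both of its endpoints.
    links = sum(1 for u in neighbours for w in adj[u] if w in neighbours) // 2
    return 2 * links / (degree * (degree - 1))


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


# Ring lattice with N=20 and k=4, i.e. the Watts-Strogatz starting point
# before any rewiring: each node is linked to two neighbours on each side.
n = 20
ring = {v: {(v + d) % n for d in (-2, -1, 1, 2)} for v in range(n)}
ring_clustering = average_clustering(ring)  # 3 of 6 neighbour pairs linked
```

In the unrewired lattice every node has 3 of its 6 possible neighbor pairs connected, so the coefficient is 0.5; rewiring replaces local edges with long-range ones, which is why the clustering coefficient falls as p_WS grows.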

Degree assortativity:

The phenomenon wherein nodes in a network tend to connect with others that share similar characteristics is referred to as assortativity. Degree assortativity specifically measures the correlation between the degrees of connected nodes in a network. It is a form of assortativity that looks only at node degree. There are different methods for calculating degree assortativity, but a commonly used measure is the Pearson correlation coefficient between the degrees of pairs of adjacent nodes. Positive degree assortativity indicates that nodes tend to connect to other nodes with similar degrees, while negative values mean high-degree nodes tend to connect to low-degree nodes. However, the relationship between average correlations and degree assortativity is found to be nearly nonexistent. While it may be suggested that higher mean correlations are associated with higher degree assortativity values, previous studies on real-world networks contradict this observation. These studies have demonstrated that correlations actually decrease as assortativity increases. This trend is partially observed in the SW network, but in the other two networks, particularly the BA network, it is inverted. The influence of network parameters on properties further confirms this observation. Interestingly, unlike the previous section’s findings, the variation of network parameters and their impact on network properties do not align with their effect on correlations. Taking into account all of these findings, it can be concluded that degree assortativity is not a useful metric for estimating the correlation between centralities within a given network.
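The Pearson-based measure mentioned above can be sketched as follows. The sketch assumes an undirected graph stored as a dictionary of neighbor sets and counts every edge in both directions, so that the two degree sequences being correlated have the same mean and variance; the function name is hypothetical.

```python
def degree_assortativity(adj):
    """Pearson correlation between the degrees at the two ends of each edge."""
    ends_a, ends_b = [], []
    for u, neighbours in adj.items():
        for v in neighbours:  # each undirected edge appears as (u, v) and (v, u)
            ends_a.append(len(adj[u]))
            ends_b.append(len(adj[v]))
    count = len(ends_a)
    if count == 0:
        return float("nan")  # no edges, nothing to correlate
    mean = sum(ends_a) / count  # identical for both ends by symmetry
    variance = sum((a - mean) ** 2 for a in ends_a)
    if variance == 0:
        return float("nan")  # regular graph: all degrees are equal
    covariance = sum((a - mean) * (b - mean) for a, b in zip(ends_a, ends_b))
    return covariance / variance


# A star is maximally disassortative: the hub only touches degree-1 leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
star_assortativity = degree_assortativity(star)
```

For the star the result is -1, the extreme case of high-degree nodes connecting to low-degree nodes. A regular graph such as an unrewired ring lattice returns NaN, because the measure is undefined when all degrees coincide.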

Global efficiency:

Global efficiency is a metric that quantifies the connectivity and efficiency of a network by measuring the average reciprocal of the shortest path length between nodes. Specifically, the efficiency between two nodes i and j (where i ≠ j) is 1/d_ij, where d_ij is the length of the shortest path between them. The global efficiency is the average efficiency across all pairs of nodes in the network. The concept of global efficiency has been employed in optimizing brain connectivity and transportation systems. It can be observed that an increase in global efficiency is typically accompanied by a corresponding increase in correlations on average. In networks with a global efficiency surpassing 0.4, an average correlation of around 0.9 is commonly found. These associations between average correlations and global efficiency are particularly evident in BA and SW networks. These findings align with previous research on correlations in real-world networks, thereby validating the results. Furthermore, the results depicted in FIG. 10 demonstrate that the effects of network construction parameters on global efficiency mirror their effects on correlations. As a result, it becomes feasible to estimate correlations using this network property.
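The definition above translates into a breadth-first search from every node. The sketch below assumes an unweighted, undirected graph stored as a dictionary of neighbor sets, and it treats unreachable pairs as contributing zero efficiency, a common convention that the text does not state explicitly. The function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in distances:
                distances[w] = distances[u] + 1
                queue.append(w)
    return distances


def global_efficiency(adj):
    """Average of 1/d_ij over all ordered pairs of distinct nodes."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += 1.0 / d  # unreachable pairs are absent and add 0
    return total / (n * (n - 1))


# Path 0-1-2: two pairs at distance 1 and one pair at distance 2.
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
path3_efficiency = global_efficiency(path3)  # (1 + 1 + 1/2) / 3 = 5/6
```

A complete graph reaches the maximum value of 1 because every pair is at distance 1, whereas long chains or rings with few shortcuts score low.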

Majorization gap:

By utilizing the concept of majorization gap, a given network can be compared to a threshold network, which serves as an idealized representation of the network. Threshold graphs exhibit a property called the neighborhood-inclusion preorder, which is believed to play a role in determining centrality rankings. According to this property, if the neighbors of node u form a subset of the neighbors of node v, then node v must possess a greater or equal level of centrality compared to node u, resulting in node v dominating node u. By leveraging these dominance relationships, the neighborhood inclusion preorder arranges nodes in a ranking where non-dominated nodes are placed at the top. In a threshold network, every node is either dominated by another node or dominates other nodes. The majorization gap measures the disparity between the network under investigation and a threshold network. This gap is evaluated based on an algorithm that takes into account the number of edges that need to be rewired or added to transform the network into a threshold network. Consequently, we anticipate that in networks with a small majorization gap, all measures will exhibit high correlations, and our obtained results validate this expectation. However, as the majorization gap increases, the measures start to behave differently, resulting in a decrease in their correlation. This phenomenon is particularly pronounced in the BA network compared to other networks, which aligns with the findings of previous studies. These results are also supported by the findings presented in FIG. 10. In summary, this property proves to be suitable for estimating network correlations.
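The dominance relation behind the neighborhood-inclusion preorder can be checked directly. The sketch below does not compute the majorization gap itself; it only tests whether every pair of nodes is comparable under neighborhood inclusion, which is the property the text attributes to threshold networks. It assumes the convention that v dominates u when every neighbor of u other than v is also a neighbor of v; the function names are hypothetical.

```python
def dominates(adj, v, u):
    """True if every neighbour of u (ignoring v itself) is a neighbour of v."""
    return (adj[u] - {v}) <= adj[v]


def all_pairs_comparable(adj):
    """True if, for every pair of nodes, one of the two dominates the other."""
    nodes = list(adj)
    return all(
        dominates(adj, a, b) or dominates(adj, b, a)
        for i, a in enumerate(nodes)
        for b in nodes[i + 1:]
    )


# A star: the hub dominates every leaf, and leaves dominate each other.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
# A path on four nodes: the two end nodes have disjoint neighbourhoods.
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}

star_is_total = all_pairs_comparable(star)    # True
path_is_total = all_pairs_comparable(path4)   # False
```

In the star, any centrality measure that respects neighborhood inclusion must place the hub first, which is why such measures agree. In the path, the end nodes are incomparable, so different measures are free to order them differently.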

Spectral gap:

The spectral gap is another property that has been investigated. It serves as a measure of the connectivity of a network, capturing how well the network remains connected when nodes or edges are removed. In mathematical terms, the spectral gap is defined as the absolute difference between the largest and second-largest eigenvalues of the adjacency matrix, $|\lambda_1 - \lambda_2|$, or in normalized form as $1 - \frac{\lambda_2}{\lambda_1}$. This property exerts a significant influence on correlations across all networks examined. As the spectral gap increases, correlations also increase. Well-connected networks with a spectral gap greater than 0.5 tend to display average correlations of 0.9, indicating a strong linear correlation between centralities. Building upon our previous observations, we can deduce that structural parameters have a similar impact on both correlations and the spectral gap. For instance, when the average degree of a network decreases, it becomes more prone to disconnection due to the removal of nodes or edges, resulting in a weaker average correlation between measures. Of all the properties explored in this study, the spectral gap proves to be the most useful for estimating correlations. Its relationship with correlations remains consistent across the network models, making it a reliable indicator of the strength of correlations between centralities.
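Two textbook cases give a feel for the range of the spectral gap. They are standard results for adjacency spectra and are added here for illustration; they are not taken from the experiments in this paper. The cycle corresponds to a Watts-Strogatz ring with k=2 before any rewiring, and the complete graph is the densest possible network.

```latex
\begin{align*}
\text{Cycle } C_N:\quad
  & \lambda_j = 2\cos\!\left(\frac{2\pi j}{N}\right),\quad j = 0, \dots, N-1
  \;\Rightarrow\; \lambda_1 = 2,\quad \lambda_2 = 2\cos\!\left(\frac{2\pi}{N}\right), \\
  & |\lambda_1 - \lambda_2| = 4\sin^2\!\left(\frac{\pi}{N}\right)
  \approx \frac{4\pi^2}{N^2}
  \approx 2.5 \times 10^{-4} \quad \text{for } N = 400, \\
\text{Complete graph } K_N:\quad
  & \lambda_1 = N - 1,\quad \lambda_2 = -1
  \;\Rightarrow\; |\lambda_1 - \lambda_2| = N,\quad
  1 - \frac{\lambda_2}{\lambda_1} = \frac{N}{N-1}.
\end{align*}
```

The ring therefore has an almost vanishing gap, while the gap grows as the network becomes better connected.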

FIG. 8. (Color online) The effect of global topological properties of network models on average Pearson correlation of centralities. The dots represent the average of six pairs of correlation between measures for a network with random parameters. Red is ER, Blue is BA, and Green is SW network.

FIG. 9. (Color online) The effect of global topological properties of network models on average Spearman correlation of centralities. The dots represent the average of six pairs of correlation between measures for a network with random parameters. Red is ER, Blue is BA, and Green is SW network.

FIG. 10. (Color online) The effect of construction parameters of network models on global topological properties of networks. Red is ER, Blue is BA, and Green is SW network.

Summary & Concluding Remarks

Analyzing correlations between centrality measures makes it possible to use fast, simple metrics in place of complex, computationally intensive ones without compromising node importance rankings, which saves resources in large network analyses. Correlations also guide the choice of an appropriate centrality for a given network structure, so that influential nodes are not overlooked because of a poor choice of metric. Determining relationships among centralities allows more efficient and insightful analyses across applications such as quantifying website impact, modeling disease spread, and assessing infrastructure resilience. In this paper, we investigated the influence of different network model structures on the Pearson and Spearman correlations between various centrality measures. These correlations help identify networks in which a simple centrality measure with low computational complexity can replace a more complex measure that takes longer to compute. The network construction parameters affect the correlations between centrality metrics in different ways. ER networks show high Pearson and Spearman correlations between measures regardless of network structure. This consistency implies that restructuring an ER network causes little change in centrality correlations. Thus, for example, simple degree centrality could substitute for computationally intensive betweenness centrality, providing approximate results in terms of the rank and importance of nodes. However, correlations in other network types depend more heavily on structure.

In BA networks, for small m (2m is the average degree), correlations are low, but centralities become more correlated as m, and hence the average degree, increases. On average, degree and betweenness show the highest Pearson correlation, and closeness and eigenvector show the highest Spearman correlation. Unlike average degree, network size exerts an inverse effect on centrality correlations: as the network grows, all correlations diminish. Considering average degree and size jointly shows that degree centrality can serve as a proxy for betweenness centrality, especially when rank order matters more than importance values (due to the higher Spearman correlation).

The correlations in SW networks depend heavily on the rewiring probability. For small probability values, the correlations between measures are extremely low. However, as the probability increases and the network becomes more random, the correlations approach one. As in BA networks, network size and average degree have a consistent effect on correlations in SW networks. Both Pearson and Spearman correlations in SW networks are affected by the rewiring probability and average degree, although they respond differently to changes in network size. In this network, however, it is very hard to substitute one measure for another and obtain the same results, especially for large networks where the rewiring probability is small.

Furthermore, the paper explores the impact of global network properties on these correlations. Our analysis revealed that networks with higher density, efficiency, and spectral gap tended to have higher average correlations. However, we found an inverse relationship between centrality correlations and both the average clustering coefficient and the majorization gap: as the clustering coefficient and majorization gap increased, the average correlation decreased. The impact of these global network features on correlation aligned with the effects of the network model parameters themselves. Through comparisons with previous research, it is concluded that spectral gap, global efficiency, and majorization gap are crucial characteristics that influence correlations. These characteristics can be employed to estimate correlations in networks with unknown structures.

One of the limitations of this study is that we have only investigated undirected, unweighted networks. As a result, our findings regarding correlations between centrality measures may not be applicable to more intricate network types such as directed, weighted, or multilayer networks. While we have focused on commonly used centrality measures for simple graphs, the relationships between centralities may change once directionality, edge weights, or multiple layers are incorporated, and future studies are needed to determine the correlations in networks with more structural information. Additionally, in this study, we only considered state-oriented networks, which represent the static structure or topology of a network at a given point in time. This means that the structure of the network is fixed and, consequently, the importance of nodes cannot change. However, there is another type of network data, known as relational event network data, that captures the dynamic evolution of relationships over time. It involves a sequence of events in which nodes and edges are created, modified, or deleted. In this case, the importance of nodes changes as their connections change, and further investigation is needed to determine whether there are any relationships between centrality measures.

eISSN: 1529-1227
Language: English

Publication timeframe: Volume Open
Journal subjects: Social Sciences, other

An Analysis of Correlation and Comparisons Between Centrality Measures in Network Models

Published online: 20 Jan 2024

Pages: 1 - 21

DOI: https://doi.org/10-21307/joss-2024-001

Keywords: network models, centrality measures, Pearson correlation, Spearman correlation

© 2024 Javad Mohamadichamgavi et al., published by Sciendo

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
