Complex networks serve as valuable tools across scientific fields for representing the characteristics of real-world systems. A network consists of nodes, which represent individuals within a population, and edges, which represent their interactions. Over the past few decades, researchers have extensively studied different types of real-world networks, including biological networks, World Wide Web networks, and social networks (Albert, Jeong, & Barabási, 1999; Amaral et al., 2000; Faloutsos, Faloutsos, & Faloutsos, 1999; Guelzim et al., 2002; Hâncean et al., 2021; Hâncean et al., 2022; Newman, 2001; Olfati-Saber, 2007; Voelkl & Kasper, 2009; Zahedian et al., 2022). Network analysis employs various metrics to understand both local and global properties. Locally, properties like node degree, clustering coefficient, and centrality describe the roles and connections at the node level. Globally, measures like efficiency, density, and average shortest path length characterize overall network structure and information flow. Together, local and global network metrics provide insights into the organization and dynamics of complex networks (Newman, 2018). Among these measures, centrality is widely used to explore networks by identifying crucial nodes and determining their relative importance (Oldham et al., 2019; Rodrigues, 2019). However, centrality has no single definition: multiple measures exist, each based on a different concept of importance, and several approaches have been employed to identify influential nodes. Understanding how different centrality measures behave in determining importance is a fundamental question in the literature.
According to their definitions and conceptualizations, these measures can be grouped into four main categories:
In this paper, we examine the correlations between centralities and the impact of different network model structures on these correlations, aiming to enhance our understanding of centrality’s potential role in studying and analyzing networks. Analyzing the relationship between centrality measures can enable us to employ a less computationally demanding measure in place of a more expensive one in certain network configurations without compromising the accuracy of node ranking or importance assessment. For instance, in specific scenarios we might substitute degree centrality for betweenness centrality and obtain the same node ranking while considerably reducing computational overhead. While each centrality measure possesses a conceptual underpinning and is associated with distinct social behaviors, our study delves into specific network structures where the ranking and importance of nodes as determined by one measure align with those of another (Li et al., 2015). We employ the Pearson correlation coefficient and Spearman’s rank correlation coefficient to analyze the correlations between these measures. For some networks, knowing the centrality value of each node is more crucial than its rank, and the Pearson coefficient operates on the centrality values. In other situations, the ranking order of nodes carries greater importance, and the Spearman coefficient uses node ranks to calculate the correlation. From the range of existing centrality measures, we select four common ones: degree, betweenness, eigenvector, and closeness. We chose them because they are widely adopted in network-related studies and applications, are among the most used measures in the literature for quantifying the importance of nodes, and are well established within network science.
The computational complexity of these centrality measures varies, with degree centrality having the lowest complexity and betweenness the highest. Closeness centrality can be calculated faster than eigenvector centrality (Grando, Granville, & Lamb, 2018). These measures have been widely used in network analysis across various scientific disciplines. In this study, we focus on three network models: the Erdös-Rényi model for random networks, the Barabási-Albert model for scale-free networks, and the Watts-Strogatz model for small-world networks. Creating these network models requires specific parameter settings. For instance, to create random networks according to the Erdös-Rényi model, we need to specify the number of nodes and the probability of a connection between each pair of nodes. We first investigate how these parameters affect the correlation between centrality measures. We then examine the impact of global network properties, such as density, clustering, degree assortativity, global efficiency, majorization gap, and spectral gap, on these correlations. Our findings shed light on the relationship between network parameters and centrality measure correlations, providing valuable insights for further analysis.
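To make the role of the two Erdös-Rényi parameters concrete, the following is a minimal sketch of a random-network generator in plain Python. The function name `erdos_renyi`, the adjacency-set representation, and the `seed` argument are illustrative assumptions, not the implementation used in this study.

```python
import random
from itertools import combinations

def erdos_renyi(n, p, seed=None):
    """Return an undirected random graph as {node: set of neighbors}.

    Each of the n * (n - 1) / 2 possible edges is included
    independently with probability p.
    """
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj

# Example parameters chosen for illustration only.
graph = erdos_renyi(n=500, p=0.1, seed=42)
mean_degree = sum(len(nbrs) for nbrs in graph.values()) / len(graph)
print(f"nodes: {len(graph)}, mean degree: {mean_degree:.1f}")
```

With these parameters the mean degree should fall close to p(n - 1), which is how the connection probability controls the density of the generated network.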
Several researchers have explored the relationships between different centrality measures (He, Meghanathan, et al., 2016; Lee, 2006; Li et al., 2015; Oldham et al., 2019; Rodrigues, 2019; Ronqui & Travieso, 2015; Valente et al., 2008). For instance, He and Meghanathan examined the Pearson correlations between eigenvector centrality and other measures in two network models, as well as in real-world networks (He, Meghanathan, et al., 2016). Their analysis shows that degree centrality and eigenvector centrality are highly correlated regardless of network type, and that eigenvector centrality and betweenness are also highly correlated in random and real-world networks. Li et al. investigated the relationship between various centrality measures across different types of real-world networks (Li et al., 2015). However, the relationship between network structure and centrality measures remains an open area of research. While past studies have examined real-world networks, there has been less focus on how systematic variations in model network structure impact centrality correlations. By tuning model parameters, we can generate networks with different structural properties and isolate the effects of structure on centrality relationships. In this work, we analyze a range of model networks to provide new insights into how structure influences the associations between different centrality metrics. Analyzing the correlation patterns affected by network structure gives a deeper understanding of how network characteristics influence the similarity of different centrality measures.
This knowledge can be utilized to identify networks where a particular centrality measure can effectively substitute another, providing equivalent node rankings. Our study aims to provide valuable insights into the intricate relationships between centrality measures in complex networks, shedding light on the network characteristics that strengthen these interrelations. By explicitly manipulating network structure, we can simulate the impact of various network properties on centrality measure correlations. This controlled experimentation allows us to identify the network features that lead to stronger correlations, a crucial step in understanding how centrality measures capture network structure and influence real-world phenomena. Our findings have significant implications for practical centrality applications in diverse networks, enabling the identification and exploitation of key nodes that play pivotal roles in maintaining network connectivity and functionality.
The paper is organized as follows. In Section 2, we provide an introduction to the centrality measures discussed in this paper and describe them in detail. Section 3 focuses on analyzing the influence of network structure on the correlations between centrality measures for the three network models under consideration. In Section 4, we delve into the examination of how the global properties of the network impact these correlations. Finally, Section 5 presents a summary of the findings and provides concluding remarks.
Centrality is a valuable concept that helps us assess the significance and relative importance of nodes within a network. Rather than having a fixed definition, the concept of centrality varies depending on the specific context in which it is applied (Rodrigues, 2019). In this paper, we selected the most commonly employed centrality measures from the wide array of options in the literature to investigate their correlation. Instead of relying on theoretical or conceptual considerations, we take an empirically driven approach and select measures by how frequently they are used. This approach ensures that we incorporate measures that have demonstrated effectiveness in previous research, providing more meaningful correlations. We chose a prominent measure to represent each centrality group, prioritizing usage over a theory-informed selection of less common measures. The following provides a brief explanation of the centralities examined in this study.
The degree centrality of a node is defined as the number of connections it has with other nodes in the network. In terms of the adjacency matrix $A$, the degree of node $i$, denoted as $k_i$, is $k_i = \sum_j A_{ij}$.
Betweenness centrality is a measure introduced by Anthonisse in 1971 (Freeman, 1977). It assesses the importance of a node in a network based on its role in the shortest paths between all pairs of nodes. According to this measure, the centrality of a particular node is calculated as the ratio of the number of shortest paths between all pairs of other nodes that pass through that node to the total number of shortest paths. Mathematically, it can be expressed as:
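In the notation commonly used for this measure, the ratio described above can be written as follows. The symbols are assumptions made for illustration: $\sigma_{st}$ is the number of shortest paths between nodes $s$ and $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through node $v$.

```latex
C_B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}
```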
Eigenvector centrality measures the importance of a node based on the importance of its neighboring nodes. Nodes connected to other highly important nodes are considered to have a high eigenvector centrality. Mathematically, the eigenvector centrality of node $i$ is $x_i = \frac{1}{\lambda} \sum_j A_{ij} x_j$, where $\lambda$ is the largest eigenvalue of the adjacency matrix $A$.
In some networks, it is crucial to identify nodes that have efficient and quick access to other nodes (Freeman et al., 2002). Closeness centrality is a measure that quantifies how closely connected a node is to other nodes in the network. It is defined as:
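One common normalized form of closeness is shown here as a sketch. The notation is assumed: $N$ is the number of nodes and $d_{ij}$ is the shortest-path distance between nodes $i$ and $j$ in a connected network.

```latex
C_C(i) = \frac{N - 1}{\sum_{j \neq i} d_{ij}}
```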
This section explores the impact of network model structure on the correlation between centrality measures, using both Pearson and Spearman coefficients. Our focus extends beyond individual nodes, examining the overall similarity in node importance across different centrality measures. This analysis aligns with the concept of socio-centric networks, which encompass the entire structure of relationships within a group, revealing collective patterns and network dynamics. In contrast, egocentric networks, also known as personal networks or egonets, zoom in on the connections of a single individual (the ego) and their immediate contacts (alters), offering insights into their social landscape and position within the wider network.
Pearson and Spearman correlation are both statistical measures used to assess the relationship between two variables. The Pearson correlation coefficient quantifies the linear relationship between two continuous variables. The Spearman correlation, in contrast, evaluates the strength and direction of a monotonic relationship between variables, making it suitable for both continuous and ordinal data; it is based on the ranks of the data rather than their actual values.
There are two main approaches for determining the central node and node importance in a network. In some cases, only the relative ranking of nodes based on centrality measures is needed. To compare rankings from two different centrality measures, the Spearman correlation can be used, since it depends only on the order of the nodes, not on the actual centrality values. In other cases, the specific centrality values and the differences in node importance are more relevant. For comparing the actual centrality values from two measures, the Pearson correlation is more appropriate because it considers the magnitude of the centrality values, not just their rank order. The Pearson correlation coefficient is defined as follows:
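In its standard form, for two centrality measures that assign values $x_i$ and $y_i$ to node $i$ in a network of $N$ nodes, the coefficient reads as below. The symbols $r$, $x_i$, $y_i$, $\bar{x}$, and $\bar{y}$ are assumed notation, with $\bar{x}$ and $\bar{y}$ denoting the mean values.

```latex
r = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}}
```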
The Spearman correlation coefficient, on the other hand, is the Pearson correlation computed on the rank variables. It can be calculated using the following formula, provided that all ranks are distinct integers:
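The standard distinct-rank form is given here for reference. The symbols $\rho$ and $N$ (the number of nodes) are assumed notation.

```latex
\rho = 1 - \frac{6 \sum_{i=1}^{N} d_i^2}{N (N^2 - 1)}
```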
In this equation, $d_i$ represents the difference between the ranks of node $i$ under the two centrality measures being compared.
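The difference between the two coefficients can be illustrated with a short, self-contained sketch. The helper names `pearson`, `ranks`, and `spearman` and the example scores are hypothetical. Ties are handled by average ranks, which is an assumption that goes beyond the distinct-rank formula.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            result[order[pos]] = shared
        start = end + 1
    return result

def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the rank variables."""
    return pearson(ranks(x), ranks(y))

# Hypothetical centrality scores for five nodes: identical node ordering,
# but a nonlinear relation between the two sets of values.
scores_a = [1, 2, 3, 4, 5]
scores_b = [1, 4, 9, 16, 25]
print(f"Pearson:  {pearson(scores_a, scores_b):.3f}")
print(f"Spearman: {spearman(scores_a, scores_b):.3f}")
```

Because both score lists order the nodes identically, the Spearman coefficient equals one, while the Pearson coefficient stays below one because the values are not linearly related.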
In this paper, we consider three network models: random networks generated with the Erdös-Rényi (ER) model, scale-free networks generated with the Barabási-Albert (BA) model, and small-world (SW) networks generated with the Watts-Strogatz model. In general, there are two types of networks: relational event networks, where the relationships and connections between nodes can change over time, and relational state networks, where the interactions are fixed and do not change over time. The networks we study are relational state networks: because we are interested in the importance of nodes according to a given measure, the structure and the connections between nodes must be fixed (Butts, 2008; Butts et al., 2023). For each network model, there are various structural parameters that require configuration. How might these parameters influence the correlation of centralities? To address this question, we keep all parameters constant except one and observe how the correlations evolve when that parameter is altered. To enhance precision, we generate
To construct a random network using the ER model, two parameters, namely
In order to examine the influence of the connection probability (
Correlations between centrality measures in ER network with size of 500 as a function of $p$. (a) Pearson correlation (b) Spearman correlation.
As the second parameter, we examine the impact of network size on correlations while maintaining a fixed connection probability of
Correlations between centrality measures in ER network with connection probability p=0.1 as a function of size of network. (a) Pearson correlation (b) Spearman correlation.
As outlined above the average degree of the ER network is
We will now examine the scale-free network, which is constructed using the Barabási-Albert model. The construction of this network requires two parameters:
Considering
Correlations between centrality measures in BA network N=500 as a function of m (2m is average degree). (a) Pearson correlation (b) Spearman correlation.
It is evident that as
Next, we investigate the impact of the size of the BA network. To accomplish this, we create networks with varying sizes ranging from 100 to 500, while maintaining an average degree of
Correlations between centrality measures in BA network m=1 as a function of N (network size). (a) Pearson correlation (b) Spearman correlation.
Overall, in BA networks there is a strong correlation between degree and betweenness, regardless of the structure of the network. This implies that nodes with a high number of connections, as indicated by their degree, also play a critical role in communication, as reflected in their betweenness. As a result, both measures yield similar rankings and importance for nodes. This strong correlation is not surprising, considering that the betweenness of a network with a power-law degree distribution follows a power-law distribution as well. Therefore, a positive correlation between these measures is statistically expected in the BA network. Another noteworthy finding is the correlation between eigenvector and closeness. The results indicate that the rankings of nodes obtained using these two measures are roughly equivalent, regardless of network size and average degree. Thus, it is possible to use one measure instead of the other without significantly altering the results. However, the importance assigned to nodes by each measure is influenced by the network’s properties, and caution should be exercised when interpreting them. In most cases, it is not appropriate to claim that one measure is superior to another without considering the specific properties of the network.
In general, SW networks exhibit a higher clustering coefficient compared to random networks. Nodes in SW networks tend to form clusters, even though the shortest paths between nodes are shorter compared to random networks. This means that nodes may not be direct neighbors, but their neighbors are likely to be connected to each other, and most nodes can be reached within a few steps from any other node. Specifically, in a SW network, the distance (
The first SW network model was introduced by Watts and Strogatz(Watts & Strogatz, 1998). This model constructs a SW network based on the desired number of nodes (
To examine the impact of the average degree (
Correlations between centrality measures in SW network with N=400 as a function of k (average degree). The rewiring probability is constant pWS=0.3. (a) Pearson correlation (b) Spearman correlation.
The impact of network size on correlations is illustrated in FIG. 6 (
Correlations between centrality measures in SW network with k=2 as a function of N (network size). The rewiring probability is constant pWS=0.3. (a) Pearson correlation (b) Spearman correlation.
Lastly, we investigate the relationship between the correlation of centralities and the rewiring probability in the Watts-Strogatz model. To accomplish this, we keep the size and average degree constant (
Correlations between centrality measures in SW network N=400 as a function of pWS (rewiring probability). The average degree is constant 10. (a) Pearson correlation (b) Spearman correlation.
In general, the similarity between centrality measures and rankings in SW networks is strongly influenced by the structural parameters of the network. In comparison to previously studied network models, closeness and betweenness exhibit the highest correlations. This implies that nodes that are more easily accessible to other nodes tend to appear more frequently on the shortest paths. An interesting observation is the weak correlation between degree and eigenvector centrality with the other two measures. This can be attributed to the degree distribution in SW networks, where a significant number of nodes have degrees that are equal to or close to the average degree, particularly for small values of k and
The aim of this section is to investigate the influence of global topological properties on the average correlations between centralities in network models. Networks can be characterized by various global properties; of these, we selected six that are commonly used to analyze networks and provide key insights: density, clustering coefficient, assortativity, global efficiency, majorization gap, and spectral gap. By identifying properties that can estimate correlations in networks with unknown structures, we can compare different network models based on a single characteristic. Additionally, we analyze how network construction parameters contribute to changes in centrality correlations. To accomplish this, we conduct 1,000 trials with random construction parameters for each network model. The network sizes range from 100 to 1,000, and average degrees vary between 2 and 30. For ER networks, the connection probabilities range from 0.01 to 0.09, while rewiring probabilities for SW networks span from 0.01 to 0.2. Pearson and Spearman correlations, as functions of the global properties of networks, are presented in FIG. 8 and FIG. 9, respectively. Each dot in the figures represents the average correlation for one network with random parameters, calculated over the six pairs of centrality measures. The results demonstrate that certain properties significantly impact average correlations. Notably, Pearson and Spearman correlations exhibit similar trends in response to global properties. In addition, FIG. 10 illustrates how the parameters of network models influence the properties of networks. Below, we provide a concise explanation of these global topological properties and their effects on average correlations.
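The averaging step behind each dot can be sketched as follows: four centrality measures give six unordered pairs, and the reported value is the mean of the six pairwise correlations. The function names and the centrality scores below are hypothetical and serve only to illustrate the calculation.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences of values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def mean_pairwise_correlation(centralities):
    """Average Pearson correlation over all unordered pairs of measures.

    `centralities` maps a measure name to its per-node scores.
    Returns the average and the number of pairs that were compared.
    """
    pairs = list(combinations(centralities, 2))
    values = [pearson(centralities[a], centralities[b]) for a, b in pairs]
    return sum(values) / len(values), len(pairs)

# Hypothetical scores for the same five nodes (illustrative only).
scores = {
    "degree":      [4, 3, 2, 2, 1],
    "betweenness": [6.0, 2.5, 0.5, 1.0, 0.0],
    "eigenvector": [0.60, 0.50, 0.35, 0.40, 0.20],
    "closeness":   [0.80, 0.70, 0.55, 0.60, 0.45],
}
average, n_pairs = mean_pairwise_correlation(scores)
print(f"{n_pairs} pairs, average correlation {average:.3f}")
```

The same routine yields the average Spearman correlation if each score list is replaced by its ranks before the call.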
The density of a network, represented as a fraction between 0 and 1, indicates the proportion of actual edges compared to all possible edges. For a network with
In graph theory, clustering coefficients measure the tendency of nodes to cluster together within a graph. They are defined as the ratio of the number of edges connecting the neighbors of a node to the number of all possible edges that could exist between them. A global clustering coefficient provides an average measure of node clustering in a network. Higher clustering coefficients indicate a stronger inclination of nodes to form tightly interconnected clusters. Previous research has shown that networks possessing scale-free and random properties tend to have low clustering coefficients. Conversely, networks with higher clustering coefficients have demonstrated a greater degree of correlation. However, an intriguing observation arises in the case of the SW network. While an increase in the clustering coefficient corresponds to an increased correlation in the ER and BA networks, the SW network exhibits the opposite effect, whereby a higher clustering coefficient results in a weaker correlation. FIG. 10 illustrates the relationship between the clustering coefficient, average degree, network size, and correlation, revealing that as the average degree increases and the network size decreases, the clustering coefficient rises in agreement with correlation patterns. Surprisingly, in the SW network, increasing the rewiring probability diminishes the clustering coefficient, contrary to its impact on correlation. In the previous section, we observed that the rewiring probability influences all correlations more strongly than the other two parameters. Consequently, this observation contradicts previous findings, leading to the conclusion that the clustering coefficient cannot be relied upon as a definitive indicator of centrality correlation.
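For concreteness, density and the average clustering coefficient can be computed as in the following minimal sketch. It assumes an undirected, unweighted network stored as a dictionary of neighbor sets. The function names, the convention that nodes with fewer than two neighbors contribute zero, and the small example graph are illustrative assumptions.

```python
def density(adj):
    """Fraction of possible edges present in an undirected simple graph."""
    n = len(adj)
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    return 2 * edges / (n * (n - 1))

def average_clustering(adj):
    """Mean over nodes of the local clustering coefficient.

    The local coefficient of a node is the number of edges among its
    neighbors divided by the number of possible edges among them.
    Nodes with fewer than two neighbors contribute zero (an assumption).
    """
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)

# Illustrative graph: a triangle (0, 1, 2) with a pendant node 3 attached to 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(f"density: {density(example):.3f}")
print(f"average clustering: {average_clustering(example):.3f}")
```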
The phenomenon wherein nodes in a network tend to connect with others that share similar characteristics is referred to as assortativity. Degree assortativity specifically measures the correlation between the degrees of connected nodes in a network. It is a form of assortativity that looks only at node degree. There are different methods for calculating degree assortativity, but a commonly used measure is the Pearson correlation coefficient between the degrees of pairs of adjacent nodes. Positive degree assortativity indicates that nodes tend to connect to other nodes with similar degrees, while negative values mean high-degree nodes tend to connect to low-degree nodes. However, the relationship between average correlations and degree assortativity is found to be nearly nonexistent. While it may be suggested that higher mean correlations are associated with higher degree assortativity values, previous studies on real-world networks contradict this observation. These studies have demonstrated that correlations actually decrease as assortativity increases. This trend is partially observed in the SW network, but in the other two networks, particularly the BA network, it is inverted. The influence of network parameters on properties further confirms this observation. Interestingly, unlike the previous section’s findings, the variation of network parameters and their impact on network properties do not align with their effect on correlations. Taking into account all of these findings, it can be concluded that degree assortativity is not a useful metric for estimating the correlation between centralities within a given network.
Global efficiency is a metric that quantifies the connectivity and efficiency of a network by measuring the average reciprocal of the shortest path length between nodes.
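As a sketch of this definition, global efficiency can be computed by averaging the reciprocal shortest-path length over all ordered pairs of distinct nodes. The code assumes an undirected, unweighted network stored as neighbor sets and treats unreachable pairs as contributing zero. The function names are hypothetical.

```python
from collections import deque

def shortest_path_lengths(adj, source):
    """Breadth-first search distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def global_efficiency(adj):
    """Average of 1 / d(u, v) over all ordered pairs of distinct nodes.

    Pairs with no connecting path are absent from the search result
    and therefore add nothing to the sum.
    """
    n = len(adj)
    total = 0.0
    for u in adj:
        dist = shortest_path_lengths(adj, u)
        total += sum(1 / d for v, d in dist.items() if v != u)
    return total / (n * (n - 1))

# Illustrative path graph 0 - 1 - 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
print(f"global efficiency: {global_efficiency(path):.3f}")
```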
Specifically, the efficiency between two vertices, denoted as
By utilizing the concept of majorization gap, a given network can be compared to a threshold network, which serves as an idealized representation of the network. Threshold graphs exhibit a property called the neighborhood-inclusion preorder, which is believed to play a role in determining centrality rankings. According to this property, if the neighbors of node
The spectral gap is another property that has been investigated. It serves as a measure of the connectivity of a network, capturing its resilience to the removal of nodes or edges while remaining connected. In mathematical terms, the spectral gap is defined as the absolute difference between the largest and second-largest eigenvalues of the adjacency matrix ($|\lambda_1 - \lambda_2|$), or as a normalized form
Figure 8. (Color online) The effect of global topological properties of network models on average Pearson correlation of centralities. The dots represent the average of six pairs of correlation between measures for a network with random parameters. Red is ER, Blue is BA, and Green is SW network.
Figure 9. (Color online) The effect of global topological properties of network models on average Spearman correlation of centralities. The dots represent the average of six pairs of correlation between measures for a network with random parameters. Red is ER, Blue is BA, and Green is SW network.
Figure 10. (Color online) The effect of construction parameters of network models on global topological properties of networks. Red is ER, Blue is BA, and Green is SW network.
Analyzing centrality measure correlations makes it possible to use fast, simple metrics rather than complex, computationally intensive ones without compromising node importance rankings, which saves resources in large network analyses. Correlations also guide the selection of an appropriate centrality for a given network structure, reducing the risk of overlooking influential nodes because of a poor choice of metric. Determining relationships among centralities allows more efficient, insightful analyses across applications such as quantifying website impact, modeling disease spread, and assessing infrastructure resilience, while advancing network science techniques.
In this paper, we investigated the influence of different network model structures on the Pearson and Spearman correlations between various centrality measures. These correlations help identify networks in which a simple centrality measure with low computational complexity can replace a more complex measure that takes longer to calculate. The network construction parameters affect the correlations between centrality metrics differently. ER networks show high Pearson and Spearman correlations between measures regardless of network structure. This consistency implies that restructuring these networks causes little change in centrality correlations. Thus, for example, simple degree centrality could substitute for computationally intensive betweenness centrality, providing approximate results in terms of the rank and importance of nodes. However, correlations in other network types depend more heavily on structure.
In BA networks, for small
The correlations in SW networks depend heavily on the probability of rewiring. For small probability values, the correlations between measures are extremely low. However, as the probability increases and the network becomes more random, the correlations approach one. Similar to BA networks, the effect of network size and average degree on correlations is consistent in SW networks. Both Pearson and Spearman correlations in SW networks are affected by the probability of rewiring and average degree, although they respond differently to changes in network size. However, in this type of network it is very hard to use one measure in place of another and obtain the same results, especially for large networks where the rewiring probability is small.
Furthermore, we explored the impact of global network properties on these correlations. Our analysis revealed that networks with higher density, efficiency, and spectral gap tended to have higher average correlations. However, we found an inverse relationship between centrality correlations and both the average clustering coefficient and the majorization gap: as the clustering and majorization gap increased, the average of the correlations decreased. The impact of these global network features on correlation aligned with the effects of the network model parameters themselves. Through comparisons with previous research, we conclude that spectral gap, global efficiency, and majorization gap are crucial characteristics that influence correlations. These characteristics can be employed to estimate correlations in networks with unknown structures.
One limitation of this study is that we investigated only undirected, unweighted networks. As a result, our findings regarding correlations between centrality measures may not apply to more intricate network types such as directed, weighted, or multilayer networks. Future studies are needed to determine the correlations in networks with more structural information. While we have focused on commonly used centrality measures for simple graphs, the relationships between centralities may change once directionality, edge weights, or multiple layers are incorporated. Additionally, we considered only state-oriented networks, which represent the static structure or topology of a network at a given point in time. This means that the structure of the network is fixed and, consequently, the importance of nodes cannot change. However, there is another type of network data, known as relational event network data, that captures the dynamic evolution of relationships over time. It involves a sequence of events where nodes and edges are created, modified, or deleted.
In this case, the importance of nodes changes based on their connections, and further investigation is needed to determine if there are any relationships between centrality measures.