Open Access

Uncovering Correlations Between Urban Road Network Centrality and Human Mobility



Introduction

Urban spaces are typically highly localized, yet globally connected [1]. In particular, urban space consists of a patchwork of local areas, each serving a specific function. These patches are nevertheless linked into a whole at a global scale by the urban street network. While the structure of urban space is greatly influenced by the history of each city [2], researchers have long analyzed its properties in order to facilitate planning tasks such as resource allocation and transportation planning. Human activities in urban environments, such as business and travel, are often shaped and constrained by the geographical distance to, and the accessibility of, resources.

The urban street network, functioning as the backbone of urban space, plays a vital role in connecting urban neighborhoods and supporting local and global movement within and between urban areas. Its structural properties, such as centrality and accessibility, can reveal a great deal about human activities. Centrality [3], a network-based metric measuring the structural importance of nodes in complex networks, is often used to capture the importance of different parts of road networks. Earlier studies [4, 5] indicate that the structural properties of urban road networks, as captured by betweenness centrality, can explain the observed traffic flow. Another form of centrality, closeness centrality, has been shown to be highly correlated with the intensity of economic activities [6] and with land use [7]. Furthermore, simulations show that the aggregated human travel flow on streets is mainly shaped by the underlying street structure [8].

In this work, we conduct a study on the correlation between the centrality of the urban street network and the intensity of human movement over it using data from Pittsburgh and NYC. Our results imply that different centrality metrics correlate with the intensity of human movement at different levels. The correlation strength further differs in the two cities examined.

Roadmap: Section 2 describes our analysis setup and the datasets. Section 3 presents and analyzes our experimental results, and Section 4 concludes and discusses future work.

Experimental Setup

In this section we introduce the network structures that capture the intensity of human movement and the urban road network, as well as the data used to construct them for Pittsburgh and NYC.

Human Transition Network

In the human transition network $G_T = (U, E)$, the set of nodes $U$ is a collection of non-overlapping areas/neighborhoods in the city under examination. A directed edge $e_{ij}$ between two areas $u_i, u_j \in U$ exists if a transition by a city-dweller from $u_i$ to $u_j$ has been observed. The definition of the areas $u_i$ can be arbitrary (e.g., municipal neighborhood borders, a grid, etc.). In our analysis, we divide the whole city (a 10 × 10 mile rectangular area around the center of each city) into 400 neighborhood areas, each of 0.5 × 0.5 miles.
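As a rough illustration of the grid partition described above, the sketch below maps a latitude/longitude pair to one of 400 cells. It assumes the areas form a regular 20 × 20 grid over a rectangular bounding box and treats latitude and longitude as linear coordinates inside that box. The function name `cell_index` and the bounding-box parameters are hypothetical and do not come from the original study.

```python
from typing import Optional


def cell_index(lat: float, lon: float,
               south: float, west: float, north: float, east: float,
               rows: int = 20, cols: int = 20) -> Optional[int]:
    """Return the row-major index of the grid cell containing (lat, lon).

    Returns None when the point lies outside the bounding box.
    The box edges (south, west, north, east) are assumed inputs.
    """
    if not (south <= lat <= north and west <= lon <= east):
        return None
    # Fraction of the way across the box, scaled to a cell number.
    # min(...) keeps points on the north/east border inside the last cell.
    r = min(int((lat - south) / (north - south) * rows), rows - 1)
    c = min(int((lon - west) / (east - west) * cols), cols - 1)
    return r * cols + c
```

With the default 20 × 20 layout, the returned index ranges from 0 to 399 and can serve directly as the node identifier of an area in $U$.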

To obtain the structure of $G_T$ for both cities, we use geo-tagged, user-generated social media content. In particular, we collect tweets using Twitter's streaming API from Jul 15 to Nov 15, 2013. Each tweet is a tuple of the form <user Id, place Id, time, latitude, longitude>. In total, we have 526,799 geo-tagged tweets in Pittsburgh and 3,715,016 in NYC. Figure 1 presents the distribution of tweets in the two cities examined. Using these data, we generate an edge (transition) $e_{ij} \in E$ if the same Twitter user has generated two consecutive tweets at locations $l_i \in u_i$ and $l_j \in u_j$ within a predefined time interval $\Delta t$ and the distance between these two locations is greater than a threshold $\Delta d$. In our experiment, we set $\Delta t = 4$ hours and $\Delta d = 10$ m. This yields 172,887 such pairs in Pittsburgh and 961,671 in NYC. Note that the above definition allows for self-edges in $G_T$. We can also annotate every edge $e_{ij}$ with a weight that captures the number of transitions between the two urban areas $i$ and $j$.
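The transition rule above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes tweets are given as simplified `(user_id, timestamp_s, lat, lon)` tuples (the place Id field is dropped), uses the haversine formula as one possible distance measure, and takes a hypothetical callable `area_of` that maps a coordinate to an area identifier.

```python
import math
from collections import Counter, defaultdict


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def extract_transitions(tweets, area_of, max_gap_s=4 * 3600, min_dist_m=10.0):
    """Count transitions between areas.

    tweets:  iterable of (user_id, timestamp_s, lat, lon) tuples (assumed format)
    area_of: callable (lat, lon) -> area id, or None if outside the study area
    Returns a Counter {(area_i, area_j): number of transitions}; self-edges allowed.
    """
    by_user = defaultdict(list)
    for user, t, lat, lon in tweets:
        by_user[user].append((t, lat, lon))

    weights = Counter()
    for records in by_user.values():
        records.sort()  # chronological order per user
        for (t1, la1, lo1), (t2, la2, lo2) in zip(records, records[1:]):
            if t2 - t1 > max_gap_s:
                continue  # consecutive tweets too far apart in time
            if haversine_m(la1, lo1, la2, lo2) <= min_dist_m:
                continue  # user has not moved farther than the threshold
            a, b = area_of(la1, lo1), area_of(la2, lo2)
            if a is None or b is None:
                continue  # one endpoint falls outside the study area
            weights[(a, b)] += 1
    return weights
```

The resulting counter doubles as the weighted edge list of $G_T$: each key is a directed edge and each value is its weight.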

Figure 1

Street network in selected urban areas of two cities

Centrality in $G_T$: To capture the centrality of human movement in different neighborhoods, we calculate the PageRank [9] of each node in $G_T$. In particular, we calculate a weighted PageRank score $P_i$ of area $i$ as:

$$P_i = \alpha \sum_j A_{ij} \frac{P_j}{k_j^{out}} + \beta_i,$$

where $\alpha = 0.85$, $k_j^{out}$ is the weighted out-degree of node $j$, which counts both self-edges and outgoing edges, and $\beta_i$ is a personalized (external) prior importance for area $i$, defined as the fraction of tweets taking place in area $i$.
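A minimal sketch of how such a weighted, personalized PageRank could be computed by fixed-point iteration is given below. It rests on several assumptions that the text does not spell out: $A_{ij}$ is read as the weight of the transitions from area $j$ to area $i$, all weights are positive, areas without outgoing edges simply pass on no score, and the scores are not renormalized (only their ranking matters for a rank correlation). The function name `weighted_pagerank` is hypothetical.

```python
def weighted_pagerank(weights, beta, alpha=0.85, tol=1e-12, max_iter=1000):
    """Iterate P_i = alpha * sum_j A_ij * P_j / k_j_out + beta_i to a fixed point.

    weights: {(j, i): w} with w > 0, the weight of transitions from j to i
             (self-edges allowed); this reading of A_ij is an assumption
    beta:    {i: fraction of tweets in area i}
    Returns {i: P_i}.
    """
    nodes = set(beta)
    for j, i in weights:
        nodes.update((j, i))

    # Weighted out-degree, counting self-edges as well as outgoing edges.
    k_out = {v: 0.0 for v in nodes}
    for (j, _), w in weights.items():
        k_out[j] += w

    p = {v: beta.get(v, 0.0) for v in nodes}
    for _ in range(max_iter):
        new = {v: beta.get(v, 0.0) for v in nodes}
        for (j, i), w in weights.items():
            new[i] += alpha * w * p[j] / k_out[j]
        delta = sum(abs(new[v] - p[v]) for v in nodes)
        p = new
        if delta < tol:
            break
    return p
```

Because $\alpha < 1$ and each node distributes at most its own score, the iteration is a contraction and settles after a few hundred steps at this tolerance.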

We also use a second, simpler centrality metric for $G_T$: the number of tweets $n_{t,i}$ generated in area $i$. This metric does not incorporate mobility information, but rather captures the intensity of activity in each area.

Street Network

We model the street network as a graph $G_s = (V, S)$, where the set of nodes $V$ represents the intersections in the street spatial structure and an edge $s_{ij} \in S$ represents the street segment that connects intersections $i$ and $j$. We fetch the street networks from OpenStreetMap and process them using osm4routing¹ into the $G_s$ network format. osm4routing provides additional metadata such as the coordinates of each intersection, the length of each street segment and accessibility flags for each street segment in both directions (e.g., accessibility by car, foot, bike, etc.). Figure 2 gives a visualization of the street networks in both cities.

Figure 2

Street network in selected urban areas of two cities

Centrality of Street Network: For a road network with $n$ nodes and $m$ edges, we calculate three well-established measures of node centrality: closeness centrality $C^c$, betweenness centrality $C^b$ and straightness centrality $C^s$. $C_i^c$ captures the accessibility of node $i$ and is defined as [3]:

$$C_i^c = \frac{n - 1}{\sum_{j = 1, j \ne i}^{n} d_{ij}},$$

where $d_{ij}$ is the shortest path length between nodes $i$ and $j$. $C_i^b$ quantifies the extent to which node $i$ serves as a "broker" between nodes and is formally defined as [3]:

$$C_i^b = \frac{1}{(n - 1)(n - 2)} \sum_{s = 1; t = 1; s \ne t \ne i}^{n} \frac{n_{st}^i}{n_{st}},$$

where $n_{st}$ is the number of shortest paths between nodes $s$ and $t$, while $n_{st}^i$ is the number of such shortest paths that traverse node $i$. $C_i^s$ measures the extent to which node $i$ can be reached directly, on a straight line, from all other nodes, and is defined as [6]:

$$C_i^s = \frac{1}{n - 1} \sum_{j = 1; j \ne i}^{n} \frac{d_{ij}^{Eucl}}{d_{ij}},$$

where $d_{ij}^{Eucl}$ is the Euclidean distance between nodes $i$ and $j$.

Finally, we calculate three global and nine local indices of street centrality. The global indices, $C^c_{glob}$, $C^b_{glob}$ and $C^s_{glob}$, are calculated using the whole road network. We also consider local versions of the centralities, $C^c_{local,d}$, $C^b_{local,d}$ and $C^s_{local,d}$, where we compute the centrality of node $i$ considering only the nodes that are within a radius $d$ (with $d$ = 800 m, 1600 m and 2400 m, giving the nine local indices).
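The closeness and straightness definitions, in both their global and local forms, can be illustrated with the following sketch. It makes assumptions that go beyond the text: the graph is an adjacency dictionary with positive edge lengths, node coordinates are planar and in the same unit as the edge lengths, unreachable nodes are ignored, and the local variant keeps only targets within a Euclidean radius of node $i$ while still taking shortest paths over the whole network. The function names are hypothetical, and betweenness is left out to keep the example short.

```python
import heapq
import math


def shortest_path_lengths(adj, source):
    """Dijkstra's algorithm.

    adj: {node: [(neighbor, length), ...]} with positive lengths.
    Returns {node: shortest distance from source} for reachable nodes.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def closeness_and_straightness(adj, coords, i, radius=None):
    """Return (C_i^c, C_i^s) for node i.

    coords: {node: (x, y)} planar coordinates (assumed projection)
    radius: if given, only nodes within this Euclidean distance of i are
            counted (one possible reading of the local variant)
    """
    dist = shortest_path_lengths(adj, i)
    xi, yi = coords[i]
    total_d, total_ratio, count = 0.0, 0.0, 0
    for j, d_ij in dist.items():
        if j == i:
            continue
        eucl = math.hypot(coords[j][0] - xi, coords[j][1] - yi)
        if radius is not None and eucl > radius:
            continue
        total_d += d_ij
        total_ratio += eucl / d_ij
        count += 1  # plays the role of n - 1 in the formulas
    if count == 0:
        return 0.0, 0.0
    return count / total_d, total_ratio / count
```

Calling the function once without `radius` and once per radius value gives the global index and the corresponding local indices for a node.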

Analysis setup

1 https://github.com/Tristramg/osm4routing

Our goal is to examine the relation between the central areas of a city as captured through the mobility of people and the central areas of the city as captured through the street network. To this end, we use Spearman's rank correlation coefficient $\rho$. The first variable for this correlation is the PageRank centrality $P_i$ of the nodes $i \in U$ (or, alternatively, $n_{t,i}$). However, the centrality values obtained from the street network are defined on a different set of nodes (the set $V$). We therefore use a spatial mapping $\Phi: V \to U$ based on the lat/lon coordinates we have for every $v \in V$. With $\Phi$ in place, the second variable for calculating $\rho$ is the average road network centrality $\bar C_v^*$ of all nodes $v \in V$ that map to $i \in U$, that is, $\Phi(v) = i$.
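The aggregation and correlation steps can be sketched as below. This is an illustrative implementation and not the authors' code: the mapping $\Phi$ is assumed to be available as a dictionary from intersection to area, ties are handled with average ranks, and no p-value is computed. All function names are hypothetical.

```python
from collections import defaultdict


def average_by_area(node_centrality, node_area):
    """Average street centrality per area.

    node_centrality: {v: centrality of intersection v}
    node_area:       {v: area id Phi(v)}; nodes without an area are skipped
    Returns {area id: mean centrality of the intersections mapped to it}.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for v, c in node_centrality.items():
        a = node_area.get(v)
        if a is None:
            continue
        sums[a] += c
        counts[a] += 1
    return {a: sums[a] / counts[a] for a in sums}


def ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            r[k] = mean_rank
        pos = end + 1
    return r


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the two rank vectors.

    Assumes neither input is constant (otherwise the denominator is zero).
    """
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5
```

In use, the two variables would be aligned on the areas present in both dictionaries, for example `areas = sorted(set(P) & set(avg))` followed by `spearman_rho([P[a] for a in areas], [avg[a] for a in areas])`, where `P` and `avg` are placeholder names for the PageRank scores and the averaged centralities.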

Results and analysis

We first treat the urban street network as an undirected network, without considering traffic accessibility in the two directions. Table 1 presents the correlation results for Pittsburgh and NYC. We can see that the global closeness centrality $C^c_{glob}$ and the global betweenness centrality $C^b_{glob}$ correlate highly with the intensity of human movement in both environments. This suggests that central areas in cities naturally tend to be more accessible from and to other places (higher $C^c_{glob}$) and thus function as city "hubs" (higher $C^b_{glob}$). In contrast, the global straightness centrality $C^s_{glob}$, the local closeness centrality $C^c_{local,d}$ and the local betweenness centrality $C^b_{local,d}$ present no significant positive correlations. However, the local straightness centrality $C^s_{local,d}$ shows an interesting difference between the cities, with a significant level of correlation in Pittsburgh but not in NYC. This is most likely due to differences in urban space structure or travel patterns between the two cities. Further analysis is needed to identify the exact source of this difference.

Table 1

Correlation $\rho$ between the street centrality and the intensity of human movement (* indicates p-value < 0.05; ** indicates p-value < 0.01).

| Street centrality $C$ | Pittsburgh $n_{t,i}$ | Pittsburgh $P_i$ | NYC $n_{t,i}$ | NYC $P_i$ |
|---|---|---|---|---|
| $C^c_{glob}$ | 0.610** | 0.604** | 0.509** | 0.505** |
| $C^b_{glob}$ | 0.501** | 0.497** | 0.459** | 0.466** |
| $C^s_{glob}$ | 0.021 | 0.020 | 0.078 | 0.074 |
| $C^c_{local,d=800m}$ | −0.223** | −0.228** | −0.085 | −0.093 |
| $C^c_{local,d=1600m}$ | −0.043 | −0.046 | 0.012 | 0.004 |
| $C^c_{local,d=2400m}$ | 0.024 | 0.0189 | −0.044 | −0.047 |
| $C^b_{local,d=800m}$ | −0.001 | −0.128* | 0.009 | −0.127* |
| $C^b_{local,d=1600m}$ | 0.017 | 0.026 | −0.072 | −0.070 |
| $C^b_{local,d=2400m}$ | 0.106* | 0.112* | −0.014 | −0.014 |
| $C^s_{local,d=800m}$ | 0.348** | 0.351** | 0.105* | 0.104* |
| $C^s_{local,d=1600m}$ | 0.410** | 0.408** | 0.028 | 0.026 |
| $C^s_{local,d=2400m}$ | 0.442** | 0.438** | −0.031 | −0.031 |

We further consider the urban street network as a directed graph, based on the directional accessibility for three types of movement: driving, biking and walking. In this case, there are two different calculations of closeness and straightness centrality, based on two types of shortest paths between nodes. The first is the outgoing shortest path $d_{ij}^{out}$, directed from node $i$ to node $j$. The second is the incoming shortest path $d_{ij}^{in}$, directed into node $i$ from node $j$, which captures how easily a traveler can access node $i$ from other locations in the city. We therefore have in and out versions of closeness and straightness centrality, based on these two types of shortest path calculations. Table 2 presents the correlation between the centrality of the directed street network and the PageRank score of the neighborhood areas (results for $n_{t,i}$ are omitted due to space limitations). Compared to Table 1, we do not observe significant differences when considering the directed networks. This might be because the transition network $G_T$ essentially captures the starting and ending points of a movement, ignoring the actual path followed, and/or because of the high similarity of the different directed network structures. Nevertheless, there is still a significant change for global straightness centrality when considering the directed street network, especially for biking and walking, which might be attributed to the fact that for these "slow modes" of transportation short geometric distance is

Table 2

Correlation results when considering the road network as a directed network based on the accessibility for driving, biking and walking in either direction. The second variable is the PageRank score.

| Street centrality $C$ | Driving, Pittsburgh | Driving, NYC | Biking, Pittsburgh | Biking, NYC | Walking, Pittsburgh | Walking, NYC |
|---|---|---|---|---|---|---|
| $C^c_{glob}$ (in) | 0.597** | 0.473** | 0.622** | 0.397** | 0.616** | 0.393** |
| $C^c_{glob}$ (out) | 0.594** | 0.481** | 0.623** | 0.391** | | |
| $C^b_{glob}$ | 0.481** | 0.431** | 0.520** | 0.452** | 0.514** | 0.444** |
| $C^s_{glob}$ (in) | −0.053 | 0.061 | 0.200** | 0.301** | 0.212** | 0.313** |
| $C^s_{glob}$ (out) | −0.002 | 0.083 | 0.231** | 0.303** | | |
| $C^c_{local,d=800m}$ (in) | −0.253** | −0.143** | −0.250** | −0.087 | −0.241** | 0.042 |
| $C^c_{local,d=800m}$ (out) | −0.282** | −0.142** | −0.253** | −0.069 | | |
| $C^c_{local,d=1600m}$ (in) | −0.133** | −0.170* | −0.123** | −0.117* | 0.103* | −0.012 |
| $C^c_{local,d=1600m}$ (out) | −0.067 | 0.003 | −0.103* | −0.012 | | |
| $C^c_{local,d=2400m}$ (in) | −0.053 | −0.215** | −0.024 | −0.178** | 0.011 | −0.078 |
| $C^c_{local,d=2400m}$ (out) | −0.039 | −0.204** | −0.077 | −0.171** | | |
| $C^b_{local,d=800m}$ | 0.042 | 0.100* | 0.044 | −0.066 | 0.041 | −0.081 |
| $C^b_{local,d=1600m}$ | 0.061 | 0.125* | 0.072 | −0.035 | 0.062 | −0.044 |
| $C^b_{local,d=2400m}$ | 0.140** | 0.100* | 0.161** | 0.009 | 0.143** | 0.002 |
| $C^s_{local,d=800m}$ (in) | 0.248* | 0.053 | 0.324** | 0.002 | 0.362** | 0.094 |
| $C^s_{local,d=800m}$ (out) | 0.248** | 0.053 | 0.324** | 0.002 | | |
| $C^s_{local,d=1600m}$ (in) | 0.306** | 0.046 | 0.363** | −0.020 | 0.396** | 0.032 |
| $C^s_{local,d=1600m}$ (out) | 0.306** | 0.046 | 0.363** | −0.020 | | |
| $C^s_{local,d=2400m}$ (in) | 0.349** | 0.003 | 0.386** | −0.051 | 0.423** | −0.023 |
| $C^s_{local,d=2400m}$ (out) | 0.349** | 0.003 | 0.386** | −0.051 | | |

Discussion and Future work

In this paper we examined the correlation between the centrality of street networks and the intensity of human movement in urban areas. We found that the level of correlation differs across centrality metrics and, for some metrics, also across cities. Our work offers a way to study the relationship between urban structure and human movement at a large scale.

We would like to emphasize that our analysis methods may suffer from a variety of biases. For example, we examine the correlation by aggregating road network centrality and human movement within each neighborhood area, while a microscopic study might give a different view. Also, the rectangular urban area we pick may introduce edge effects in the correlation results. Furthermore, the large-scale datasets used here may contain noise and biases. For instance, the street networks in OpenStreetMap might not be very accurate, especially for less popular cities, since all the information is crowdsourced from the public. In addition, because sharing is voluntary, geo-tagged tweets may give only partial information about human movement, and their quality depends on many other factors, such as demographic biases, spam tweets and fake location information.

In the future, we plan to examine the levels of correlation while taking into account temporal and contextual information about human movement, such as its time and type. Furthermore, we aim to examine the centralities of a directed road network by considering the accessibility of different transportation modes (e.g., driving, biking and walking) in the two directions of a street. For network centralities, we want to further investigate other practical factors, such as the max flow on a street (number of available lanes), the fastest path and the density/type of resources surrounding a street intersection.

eISSN:
2470-8038
Language:
English
Publication timeframe:
4 times per year
Journal Subjects:
Computer Sciences, other