The authors have declared that no competing interests exist.
Conceived and designed the experiments: YL ZS. Performed the experiments: ZS CK. Analyzed the data: YL ZS CK. Contributed reagents/materials/analysis tools: SZ. Wrote the paper: YL YG.
The article revisits spatial interaction and distance decay from the perspective of human mobility patterns and spatially-embedded networks based on an empirical data set. We extract nationwide inter-urban movements in China from a check-in data set that covers half a million individuals within 370 cities to analyze the underlying patterns of trips and spatial interactions. By fitting the gravity model, we find that the observed spatial interactions are governed by a power law distance decay effect. The obtained gravity model also closely reproduces the exponential trip displacement distribution. The movement of an individual, however, may not obey the same distance decay effect, leading to an ecological fallacy. We also construct a spatial network where the edge weights denote the interaction strengths. The communities detected from the network are spatially cohesive and roughly consistent with province boundaries. We attribute this pattern to different distance decay parameters between intra-province and inter-province trips.
A number of social media websites that support geo-tagged information submission and sharing have been recently introduced and achieved great commercial success. Various functions have been provided by these websites, such as social networking (Facebook), micro-blogging (Twitter), photo sharing (Flickr), and location based check-in (Gowalla and Foursquare). Each website has millions of registered members and their submissions form an important type of big data. Since much information is user-generated and associated with particular locations, Goodchild coined the term volunteered geographical information (VGI) for it
Recently, human mobility patterns have drawn much attention in the areas of physics
In this research, we use a social media check-in data set submitted by about half a million users to study inter-urban trip patterns. At the collective level, these trips represent spatial interaction strengths between cities. Our research serves three purposes. First, we intend to reveal the underlying distance effect in the trips extracted from check-in records. Second, we try to link patterns at the collective level of spatial interactions with those at the individual level of human movements, and to make a comparison with intra-urban patterns revealed from mobile phone or taxi data sets. Last, we investigate the implications of the distance decay effect in regionalizing the study area based on spatial interactions between cities.
This section summarizes research in three areas: spatial interaction, human mobility patterns, and spatially-embedded networks. The first is a fundamental topic in geographical applications, and the last two have recently drawn much attention in both geographical and physical studies, with the availability of spatio-temporally-tagged big data. This research reveals the underlying connections among them using an empirical data set.
Spatial interactions between geographical entities such as cities and regions help us to understand the spatial structure of a region and to plan an efficient spatial configuration. In practice, interaction strength can be measured by volumes of passengers
Most spatial interaction systems are governed by the distance decay effect
A number of practical methods have been developed for fitting the gravity model, including linear programming
Understanding human mobility patterns can help us in many fields including epidemic control and traffic management
A number of measurements can be used to quantify human mobility patterns
Various models have been proposed to interpret the observed human mobility patterns. They take into account different influencing aspects such as population characteristics
Given a set of geographical entities with known interaction strengths between them, we can construct a spatially-embedded network (or spatial network), in which each node is located in space so that the distance between each two nodes can be measured
In complex network analyses, detecting communities is an important task. Given a network, a community is a subset with relatively dense node-to-node connections. Many algorithms have been proposed for detecting communities, including the Girvan-Newman method
For a spatial network, a community corresponds to a region, which may be spatially connected or disconnected (i.e. with enclaves). Community detection methods are therefore extended to take into account specific spatial characteristics, such as adjacency constraint
This research uses a check-in data set collected from a major Chinese LBSNS (location-based social network service) provider, which can be viewed as a counterpart of Foursquare in the western world. We obtained the data set through a collaboration between our laboratory (
(A) The map, created using density estimation, clearly depicts the distributions of cities and transportation networks in China. Note that the South China Sea Islands are not shown for simplicity. (B) As shown by the CCDF (complementary cumulative distribution function), the frequency distribution exhibits a heavy-tailed characteristic. Shanghai and Beijing, the two biggest cities in China, have the most check-in records.
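The CCDF mentioned in panel (B) can be computed directly from per-city check-in counts. The following is a minimal sketch, assuming the convention P(X ≥ x); the function name `empirical_ccdf` and the sample counts are hypothetical and are not taken from the data set.

```python
from collections import Counter

def empirical_ccdf(values):
    """Return sorted (x, P(X >= x)) pairs for the distinct values in `values`."""
    n = len(values)
    counts = Counter(values)
    ccdf = []
    remaining = n  # number of observations that are >= the current x
    for x in sorted(counts):
        ccdf.append((x, remaining / n))
        remaining -= counts[x]
    return ccdf

# Hypothetical check-in counts per city (illustrative numbers only).
checkins_per_city = [1, 1, 2, 2, 2, 5, 10, 40]
for x, p in empirical_ccdf(checkins_per_city):
    print(x, p)
```

Plotting these pairs on log-log axes is the usual way to inspect a heavy tail: a slowly decaying, roughly straight curve suggests that a few cities account for a large share of the records.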
Given a user, his or her trajectory can be formalized as {<City1, T1>, <City2, T2>, …, <City
For each user, we compute the number of check-ins,
The inter-urban movements extracted from check-in records are associated with representativeness issues. In other words, not all individuals are registered users of an LBSNS. According to the statistics of Foursquare (
(A) Scatter plot of
At present, most human mobility research is conducted based on a large population instead of a single person
From the extracted trajectories, we can compute both the check-in number for each city and the movement between each two cities. An undirected weighted network, denoted by
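As a rough illustration of this aggregation step, the sketch below counts check-ins per city and trips per unordered city pair. It assumes that a trip is any pair of consecutive check-ins by the same user in two different cities, which may differ from the exact extraction rule used in this research; the function name and the sample records are hypothetical.

```python
from collections import Counter

def build_interaction_network(trajectories):
    """Aggregate per-user check-in sequences into node and edge counts.

    `trajectories` maps a user id to a list of (city, timestamp) pairs.
    Assumption: a trip is a pair of consecutive check-ins in different cities.
    """
    checkins = Counter()  # node attribute: number of check-ins per city
    weights = Counter()   # edge weight: trips between an unordered city pair
    for records in trajectories.values():
        ordered = sorted(records, key=lambda r: r[1])  # order by timestamp
        for city, _ in ordered:
            checkins[city] += 1
        for (a, _), (b, _) in zip(ordered, ordered[1:]):
            if a != b:
                weights[tuple(sorted((a, b)))] += 1  # undirected edge key
    return checkins, weights

# Hypothetical records, for illustration only.
trajectories = {
    "u1": [("Shanghai", 1), ("Suzhou", 2), ("Shanghai", 3)],
    "u2": [("Beijing", 1), ("Beijing", 2), ("Shanghai", 5)],
}
checkins, weights = build_interaction_network(trajectories)
print(dict(checkins))
print(dict(weights))
```

Sorting each city pair before using it as a key makes the network undirected, so a trip from Shanghai to Suzhou and a trip back both add to the same edge weight.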
(A) Interaction map of the 370 cities. The red lines indicate stronger interactions. The maximum value is 137,847, which is the number of trips between Shanghai and Suzhou, extracted from the check-in data set. The red dots represent capital cities of provinces in China. (B) Complementary cumulative distribution of edge weights (or interaction strengths) between cities.
The edge weights follow a power law distribution (
In this research, we quantitatively estimate the distance decay effect by fitting the gravity model. Because of the low graph density, we adopt the PSO method to find the best fit. We try different β values in the gravity model, from 0.1 to 2.0 with a step of 0.1. The goodness of fit (GOF) is measured using the correlation coefficient between the observed and estimated interactions. For each fixed β value, say 1.0, the PSO method is used to search for the best GOF, where each particle is a 370-dimensional vector denoting the theoretical sizes of all cities.
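The outer loop of this procedure can be sketched as follows. The sketch assumes the gravity form P_i · P_j / d_ij^β, with any constant absorbed into the sizes. It also replaces the PSO search with a simple deterministic least-squares coordinate update, so it is an illustrative substitute and not the method used in this research. The six synthetic cities, their coordinates, and all function names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient, used here as the goodness of fit."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_sizes(flows, dist, beta, sweeps=300):
    """Least-squares theoretical sizes for a fixed beta.

    NOTE: a coordinate-descent stand-in for the PSO search in the text.
    Each update solves exactly for one size with the others held fixed.
    """
    n = len(flows)
    sizes = [1.0] * n
    for _ in range(sweeps):
        for i in range(n):
            num = den = 0.0
            for j in range(n):
                if j != i:
                    g = sizes[j] / dist[i][j] ** beta
                    num += flows[i][j] * g
                    den += g * g
            sizes[i] = num / den
    return sizes

def gof(flows, dist, beta, sizes):
    """Correlation between observed and gravity-estimated interactions."""
    n = len(sizes)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    obs = [flows[i][j] for i, j in pairs]
    est = [sizes[i] * sizes[j] / dist[i][j] ** beta for i, j in pairs]
    return pearson(obs, est)

# Synthetic test case (hypothetical): flows generated with beta = 0.8.
coords = [(0, 0), (300, 0), (0, 400), (900, 200), (500, 800), (1200, 900)]
true_sizes = [5.0, 3.0, 2.0, 4.0, 1.5, 2.5]
true_beta = 0.8
n = len(coords)
dist = [[math.dist(a, b) for b in coords] for a in coords]
flows = [[true_sizes[i] * true_sizes[j] / dist[i][j] ** true_beta
          if i != j else 0.0 for j in range(n)] for i in range(n)]

# Grid over beta from 0.1 to 2.0 with a step of 0.1, as in the text.
betas = [round(0.1 * k, 1) for k in range(1, 21)]
gof_by_beta = {}
for beta in betas:
    sizes = fit_sizes(flows, dist, beta)
    gof_by_beta[beta] = gof(flows, dist, beta, sizes)
best_beta = max(gof_by_beta, key=gof_by_beta.get)
print(best_beta, gof_by_beta[best_beta])
```

On noise-free synthetic flows the grid search should pick a β at or near the generating value. With real, sparse interaction data the error surface is less well behaved, which is one reason a global search such as PSO may be preferred.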
The maximum GOF = 0.985 is achieved when β = 0.8. The exponent is close to the value observed from air passenger flows in China
The inset depicts the correlation on a log-log scale. Note that the estimated interaction strengths for some city pairs are less than 1 and thus negative values exist in the log-log plot.
Some research has pointed out shortcomings of the gravity model
As pointed out in Section 3.2, check-in data only partially capture inter-urban movements and are subject to sampling biases. Sampling biases also exist in air passenger data, and in data sets collected for other transport modes (e.g. railway), because the choice of mode is often correlated with travelers' socio-economic attributes. Each data set represents only one aspect of human trips, so different data sets might not be consistent with each other. It is interesting that the check-in data and the air travel data can both be well fitted by gravity models, albeit with different distance parameters and theoretical size sets.
Given two known data sets for the same group of places, we can obtain two gravity models, denoted by
In terms of the human mobility pattern, the displacement distribution can be well fitted by an exponential distribution
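One simple way to fit an exponential distribution to trip displacements is the maximum-likelihood estimate of its rate, which equals the reciprocal of the mean displacement. The sketch below assumes that estimator, which may differ from the fitting procedure used in this research; the displacement values are hypothetical.

```python
import math

def fit_exponential_rate(displacements):
    """Maximum-likelihood rate of an exponential distribution: 1 / mean."""
    return len(displacements) / sum(displacements)

def exponential_ccdf(d, rate):
    """P(D >= d) under the fitted exponential model."""
    return math.exp(-rate * d)

# Hypothetical trip displacements in km (illustrative numbers only).
trip_km = [120.0, 80.0, 300.0, 45.0, 650.0, 210.0, 95.0, 500.0]
rate = fit_exponential_rate(trip_km)
print(rate, exponential_ccdf(250.0, rate))
```

An exponential tail falls off much faster than a power law, so on log-log axes the fitted curve bends downward instead of following a straight line.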
The observed displacements follow an exponential distribution. We can find a small peak when
The exponential displacement distribution is seemingly inconsistent with the power law distance decay, which implies a slower distance decay effect. Liu et al.
As mentioned earlier, inter-urban (or region) interaction is a traditional topic in geographical analyses, while human mobility patterns have recently drawn much attention thanks to the availability of big trajectory data. This research indicates that the aggregate level of spatial interactions and individual level of movements can be viewed as two sides of the same coin. If the collective spatial interactions can be interpreted by the gravity model (
It should be pointed out that an ecological fallacy exists in extending collective level statistics to the individual level. Although various existing models, including the gravity model in this research, closely reproduce the observed displacement distribution, it is still questionable whether each individual’s movements follow the same gravity model.
(A) The four persons’ movements are all influenced by the distance decay effect. (B) Distance decay effect is not clear for each person. However, the four persons’ movements collectively exhibit the distance decay effect.
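The situation in panel (B) can be reproduced with a contrived toy example. In the sketch below, each of four hypothetical persons always travels one fixed distance, so no individual shows any distance decay, yet the pooled trip counts decrease with distance. The numbers are invented for illustration and are not drawn from the check-in data.

```python
from collections import Counter

# Each hypothetical person repeats trips of a single fixed length (km).
trips = {
    "A": [100] * 8,
    "B": [200] * 4,
    "C": [300] * 2,
    "D": [400] * 1,
}

# Individual level: every person uses exactly one distance (no decay).
distinct_lengths = {person: len(set(ds)) for person, ds in trips.items()}

# Collective level: pooled trip counts fall as distance grows.
pooled = Counter(d for ds in trips.values() for d in ds)
for d in sorted(pooled):
    print(d, pooled[d])
```

Here the aggregate decay arises only from how many trips each person contributes, which is why a collective-level fit cannot by itself establish an individual-level law.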
For a spatially-embedded network, the community detection method can help us to reveal its structure. In this research, we create a Voronoi diagram based on the 370 cities and merge Voronoi polygons containing cities in the same community so that all communities can be spatialized and visualized. The multilevel algorithm developed by Blondel et al.
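The multilevel algorithm greedily optimizes modularity, so it can help to see how that quantity is computed for a given partition of a weighted network. The following is a minimal sketch of weighted modularity, assuming an undirected network without self-loops; it evaluates a partition but does not implement the multilevel search itself, and the toy network is hypothetical.

```python
from collections import defaultdict

def modularity(edges, community):
    """Weighted modularity Q of a partition.

    `edges` maps an unordered node pair (u, v) to a positive weight;
    `community` maps each node to a community label.
    """
    total = sum(edges.values())    # m: total edge weight
    strength = defaultdict(float)  # weighted degree of each node
    internal = defaultdict(float)  # edge weight inside each community
    for (u, v), w in edges.items():
        strength[u] += w
        strength[v] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    comm_strength = defaultdict(float)
    for node, s in strength.items():
        comm_strength[community[node]] += s
    return sum(internal[c] / total - (comm_strength[c] / (2 * total)) ** 2
               for c in comm_strength)

# Toy network: two triangles joined by one bridge edge.
edges = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 1, ("c", "d"): 1,
         ("d", "e"): 1, ("e", "f"): 1, ("d", "f"): 1}
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {node: 0 for node in two_groups}
print(modularity(edges, two_groups), modularity(edges, one_group))
```

Splitting the toy network at the bridge gives a higher modularity than keeping all nodes together, which is the kind of improvement the multilevel algorithm looks for when merging nodes and communities.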
We run the multilevel algorithm 20 times, each of which yields a partition. By merging the Voronoi polygons of cities in the same community, a partition can be visualized. Regions with thicker borders indicate that they occur in more partitions.
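Because the multilevel algorithm can return different partitions on different runs, one way to summarize their agreement is to count how often two cities fall in the same community. The sketch below assumes each partition is given as a mapping from city to community label; the function name and the four sample partitions are hypothetical. A low frequency for two neighbouring cities corresponds to a border that appears in many partitions.

```python
from itertools import combinations

def co_assignment_frequency(partitions):
    """Fraction of partitions in which each node pair shares a community."""
    nodes = sorted(partitions[0])
    freq = {}
    for u, v in combinations(nodes, 2):
        together = sum(1 for p in partitions if p[u] == p[v])
        freq[(u, v)] = together / len(partitions)
    return freq

# Four hypothetical runs over three cities (labels are arbitrary per run).
partitions = [
    {"A": 0, "B": 0, "C": 1},
    {"A": 0, "B": 0, "C": 0},
    {"A": 1, "B": 1, "C": 0},
    {"A": 0, "B": 1, "C": 1},
]
freq = co_assignment_frequency(partitions)
print(freq)
```

Comparing labels only within a run avoids the problem that community labels are arbitrary and need not match between runs.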
From
The partition pattern has been observed from various spatial networks
The yellow rectangles and gray circles represent interactions between cities in one province and two different provinces, respectively. It is clear that the gravity model underestimates intra-province trips.
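The comparison in this figure can be summarized numerically by averaging the log ratio of observed to estimated interactions separately for intra-province and inter-province pairs. The sketch below is one possible way to do so; the function name, the city and province labels, and the interaction values are hypothetical. A positive mean indicates that the model underestimates that group of pairs.

```python
import math

def mean_log_ratio(observed, estimated, province, same_province):
    """Mean of log(observed / estimated) over intra- or inter-province pairs.

    Pairs with a zero observed or estimated value are skipped.
    """
    logs = [math.log(observed[pair] / estimated[pair])
            for pair in observed
            if (province[pair[0]] == province[pair[1]]) == same_province
            and observed[pair] > 0 and estimated[pair] > 0]
    return sum(logs) / len(logs)

# Hypothetical cities a, b (province X) and c (province Y).
province = {"a": "X", "b": "X", "c": "Y"}
observed = {("a", "b"): 200.0, ("a", "c"): 50.0, ("b", "c"): 40.0}
estimated = {("a", "b"): 100.0, ("a", "c"): 50.0, ("b", "c"): 80.0}

intra = mean_log_ratio(observed, estimated, province, same_province=True)
inter = mean_log_ratio(observed, estimated, province, same_province=False)
print(intra, inter)
```

Working with log ratios treats a twofold underestimate and a twofold overestimate symmetrically, which matters when interaction strengths span several orders of magnitude.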
Human mobility patterns have been a hot research topic in many areas. However, existing studies do not differentiate movements at different spatial scales. In particular, owing to data limitations, few studies have investigated nationwide inter-urban trips. For the first time, this research adopts check-in data to analyze inter-urban movements. Our findings include the following four aspects. First, the inter-urban displacements follow an exponential distribution and do not have a heavy-tail property. This distribution is similar to that observed in intra-urban movements. Liu et al. suggested that the geographical environment is a reason for the thin tail in intra-urban displacement distributions. For inter-urban trips
Second, the spatial interactions reflected by the check-in data can be well fitted by the gravity model. This confirms again the power law distance decay effect in spatial interactions, which has been observed from many different data sets. Some existing research has argued that the gravity model cannot predict spatial interactions well if the place populations are directly used as the masses in the model. This research, on the contrary, illustrates that fitting the gravity model to estimate both the places’ theoretical sizes and the distance decay function is an appropriate approach.
Third, this research points out the connection between spatial interactions and human mobility patterns. The distance decay function
Last, by constructing a spatially-embedded network from the check-in data, we regionalize China’s territory using a community detection method. The result exhibits a similar pattern to previous studies, in which most communities are spatially contiguous and coincide with geographical units (provinces in the case of this research). Such patterns can also be attributed to the distance decay effect, which generally leads closer cities to form stronger connections and thus to be clustered together. We also find a difference between the distance decay effects in intra-province and inter-province trips. It is this difference that makes interactions between cities in the same province relatively stronger, so that these cities are classified into the same community.
Human mobility patterns and spatially-embedded networks have drawn much attention in recent complexity science studies, where much literature focuses on finding the underlying geographical impacts. Meanwhile, spatial interactions at different spatial scales are widely investigated in geographical analyses. Distance obviously plays an important role in human mobility patterns, spatial interactions, and spatially-embedded networks. The distance decay effect decreases the probabilities of long-distance movements as well as the interaction strengths between faraway places, and consequently shapes the topological structures of spatial networks. Based on an empirical data set, this research makes an initial effort to bridge the three concepts using the distance decay effect. Conversely, with the rapid development of complexity science, human mobility patterns and spatially-embedded networks provide a new perspective and new tools to revisit conventional geographical analyses. This is especially valuable in the era of big data since it is becoming easier for us to collect various data for representing movements, measuring interactions, and constructing spatial networks.
We thank F. Wang, D. Tong, and L. Yin for useful comments, J. Wang for providing the flight passenger data, X. Liu for running the community analysis, and N. Henry for editing the manuscript.