César A Hidalgo is a PLOS ONE Editorial Board member. This does not alter the authors' adherence to all the PLOS ONE policies on sharing data and materials.
Conceived and designed the experiments: CAH. Performed the experiments: PS. Analyzed the data: CAH PS. Contributed reagents/materials/analysis tools: CAH PS KS. Wrote the paper: CAH KS PS. Original Idea: CAH.
A traveler visiting Rio, Manila or Caracas does not need a report to learn that these cities are unequal; she can see it directly from the taxicab window. This is because in most cities inequality is conspicuous, but also because cities express different forms of inequality that are evident to casual observers. Cities are highly heterogeneous and often unequal with respect to the income of their residents, but also with respect to the cleanliness of their neighborhoods, the beauty of their architecture, and the liveliness of their streets, among many other evaluative dimensions. Until now, however, our ability to understand the effect of a city's built environment on social and economic outcomes has been limited by the lack of quantitative data on urban perception. Here, we build on the intuition that inequality is partly conspicuous to create quantitative measures of a city's contrasts. Using thousands of geo-tagged images, we measure the perception of safety, class and uniqueness in the cities of Boston and New York in the United States, and Linz and Salzburg in Austria. We find that the range of perceptions elicited by the images of New York and Boston is larger than the range of perceptions elicited by images from Linz and Salzburg. We interpret this as evidence that the cityscapes of Boston and New York are more contrasting, or unequal, than those of Linz and Salzburg. Finally, we validate our measures by exploring the connection between them and homicides, finding a significant correlation between the perceptions of safety and class and the number of homicides in a NYC zip code, after controlling for the effects of income, population, area and age. Our results show that online images can be used to create reproducible quantitative measures of urban perception and characterize the inequality of different cities.
In “The Image of The City”, Kevin Lynch defines the city as a form of temporal art
Neighborhoods often differ in their demographics, such as the income and ethnicity of the people who inhabit them, but also in how safe they feel, how clean they are, how historical they look, and how lively they are, among many other evaluative dimensions
In this paper, we present a high-throughput method to quantify people's perception of cities, and their neighborhoods, and use it to measure the perceptual inequality of Boston, New York, Linz and Salzburg. The method is based on image ratings created from the pairwise comparison of images in response to evaluative questions, such as “Which place looks safer?” or “Which place looks more upper-class?” The data show that the range of perceptions elicited by images from Boston and NYC is wider than the range of perceptions elicited by the images of Linz and Salzburg. Finally, we validate our measures of urban perception by studying the correlation between urban perception and homicides in New York City, finding a significant correlation between violent crime and urban perception after controlling for income, population, area and age.
We conclude that the method presented in the paper is able to capture information about a city's built environment that is relevant for the experiences of citizens, and not fully contained in income-based measures. Moreover, we conclude that these measures can be used to estimate the contrasts – or inequality – of a city's built environment with respect to these evaluative dimensions.
Cities, and their neighborhoods, are complex entities that weave together the physical components of the built environment, and the social interactions of the citizens that inhabit them. Yet, the study of cities does not belong to a unified stream of literature, but largely to two parallel branches. On the one hand, we have the literature advanced by urban planners and architects, and on the other, we have the literature advanced by social scientists and natural scientists.
The literature advanced by architects and urban planners puts special emphasis on a city's built environment. During the 20th century, the development of this literature was punctuated by a series of movements, which have resulted in cities combining different architectural and planning styles
The literature of architects and urban planners has also been active in the creation of measurements of urban perception along a number of different evaluative dimensions
Within the social sciences, the study of cities has focused mostly on the connection between demographic and economic variables, with the physical appearance of the built environment playing little or no role. The literature advanced by economists, for instance, has focused on the creation of mathematical models, such as those involved in the new economic geography of Krugman, Fujita and Venables
Natural scientists, on the other hand, have a different focus than economists, but also rely on quantitative methods that do not incorporate the aesthetic features of the cities they study. Notable examples here include the study of the fractal growth of cities
Finally, the most direct connection between these two streams of literature is the work of Jane Jacobs
The Broken Windows Theory (BWT) of Wilson and Kelling
The BWT has also been politically influential. For instance, it was cited as a justification for New York City's quality-of-life initiative
Providing evidence to prove or disprove the BWT, however, has not been easy. In fact, several observational and longitudinal studies have argued both for and against the BWT
In recent years, the BWT has also been linked to health. For example, cases of gonorrhea in New Orleans have been shown to correlate more strongly with an index of neighborhood disorder than with an index of neighborhood poverty
All of these studies explore the link between people's perception of urban environments and social outcomes. Yet, the focus of this literature has been mainly on the association between crime and disorder, when this is only one of the many potential associations between the urban environment and social outcomes that can be of interest. In effect, urban landscapes are complex enough to demand a number of evaluative dimensions to be characterized
We collected data on urban perception by using 4,136 geo-tagged images from four cities (# of images): New York City (1,706) and Boston (1,236) in the United States; and Salzburg (544) and Linz (650) in Austria, (
Perception data was collected using a website created for the study (
We selected the phrasing “Which place looks more X?” because it reflected more accurately what could be evaluated from an image. We note that similar questions have been asked in preceding evaluative studies (
Some limitations of the data include the constrained amount of information that is captured in an image, since other sensory channels that can affect perception, such as sound and smell, are absent in pictographic depictions. Also, variation in image quality (i.e. contrast, hue, saturation, brightness, tint and clarity), as well as the time of day, and weather conditions, can introduce additional sources of variation in the perceptions associated with a digital image. We therefore interpret the urban perception data collected through this method as a proxy for the perceptions elicited by the actual locations
Finally, we note that the mapping between images and locations is not one-to-one. In fact, for a large number of locations we captured more than one image, by pointing the camera in two or more directions. Hence, many locations are characterized by more than one quantitative value, usually two. We captured more than one image for many locations to account for the variability introduced by using images that are not 360-degree representations of a place, but 90-degree wedges.
We scored each image using the fraction of times it got selected over another image, corrected by the “win” and “loss” ratios of all images with which it was compared. This correction allowed us to adjust for the “strength of schedule”
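The scoring idea described above can be sketched in a few lines. This is a minimal illustration rather than the exact procedure used in the study: the function name `q_scores`, the handling of ties (ignored here), and the particular form of the correction, including the 10/3 factor that maps scores onto a 0 to 10 scale, are assumptions made for the example.

```python
from collections import defaultdict


def q_scores(votes):
    """Score images from pairwise votes.

    votes: iterable of (winner_id, loser_id) pairs. Ties are not modeled
    in this sketch (an assumption).
    Returns a dict mapping image id -> score on a 0-10 scale.
    """
    wins = defaultdict(list)    # image -> list of images it was chosen over
    losses = defaultdict(list)  # image -> list of images chosen over it
    for winner, loser in votes:
        wins[winner].append(loser)
        losses[loser].append(winner)
    images = set(wins) | set(losses)

    # Raw win and loss ratios of every image.
    win_ratio = {}
    loss_ratio = {}
    for i in images:
        n = len(wins[i]) + len(losses[i])
        win_ratio[i] = len(wins[i]) / n
        loss_ratio[i] = len(losses[i]) / n

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    scores = {}
    for i in images:
        # "Strength of schedule": reward beating images that win often,
        # penalize losing to images that lose often.
        beaten = mean([win_ratio[j] for j in wins[i]])
        lost_to = mean([loss_ratio[j] for j in losses[i]])
        scores[i] = (10.0 / 3.0) * (win_ratio[i] + beaten - lost_to + 1.0)
    return scores


# Hypothetical votes: "a" beats "b" and "c"; "b" beats "c".
example = q_scores([("a", "b"), ("a", "c"), ("b", "c")])
```

With this form, an image that always wins against images that always win would score 10, and one that always loses against images that always lose would score 0.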
We test the inter-rater, or inter-observer reproducibility of
Finally, we test the internal consistency of the perceptions collected by looking at their transitivity. We find that the overall level of transitivity of our data is high (86.76% for safety, 87.00% for social-class, and 83.34% for uniqueness).
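One way to make the transitivity check concrete is sketched below. The definition used here is an assumption, since the text does not spell it out: among all ordered triples (A, B, C) in which A was chosen over B, B was chosen over C, and A and C were also compared, we count the fraction in which A was chosen over C. The sketch also assumes each pair of images was compared at most once.

```python
from itertools import permutations


def transitivity(votes):
    """Fraction of comparable triples that are transitive.

    votes: iterable of (winner_id, loser_id) pairs, each pair of images
    assumed to appear at most once (an assumption of this sketch).
    """
    beats = set(votes)
    images = {image for pair in beats for image in pair}
    consistent = 0
    total = 0
    # Brute-force O(n^3) scan; adequate for illustration only.
    for a, b, c in permutations(images, 3):
        if (a, b) in beats and (b, c) in beats:
            if (a, c) in beats or (c, a) in beats:  # a and c were compared
                total += 1
                if (a, c) in beats:
                    consistent += 1
    return consistent / total if total else float("nan")
```

For a dataset of thousands of images one would restrict the scan to triples that were actually observed rather than enumerating all permutations.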
As a rule of thumb, we find that between 22 and 32 votes per image are needed to produce a ranking with B>75% for each of the three questions.
One important concern that needs to be addressed here is the possible bias in the measures arising from the demographics of the participants who joined the online experiment. To test for this, participants were asked to self-report age and gender after contributing five clicks. Self-reporting was high, with 97.1% of the participants providing answers for age and gender. From these, 76.0% identified themselves as male and 21.1% as female. The median self-reported age was 28 years. Finally, participants were geo-located using their IP addresses and the 7,872 unique IP addresses were located in 91 countries.
We test the significance of possible biases by comparing the Q-scores estimated using different subsets of participants. We do this for participants' age (above and below the median), gender (male and female), and location (United States vs non-United States). As controls, we show the correlations obtained for random subsets of participants of the same size (
We begin by asking whether perceptions of safety, class and uniqueness are perfectly collinear, or whether they have significant orthogonal components.
Next, we use Q to measure the contrast or inequality of urban perception. We begin this by asking: how wide is the range of perceptions elicited by the images of one city vis-a-vis another?
Statistic | Linz | Salzburg | Boston | NYC | Manhattan | Queens | Brooklyn
Mean Q, safety | 4.85 | 4.76 | 4.94 | 4.47 | 5.13 | 4.46 | 4.23
Mean Q, uniqueness | 4.84 | 5.04 | 4.77 | 4.46 | 5.21 | 4.26 | 4.31
Mean Q, class | 5.01 | 4.89 | 4.97 | 4.31 | 5.17 | 4.22 | 4.06
Std. dev. of Q, safety | 0.80 | 0.88 | 1.48 | 1.41 | 1.25 | 1.35 | 1.44
Std. dev. of Q, uniqueness | 0.93 | 0.90 | 1.22 | 1.18 | 1.17 | 1.06 | 1.16
Std. dev. of Q, class | 0.90 | 0.99 | 1.62 | 1.53 | 1.38 | 1.39 | 1.57
Difference in Means
T-test for equal means with unequal variances.
Safety (p-values) | Salzburg | Boston | NYC | Manhattan | Queens | Brooklyn
Linz | 0.0482** | 0.1152 | 0.0000*** | 0.0004*** | 0.0000*** | 0.0000***
Salzburg | | 0.0015*** | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000***
Boston | | | 0.0000*** | 0.0201** | 0.0000*** | 0.0000***
NYC | | | | 0.0000*** | 0.9193 | 0.0001***
Manhattan | | | | | 0.0000*** | 0.0000***
Queens | | | | | | 0.0028***
Unique (p-values) | Salzburg | Boston | NYC | Manhattan | Queens | Brooklyn
Linz | 0.0001*** | 0.1547 | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000***
Salzburg | | 0.0000*** | 0.0000*** | 0.0342** | 0.0000*** | 0.0000***
Boston | | | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000***
NYC | | | | 0.0000*** | 0.0003*** | 0.0033***
Manhattan | | | | | 0.0000*** | 0.0000***
Queens | | | | | | 0.4156
Class (p-values) | Salzburg | Boston | NYC | Manhattan | Queens | Brooklyn
Linz | 0.0317** | 0.4844 | 0.0000*** | 0.0670* | 0.0000*** | 0.0000***
Salzburg | | 0.2129 | 0.0000*** | 0.0019*** | 0.0000*** | 0.0000***
Boston | | | 0.0000*** | 0.0291** | 0.0000*** | 0.0000***
NYC | | | | 0.0000*** | 0.2114 | 0.0002***
Manhattan | | | | | 0.0000*** | 0.0000***
Queens | | | | | | 0.0535*
F-test
Safety (p-values) | Salzburg | Boston | NYC | Manhattan | Queens | Brooklyn
Linz | 0.0257** | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000***
Salzburg | | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000***
Boston | | | 0.0633** | 0.0003*** | 0.0216** | 0.4562
NYC | | | | 0.0091*** | 0.2913 | 0.4144
Manhattan | | | | | 0.1296 | 0.0034***
Queens | | | | | | 0.1210
Unique (p-values) | Salzburg | Boston | NYC | Manhattan | Queens | Brooklyn
Linz | 0.3764 | 0.0000*** | 0.0000*** | 0.0000*** | 0.0018*** | 0.0000***
Salzburg | | 0.0000*** | 0.0000*** | 0.0000*** | 0.0001*** | 0.0000***
Boston | | | 0.2511 | 0.4196 | 0.0003*** | 0.1611
NYC | | | | 0.8950 | 0.0037*** | 0.6196
Manhattan | | | | | 0.0445** | 0.8383
Queens | | | | | | 0.0252**
Class (p-values) | Salzburg | Boston | NYC | Manhattan | Queens | Brooklyn
Linz | 0.0279** | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000***
Salzburg | | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000*** | 0.0000***
Boston | | | 0.0164** | 0.0004*** | 0.0000*** | 0.3293
NYC | | | | 0.0257** | 0.0113** | 0.2980
Manhattan | | | | | 0.9122 | 0.0066***
Queens | | | | | | 0.0024***
Significance thresholds * p<0.1 **p<0.05 ***p<0.01.
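The two kinds of pairwise comparison tabulated above, a t-test for equal means with unequal variances and an F-test on variances, rest on simple statistics that can be sketched as follows. This is an illustrative sketch using only the Python standard library; the function names and the toy samples are hypothetical, and p-values are deliberately not computed here, since they would be obtained from the t and F distributions with the degrees of freedom returned below.

```python
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic for two samples with unequal variances.

    Returns (t, df), where df is the Welch-Satterthwaite approximation.
    """
    vx = variance(x) / len(x)  # squared standard error of mean(x)
    vy = variance(y) / len(y)  # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / (vx + vy) ** 0.5
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def f_ratio(x, y):
    """Ratio of sample variances with its two degrees of freedom."""
    return variance(x) / variance(y), len(x) - 1, len(y) - 1


# Hypothetical Q-scores for two cities: same mean, different spread.
narrow = [4.0, 5.0, 6.0]
wide = [1.0, 5.0, 9.0]
t_stat, t_df = welch_t(narrow, wide)
f_stat, df_num, df_den = f_ratio(narrow, wide)
```

In this toy case the means coincide, so the t statistic is zero, while the variance ratio differs strongly from one; this mirrors the pattern in which two cities can have similar average perceptions but different ranges.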
Next, we study the segregation of urban environments by asking if the places associated with similar perceptions of safety, social-class and uniqueness co-locate, and if so, to what extent. In principle, a wider range of values is observed for Boston and NYC, but these could be spatially intermixed rather than clustered. To measure the spatial segregation of perceptions we use Moran's I statistic
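For readers unfamiliar with the statistic, a minimal sketch of Moran's I is given here. The weighting scheme is an assumption made for the example: two images are treated as neighbors (weight 1) when they lie within a distance `radius` of each other in planar coordinates, and as non-neighbors (weight 0) otherwise. The function name and the toy data are hypothetical.

```python
import math


def morans_i(points, values, radius):
    """Moran's I with binary distance-band weights.

    points: list of (x, y) coordinates in planar units (assumption).
    values: one score per point, e.g. a Q-score.
    radius: pairs closer than or equal to this distance get weight 1.
    """
    n = len(values)
    mean_value = sum(values) / n
    dev = [v - mean_value for v in values]
    cross = 0.0         # sum of w_ij * dev_i * dev_j
    weight_total = 0.0  # sum of all weights w_ij
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                weight_total += 1.0
                cross += dev[i] * dev[j]
    squared = sum(d * d for d in dev)
    if weight_total == 0 or squared == 0:
        return float("nan")  # undefined without neighbors or variation
    return (n / weight_total) * (cross / squared)


# Four hypothetical locations on a line, one unit apart.
line = [(0, 0), (1, 0), (2, 0), (3, 0)]
clustered = morans_i(line, [1, 1, 5, 5], radius=1)    # similar values adjacent
alternating = morans_i(line, [1, 5, 1, 5], radius=1)  # dissimilar values adjacent
```

Positive values indicate that similar scores sit next to each other, negative values that neighboring scores tend to differ. Repeating the calculation for increasing radii is one way to probe the distance over which perceptions remain correlated.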
Getis Spatially Filtered Regression. Dependent Variable -> Log (Number of Homicides in Zip Code +1)
DEMOGRAPHICS: Population and Area, Income and Age | URBAN PERCEPTION: Safety, Class
Model | Statistic | Population and Area | Income and Age | Safety | Class | Variance explained
MODEL 1 | Coefficient | 0.262**, 0.188*, 0.569***, −0.419 | −0.954***, −0.453, −1.559**, −14.89** | | | 69.9%
MODEL 1 | t-statistic | 2.298, 2.798, 5.416, −0.510 | −4.820, −0.345, −2.486, −2.216 | | |
MODEL 1 | p-value | 0.024, 0.075, 0.000, 0.611 | 0.000, 0.731, 0.015, 0.029 | | |
MODEL 2 | Coefficient | 0.868***, −0.599, −0.181, −0.220 | | −0.181***, −1.033***, −0.109, −0.416 | | 47.8%
MODEL 2 | t-statistic | 6.362, −0.453, −1.132, −1.465 | | −3.655, −2.746, −1.269, −0.745 | |
MODEL 2 | p-value | 0.000, 0.651, 0.260, 0.146 | | 0.000, 0.007, 0.208, 0.458 | |
MODEL 3 | Coefficient | 0.833***, −1.046, −0.208, −0.260** | | | −0.180***, −0.713**, 0.045, 0.728 | 48.3%
MODEL 3 | t-statistic | 6.115, −0.711, −1.316, −1.742 | | | −3.801, −2.262, 0.053, 1.292 |
MODEL 3 | p-value | 0.000, 0.479, 0.191, 0.085 | | | 0.000, 0.026, 0.600, 0.200 |
MODEL 4 | Coefficient | 0.837***, −1.737, −0.222, −0.257* | | −0.073, −0.856, −0.213**, −2.480** | −0.144, 0.118, 0.189*, 2.560*** | 52.9%
MODEL 4 | t-statistic | 5.995, −1.171, −1.407, −1.733 | | −0.681, −0.073, −1.976, −2.535 | −1.379, 0.116, 1.732, 2.696 |
MODEL 4 | p-value | 0.000, 0.245, 0.163, 0.086 | | 0.497, 0.484, 0.051, 0.013 | 0.171, 0.908, 0.086, 0.008 |
MODEL 5 | Coefficient | 0.392***, 0.347***, 0.481***, 0.936 | −1.183***, −2.252, −1.545***, −22.45*** | −0.035, −2.717***, −0.210***, −1.103 | 0.033, 3.511***, 0.180**, 1.259* | 79.4%
MODEL 5 | t-statistic | 3.172, 2.957, 4.774, 0.803 | −5.642, −1.228, −2.686, −3.033 | −0.465, −3.089, −2.732, −1.341 | 0.444, 4.487, 2.336, 1.917 |
MODEL 5 | p-value | 0.002, 0.004, 0.000, 0.424 | 0.000, 0.223, 0.009, 0.003 | 0.643, 0.003, 0.008, 0.183 | 0.658, 0.000, 0.020, 0.058 |
The dependent variable is the base-10 logarithm of one plus the number of homicides in a zip code. The plus one was added to include zip codes in which the number of homicides is zero. Significance thresholds are: * p<0.1 **p<0.05 *** p<0.01.
NYC is found to be the city with the largest autocorrelation lengths, with l>4.75 [km] for all three questions. Boston's mean autocorrelation length across the three questions is l>2.00 [km], whereas Linz and Salzburg have characteristic lengths of 1.6 [km] or less. This shows that locations associated with similar perceptions form larger spatial clusters in NYC (
Finally, we use homicide data for NYC to look at the correlation between the urban perception of inequality and homicides. We note from the start that our intention is not to make a causal statement, but simply to use this correlation to validate the value of the information contained in our measures of urban perception. Because of the spatial nature of the dataset, we use Getis Spatially Filtered Regression (GSFR)
GSFRs solve this problem by using a transformation that splits each variable x into two estimates: one capturing the spatial variation of the variable (
Finally, a GSFR regression is an OLS regression where each variable
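To illustrate the kind of decomposition involved, the sketch below implements the Getis filter as it is commonly stated in the spatial statistics literature: the filtered value is x_i multiplied by W_i/(n−1) and divided by G_i(d), where W_i is the sum of the weights of location i and G_i(d) is the share of the total of x (excluding i) found within distance d of i. The spatial component is the difference between the original and the filtered value. The binary distance-band weights, the planar coordinates, the function name and the requirement that all values be positive are assumptions of this example, not details taken from the study.

```python
import math


def getis_filter(points, x, radius):
    """Split a positive variable into filtered and spatial components.

    points: list of (x, y) planar coordinates (assumption).
    x: positive values, one per point.
    radius: distance d defining the binary neighborhood (assumption).
    Returns (filtered, spatial) with filtered[i] + spatial[i] == x[i].
    """
    n = len(x)
    total = sum(x)
    filtered = []
    spatial = []
    for i in range(n):
        neighbors = [
            j for j in range(n)
            if j != i and math.dist(points[i], points[j]) <= radius
        ]
        if not neighbors:
            raise ValueError(f"point {i} has no neighbors within radius")
        w_i = len(neighbors)  # sum of binary weights of location i
        # G_i(d): share of the variable (excluding i) located near i.
        g_i = sum(x[j] for j in neighbors) / (total - x[i])
        x_filtered = x[i] * (w_i / (n - 1)) / g_i
        filtered.append(x_filtered)
        spatial.append(x[i] - x_filtered)
    return filtered, spatial


# Hypothetical zip-code centroids on a line, one unit apart.
centroids = [(0, 0), (1, 0), (2, 0), (3, 0)]
flat_f, flat_s = getis_filter(centroids, [2.0, 2.0, 2.0, 2.0], radius=1)
```

When the variable has no spatial pattern, as in the uniform example, the filtered component equals the original variable and the spatial component vanishes. Both components would then enter an ordinary least squares regression as separate regressors.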
Model 5 explains nearly 80% of the variation of homicides across zip codes. This is roughly 10 percentage points more than what is explained by income, age, population and area alone, rising from 69.88% (model
Overall, we find that in the full model (model
Finally, we notice that the regression coefficients of the safety variables are negative (safer looking, less crime), whereas those of class are positive (classier looking, more crime). As expected, coefficients of safety and class are negative when introduced individually (models
The way a city looks is of central importance for the daily experience of billions of city-dwellers. Yet until now, the availability of data about urban perception has been limited, and so has our ability to compare cities with respect to how they are perceived. In this paper, we presented a method to measure urban perception and found that the cities of Boston and NYC differ from the Austrian cities of Linz and Salzburg in two important dimensions. First, the perceptions recorded for Boston and NYC are distributed more broadly than the perceptions elicited by the images from Linz and Salzburg. Second, positive and negative perceptions cluster more strongly in the two American cities than in their European counterparts. This means that the recorded gap between “good” and “bad” neighborhoods is larger in NYC and Boston, and that both positively and negatively evaluated images cluster more in these American cities than in their Austrian counterparts. Finally, we showed that the inequality of perceptions helps explain the location of violent crime across NYC zip codes, even after controlling for income, population, area and age.
As the world gears towards building cities for hundreds of millions of individuals, the imperative of understanding cities becomes ever more important
We would like to thank Deepak Jagdish for compiling and organizing the dataset before release. We would also like to thank Kiran Bhattaram, David Gelvez, Sep Kamvar, Kent Larson, Evan Marshall, Shahar Ronen, Alex Simoes, Paul Sawaya, Michael Xu and Michael Wong for their comments and expertise. We acknowledge support from the MIT Media Lab consortia, and the ABC Career Development chair.