Sprawl has long been characterized as urban pathology, a signifier of unchecked development which consumes an excess of resources through land speculation and low-density dispersion. Research has convincingly linked it to such diverse phenomena as automobile-reliant travel behavior (Bento et al. 2005; Ewing and Cervero 2010), lower public transit ridership (Taylor et al. 2009), higher public infrastructure costs (Speir and Stephenson 2002), pronounced electrical energy use (Ewing and Rong 2008), increased greenhouse gas emissions (Clark 2013; Grazi, van den Bergh, and van Ommeren 2008; Hankey and Marshall 2010), localized particulate pollution and air quality (Stone 2008; Stone et al. 2007), elevated obesity rates (Zhao and Kaestner 2010), and the spatial mismatch between poor populations and employment opportunities (Covington 2009), among others (but see Brueckner 2001; Glaeser and Kahn 2004, for more positive assessments). Yet defining exactly what we should consider to be sprawl—often alternately described using the “urban” or “suburban” modifier—has proven to be more elusive than illustrating its general relationship (however measured) with these outcomes of interest.
First, I examine current measures of sprawl and highlight their strengths and weaknesses. I then explain the new index and use it to describe the morphology and recent growth trends of large metropolitan areas in the United States. I compare this measure with existing ones by examining their correlations with environmental and housing outcomes, and statistically model these relationships using both standard multivariate and first-difference regressions, offering provisional but suggestive evidence of the predictive power of the index outlined here. Finally, I conclude with a discussion of the findings and suggestions for future research.
Measuring a Nebulous Concept
While sprawl may be an intuitively simple concept in the abstract, quantification and empirical investigation are complicated by a lack of concrete parameters and the litany of different ways to gauge where, when, how, and to what relative extent it occurs. As the subject became popularized in academic discourse, researchers began conceptualizing and measuring sprawl using multidimensional frameworks which reflected the growing complexity of its theorization. Geographic Information Systems (GIS) eventually provided researchers with the technological resources to incorporate spatial data into empirical measurements by offering mathematical renderings of urban morphology (e.g., “centeredness,” contiguity, etc.). Galster et al. (2001) helped pioneer this multifaceted spatial approach, initially proposing six dimensions of sprawl: density (residential units per square mile of developable land), concentration (whether housing is distributed evenly over the urban area), clustering (the degree to which development is distributed evenly within subareas), centrality (how close development is in relation to the CBD), nuclearity (whether the urban area is mono- or polycentric—that is, the number of nodes constituting robust development), and proximity (the degree to which predominantly residential or nonresidential square-mile grids are geographically close to one another). Cutsinger et al. (2005) updated and expanded the analytical approach by adding a mixed-use development metric—related to but distinct from proximity, as it is a measure of the jobs-to-housing ratio within subareas, as opposed to the distances between them—and examined the sprawl profiles for 50 metropolitan areas in the United States.
The authors found that urban areas often ranked highly in some dimensions while scoring low on others, suggesting a complex portrait of metropolitan morphology and growth, and argued that whether an area can be described as sprawling depends on which factors are being considered (e.g., overall density, CBD-proximate development, etc.).
Ewing, Pendall, and Chen (2003) took a similar but more streamlined approach, calculating a composite measure which included subindices representing density, centrality, mixed land use, and street accessibility (i.e., shorter or longer city blocks). The most recent incarnation (Ewing and Hamidi 2014) adds employment and walkability data in constructing the subindices. Similar to Cutsinger et al. (2005), Ewing and Hamidi incorporate numerous variables related to spatial morphology into their measure, among them centralized development (a measure of compact monocentric growth), density gradients (how fast density declines with distance from the CBD), street accessibility (average city block size), and “centering” measures (the proportion of population and employment within CBDs and subcenters). Using standardized scores to neutralize the influence of overall population size, the authors calculate sprawl indices for counties, metropolitan areas, and urbanized areas in the United States. They also demonstrate the face validity of their measure by regressing a number of outcome variables (e.g., housing affordability, obesity rates, etc.) on the composite measure and its subindices, a rarity in the literature.
These approaches have many strengths, first and foremost the complex way they statistically render many distinct aspects of urban form. Researchers who grapple with specific empirical questions can use these indices to parse out which morphological factors most precisely associate with their outcomes of interest both theoretically and empirically, rather than necessarily having to use composite or reductive measures. The index offered by Ewing and colleagues in particular has been brought to bear on numerous research projects exploring public health and energy use outcomes, establishing a track record in the literature (e.g., Ewing and Hamidi 2014; Ewing, Pendall, and Chen 2003; Ewing and Rong 2008). Yet multidimensional spatial approaches also have their weaknesses, ranging from the relatively minor (e.g., incorporating proprietary data, like walking scores in Ewing and Hamidi’s metro land-use mix subindex) to the more significant (relying on numerous measures which quantify urban centrality or concentration, which can provide misleading results).
Incorporating measures of centrality in particular into these composite measures tacitly positions monocentric morphology (i.e., urban areas that radiate outward from a dominant CBD) as less sprawling than polycentric forms (i.e., employment and development that are multinodal rather than concentrated at one or few points within a given region). Yet there is a paucity of empirical evidence that polycentric development is, in practice, less desirable in terms of ecological, economic, or general welfare outcomes. As researchers have pointed out with respect to commuting specifically, the general balance between housing and jobs in a given subarea is a more important influence on transportation behavior than proximity to a given CBD (Buliung and Kanaroglou 2006; Modarres 2011). As Gaigné, Riou, and Thisse (2012) illustrated, there are scenarios where compact monocentric growth patterns can lead to higher total travel-related emissions than polycentric ones because of the relocation of firms and residents between and within cities, which can lead to longer and more energy-intensive trips to a single CBD rather than localized commutes to a given subcenter. These multidimensional approaches largely avoid accounting for these possibilities.
For instance, Cutsinger et al. (2005) build housing and job centrality and monocentricity into their index rather straightforwardly, all of which are conditioned on proximity to a central city hall location. Ewing and Hamidi (2014) improved upon their previous measure of centrality (see Ewing, Pendall, and Chen 2003)—a simple measure of employment shares within concentric rings around a given CBD—by accounting for employment subcenters using a nonparametric procedure developed by McMillen (2001). This ostensibly accounts for urban areas which are polycentric by identifying development nodes other than the CBD. Specifically, they measure population and employment within subareas that meet the criteria (census block groups with significant positive residuals estimated from an exponential employment density function conditioned on distance from a given CBD). Yet this only accounts for population and employment proportions within these rather geographically limited subareas. This could in practice position urban areas with concentrated nodes of population and employment surrounded by, say, detached single-family housing, as less sprawling than regions with a greater mix of employment and residential construction and overall higher-density development. Ewing and Hamidi (2014) also included the coefficient of variation of population and employment density as part of their centering subindex, arguing that greater variation among subareas around the mean within a given metro is indicative of less sprawled development.
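The coefficient of variation referred to above is simply the standard deviation of subarea densities divided by their mean. As a minimal sketch, assuming unweighted subareas and a population (rather than sample) standard deviation, neither of which is specified in the text, the calculation might look like the following. The density values and metro names are hypothetical and chosen only to show how the statistic behaves.

```python
import statistics


def coefficient_of_variation(densities):
    """Standard deviation of subarea densities divided by their mean.

    Assumption: population standard deviation over unweighted subareas.
    """
    mean = statistics.fmean(densities)
    return statistics.pstdev(densities) / mean


# Hypothetical persons-per-square-mile values for four subareas each.
uniformly_dense_metro = [9000, 10000, 11000, 10000]
uneven_low_density_metro = [500, 800, 1000, 6000]

cv_dense = coefficient_of_variation(uniformly_dense_metro)
cv_uneven = coefficient_of_variation(uneven_low_density_metro)
print(round(cv_dense, 3), round(cv_uneven, 3))
```

Because the statistic scales dispersion by the mean, a region whose subareas are all dense returns a low value, while a mostly low-density region with one dense node returns a high one.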
The weakness of this measure can be best illustrated by the example of Atlanta and Los Angeles—the former at or near the bottom of every sprawl ranking, including Ewing and Hamidi’s, and the latter the general opposite. Using the mean and standard deviation of population density among developed census blocks in 2010 to calculate coefficients of variation, Atlanta has a significantly higher value than Los Angeles (1.149 and 0.831, respectively). This is not because Atlanta is less sprawled but because Los Angeles is more uniformly dense and, thus, has a smaller standard deviation relative to its (much higher) mean.
In a unidimensional spatial approach, Lee et al. (forthcoming) incorporated mass transit access into their Compact City Index (CCI) and used it to examine midsize Japanese cities. The CCI is a function of the population and amenity densities surrounding mass transit stops, along with the proximity of those localized zones to the CBD (as determined by land values). Although this can be a valuable tool for assessing transit access and pedestrian-oriented urban growth, it relies on a single CBD for deriving its calculations and, thus, does not account for polycentric form. Moreover, because transportation access is not deterministically driven by densities—even if it is made feasible by them—but complicated by local politics, revenues, and a number of other factors, it is less a true sprawl index than a metric of robust public transit and the composition of surrounding areas. Finally, though the CCI was designed to be more or less universally applicable, places with no centralized accounting of municipal public transportation—like the United States—pose problems with respect to measurement.
Other more purely spatial methods for measuring sprawl rely on satellite imagery of land use and do not incorporate demography. Burchfield et al. (2006) used National Land Cover Database (NLCD) data—which assign land-use categories to 30 × 30 meter grid cells that cover the United States—to construct a sprawl index based on the percentage of undeveloped land in the square kilometer surrounding a given cell. While these data can be very useful for some applications, such as describing the total area of urban development over a bounded geography, calculating sprawl with this methodology cannot precisely account for what kinds of construction characterize a cell. Are developed grid cells that are geographically distant from a central city composed of low-density single-family housing that leapfrogs along a transportation corridor, or are they characterized by multifamily construction that is separated by land which is difficult, costly, or even impossible to build on? The findings, which rank Memphis as the second least sprawling metropolitan area with a population over one million in the second-stage measurement (land use in 1992), with Dallas close behind, cast further doubt on the methodology. While counterintuitive results should not be dismissed out of hand, these results are likely an artifact of the research design (see Irwin and Bockstael 2007 for a more thorough critique).
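To make the logic of a land-cover index of this kind concrete, the sketch below computes, for each developed cell in a small binary grid, the percentage of undeveloped cells in a surrounding square window, and then averages across developed cells. It is only an illustration under stated assumptions: the averaging step, the inclusion of the focal cell in its own window, the clipping of windows at the grid edge, and the tiny window radius (standing in for the square kilometer around a 30-meter cell) are all choices made here, not details taken from the text. The grids and function names are hypothetical.

```python
def undeveloped_share(grid, row, col, radius):
    """Percent of cells in the square window around (row, col) coded 0 (undeveloped)."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    undeveloped = 0
    # Assumption: windows are clipped at the grid edge and include the focal cell.
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            total += 1
            if grid[r][c] == 0:
                undeveloped += 1
    return 100.0 * undeveloped / total


def land_cover_sprawl_index(grid, radius):
    """Mean undeveloped share across all developed cells (coded 1)."""
    shares = [
        undeveloped_share(grid, r, c, radius)
        for r, row in enumerate(grid)
        for c, cell in enumerate(row)
        if cell == 1
    ]
    return sum(shares) / len(shares)


# Hypothetical 4 x 4 land-cover grids: 1 = developed, 0 = undeveloped.
compact = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
scattered = [
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]

compact_score = land_cover_sprawl_index(compact, radius=1)
scattered_score = land_cover_sprawl_index(scattered, radius=1)
print(round(compact_score, 2), round(scattered_score, 2))
```

Both grids contain the same number of developed cells, and nothing in the input records whether a cell holds a detached house or an apartment block, which is the limitation raised above.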
Bereitschaft and Debbage (2014) likewise used NLCD data to quantify fragmented development in large urban areas in the United States. While their approach is more sophisticated, using nine different indicators of continuity and shape complexity (e.g., contiguity, landscape shape, etc.), the same weaknesses apply because the researchers do not account for the underlying composition of the urbanized patches of land. Using this measure, New Orleans and Buffalo are less sprawled than Boston or New York merely because they are more uniform in shape. Jaeger and Schwick (2014) offered a similarly sophisticated rendering of urban form and used it to construct a sprawl measure they termed Weighted Urban Proliferation (WUP), using it to explore sprawl in Switzerland. Unlike other approaches that rely on satellite data, WUP does account for total population, which offers a clearer and more comprehensive illustration of development than spatial morphology of urbanized patches of land alone. Still, the index relies on a specification of a bounded area around the land patches which the authors term the “horizon of perception,” and it is unclear how such areal units would be determined for a larger-scale intermetro comparison, which includes places of vastly different size and underlying topography. Moreover, the incorporation of total regional population into the calculation does not address the granular composition of subareas which constitute a given urban area.
Recent work by Paulsen (2014) and Tsai (forthcoming) focused on changes in regional sprawl patterns over time. The former describes changes in housing density using four variables: overall change in housing unit density, marginal land consumption of each new housing unit, the density of housing in newly urbanized areas, and the percentage of net new housing construction in places already urbanized. Tsai develops a sprawl index which expresses the proportion of metro population in low- and high-density subareas (i.e., the percentage of population in the top and bottom quintiles, based on subarea density distributions computed for each metro). Tsai’s measure need not be expressed dynamically because, unlike Paulsen’s land consumption approach, it is based on discrete sprawl scores calculated at different time points. Nevertheless, it pegs thresholds to regional percentile scores rather than establishing universal cut points, making it more suitable for examining changes over time within individual urban areas as opposed to illustrating the differences between them. Although both methods offer valuable tools for analyzing the changing nature of sprawl and urban development, they are less useful for deciphering these cross-sectional interurban differences.
Lopez and Hynes (2003) offered a simple density-based approach to measuring sprawl. The authors calculate the percentage of a given metropolitan area’s population which is sorted among low-density census tracts above a rural threshold (over 200 but under 3,500 persons per square mile). This method has many advantages, from its ease of calculation and straightforward data requirements to its broad applicability and replicability. Although it leaves out other theorized dimensions of sprawl, it consequently avoids the weaknesses of spatial approaches, successfully zeroes out rural or undeveloped land through a baseline cut point, and operates independently of regional size, municipal boundaries, and physical geography. Yet for all its strengths, it relies on one relatively modest threshold for high-density development (areas characterized by detached single-family housing can meet the standard of 3,500 persons per square mile with relative ease) and uses census tracts as its subarea unit of analysis, which can be large enough to skew density estimates. I build on Lopez and Hynes’ measure in attempting to retain the strengths of the approach while minimizing these weaknesses.
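The low-density share described above for Lopez and Hynes’ approach can be sketched in a few lines. This is an illustration only: it assumes that tracts at or below the rural threshold are dropped from both numerator and denominator, and the tract populations and land areas are hypothetical. The function and constant names are likewise introduced here for illustration.

```python
RURAL_MAX = 200          # persons per square mile; tracts at or below are treated as rural
LOW_DENSITY_MAX = 3500   # persons per square mile; upper bound of the low-density band


def low_density_share(tracts):
    """Percent of non-rural population living in low-density tracts.

    `tracts` is an iterable of (population, land_area_sq_mi) pairs.
    Assumption: rural tracts are excluded from the denominator.
    """
    urban_population = 0
    low_density_population = 0
    for population, area_sq_mi in tracts:
        density = population / area_sq_mi
        if density <= RURAL_MAX:
            continue  # zero out rural or undeveloped land
        urban_population += population
        if density < LOW_DENSITY_MAX:
            low_density_population += population
    return 100.0 * low_density_population / urban_population


# Hypothetical tracts: (population, land area in square miles).
example_tracts = [
    (4000, 2.0),   # 2,000 per square mile: low density
    (6000, 1.0),   # 6,000 per square mile: high density
    (300, 5.0),    # 60 per square mile: rural, excluded
    (10000, 2.0),  # 5,000 per square mile: high density
]

share = low_density_share(example_tracts)
print(share)
```

The calculation needs only population counts and land areas, which is the ease of replication noted above; it also shows how much rests on the single 3,500 cut point and on the size of the subarea unit.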