Introduction

The data included here come from a project published by the Urban Institute in July 2020 that examines the relationships between built environment and health equity indicators across 72 small and medium-sized cities in the United States.

Variables include standard demographic information, such as population, unemployment, racial composition, and the renter/homeowner distribution, as well as built environment indicators like a walkability score, a park access score, and a healthy food access score (all scaled from 1-100). Additional categorical variables include city type (Center City, Small Rural, Suburban City, or County, which in this data set applies only to Arlington County, VA) and region of the United States (Northeast, South, Midwest, or West).

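# Data wrangling, summaries, Excel import, and column-name cleaning
library(tidyverse)
library(skimr)
library(readxl)
library(janitor)
# Plotting helpers: correlograms, plot composition, color palettes
library(GGally)
library(patchwork)
library(wesanderson)

data_url <- "https://urban-data-catalog.s3.amazonaws.com/drupal-root-live/2020/07/13/Built%20Environment%20Health%20Equity%20Data%20Table.xlsx"
# mode = "wb" keeps the binary .xlsx file intact on Windows
download.file(data_url, destfile = "urban_built_env_data.xlsx", mode = "wb")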

data_initial <- read_excel("urban_built_env_data.xlsx", 
                           sheet = "City data", range = "A2:BI72")

data_initial <- clean_names(data_initial)
write_csv(data_initial, "urban_health_env.csv")
# Columns that should stay as text; every other column is converted to numeric
not_numeric <- c("city", "region", "city_type", "metro_area", "population_trend")

data <- data_initial %>% 
  mutate(across(-all_of(not_numeric), as.numeric))

region_names <- c("Midwest", "Northeast", "South", "West")
population_trend_names <- c("Growing", "Losing Population", "Plateau")
# Raw city_type values, including inconsistently capitalized variants
city_type_names_all <- c("Center City", "Suburban City", "Small Rural", "County", "Center city", "Small rural", "suburban City")
# Clean label for each raw value above (same order); duplicated labels merge into one level
city_type_names <- c("Center City", "Suburban City", "Small Rural", "County", "Center City", "Small Rural", "Suburban City")

data <- data %>% 
  mutate(region = factor(region, levels = region_names),
         population_trend = factor(population_trend, levels = population_trend_names),
         city_type = factor(city_type, levels = city_type_names_all, labels = city_type_names))

# Flip both measures so that higher values mean better outcomes
data <- data %>% 
  mutate(access_to_healthy_foods = 100 - limited_access_to_healthy_foods,
         average_ease_of_commute = 40 - average_travel_time_to_work_all)
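
The city type cleanup above relies on a detail of `factor()` that is easy to miss: when several raw values are given the same label, they are merged into a single level. Here is a minimal sketch of that behavior on a made-up vector (the values are illustrative, not rows from the Urban data), assuming R 3.4.0 or later.

```r
# Toy vector with the kind of inconsistent capitalization handled above.
# These values are invented for illustration only.
raw_city_type <- c("Center City", "Center city", "suburban City",
                   "Suburban City", "Small rural", "County")

# Every spelling that may appear in the raw column ...
variants <- c("Center City", "Suburban City", "Small Rural", "County",
              "Center city", "Small rural", "suburban City")
# ... and the clean label each spelling maps to (same order as above).
clean <- c("Center City", "Suburban City", "Small Rural", "County",
           "Center City", "Small Rural", "Suburban City")

# Duplicated labels are allowed and collapse into one level each.
city_type <- factor(raw_city_type, levels = variants, labels = clean)

levels(city_type)
# "Center City" "Suburban City" "Small Rural" "County"
table(city_type)
```

Any raw spelling that is missing from `variants` would become `NA`, so it is worth checking `sum(is.na(city_type))` after the conversion.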

First Glimpse at the Data

To get a basic understanding of a data set with so many interesting variables, I began my analysis with a correlogram. This figure shows the associations between the following variables: Walkability Score, Ease of Commute, Transportation Affordability, Park Access, Access to Healthy Foods, and Proportion of Renters.

At first glance, a number of these relationships plainly make sense (e.g., more walkable cities should be highly associated with increased access to healthy foods). However, it’s interesting that people in cities with lower walkability scores have an easier commute. Do some cities support human-scale commuting (like walking or biking) better than others?
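
For readers who want to reproduce this kind of figure, here is a minimal sketch of a correlogram built from a correlation matrix. It runs on simulated data, not the Urban data, and `walkability_score` is a hypothetical column name (the other two names match the columns created earlier). GGally's `ggcorr()` and `ggpairs()` are alternatives; I am not claiming this is exactly how the original figure was drawn.

```r
library(ggplot2)

set.seed(1)  # reproducible toy data; NOT the Urban data
n <- 60
walkability <- runif(n, 1, 100)
toy <- data.frame(
  walkability_score       = walkability,  # hypothetical column name
  access_to_healthy_foods = 0.6 * walkability + rnorm(n, sd = 10),
  average_ease_of_commute = -0.1 * walkability + rnorm(n, sd = 3)
)

# Pairwise Pearson correlations between all numeric columns.
cor_matrix <- cor(toy, use = "pairwise.complete.obs")
round(cor_matrix, 2)

# Reshape the matrix to long form so ggplot2 can draw one tile per pair.
cor_long <- as.data.frame(as.table(cor_matrix))
names(cor_long) <- c("var_x", "var_y", "r")

correlogram <- ggplot(cor_long, aes(var_x, var_y, fill = r)) +
  geom_tile() +
  geom_text(aes(label = round(r, 2))) +
  scale_fill_gradient2(limits = c(-1, 1)) +
  labs(x = NULL, y = NULL, fill = "Pearson r")
correlogram
```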

City Characteristics

Region

Because cities in the Eastern United States are generally older than cities in the West (many were built before the widespread adoption of the car), it may be interesting to investigate differences across regions. In this figure, we decompose the original correlogram into four geographical regions: Northeast, South, West, and Midwest. We might assume that the oldest cities are in the Northeast and South.

In these plots, we see little to no association between walkability and ease of commute in the Northeast. So perhaps region is not a good proxy for how likely a city is to be built at a human scale, or perhaps the assumption simply does not hold for the cities included in this data set. So, let’s look at what types of cities are in each region.
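
The regional comparison can also be checked numerically by computing the correlation within each region. This is a rough sketch on simulated data; `walkability_score` is a hypothetical column name, and the numbers it produces say nothing about the real cities.

```r
library(dplyr)

set.seed(2)  # toy data only, not the Urban data
toy <- data.frame(
  region = rep(c("Midwest", "Northeast", "South", "West"), each = 15),
  walkability_score = runif(60, 1, 100)  # hypothetical column name
)
toy$average_ease_of_commute <- 20 - 0.1 * toy$walkability_score + rnorm(60, sd = 3)

# One Pearson correlation per region, plus the number of cities behind it.
cor_by_region <- toy %>%
  group_by(region) %>%
  summarise(
    n_cities = n(),
    r_walk_commute = cor(walkability_score, average_ease_of_commute),
    .groups = "drop"
  )
cor_by_region
```

Keeping `n_cities` next to each correlation is a reminder that, with only a handful of cities per region, a single coefficient can swing a lot.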

Here we see that the Midwest has more Center Cities than any other region. We also see that the Northeast has more Suburban Cities than either the Midwest or the South. All this considered, perhaps a more worthwhile analysis could come from comparing cities by their type, either as a Center City or Suburban City.
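
A count of city types per region is a quick way to back up a comparison like this one. Below is a small sketch with hand-written toy rows (not the real city list) that tabulates `city_type` within `region`.

```r
library(dplyr)
library(tidyr)

# Hand-written toy rows for illustration; not the actual cities.
toy <- data.frame(
  region = c("Midwest", "Midwest", "Midwest", "Northeast",
             "Northeast", "South", "West", "West"),
  city_type = c("Center City", "Center City", "Suburban City", "Suburban City",
                "Suburban City", "Center City", "Center City", "Suburban City")
)

# Count cities per region and type, then spread the types into columns.
# values_fill = 0 makes missing combinations show up as zero, not NA.
type_by_region <- toy %>%
  count(region, city_type) %>%
  pivot_wider(names_from = city_type, values_from = n, values_fill = 0)
type_by_region
```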

City Type

Center Cities and Suburban Cities generally have similar associations across these indicators, with two exceptions:

  1. Center Cities have a positive association between ease of commuting and transportation affordability, whereas Suburban Cities have a negative association, and

  2. Suburban Cities have a positive association between ease of commuting and park access, whereas Center Cities have a negative association.

Regional Differences in Center Cities

Are there differences between regions in terms of transportation affordability and park access when we look only at Center Cities?

The Park Access density plot lacks uniformity among the Center Cities in different regions: each region appears to be doing its own thing. However, the Transportation Affordability density plot for Center Cities in the Northeast peaks around 17.5, whereas the other regions peak at 20 or above. Why might that be?
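
Here is a minimal sketch of how density plots like these can be drawn and how each region's peak can be located numerically. The data are simulated, `transportation_affordability` is a hypothetical column name, and the toy means were picked only to mimic the pattern described above.

```r
library(dplyr)
library(ggplot2)

set.seed(3)  # simulated values; NOT the Urban data
toy <- data.frame(
  region = rep(c("Midwest", "Northeast", "South", "West"), each = 20),
  city_type = rep(c("Center City", "Suburban City"), times = 40),
  # hypothetical column name; means chosen only to mimic the text
  transportation_affordability = c(rnorm(20, 21, 2), rnorm(20, 17.5, 2),
                                   rnorm(20, 22, 2), rnorm(20, 20, 2))
)

# Keep Center Cities only, as in the comparison above.
center_cities <- toy %>% filter(city_type == "Center City")

# Overlaid density curves, one per region.
density_plot <- ggplot(center_cities,
                       aes(transportation_affordability, fill = region)) +
  geom_density(alpha = 0.4) +
  labs(x = "Transportation affordability", y = "Density", fill = "Region")
density_plot

# Numeric location of each region's density peak (the mode of the estimate).
peaks <- center_cities %>%
  group_by(region) %>%
  summarise(
    peak = {
      d <- density(transportation_affordability)
      d$x[which.max(d$y)]
    },
    .groups = "drop"
  )
peaks
```

With so few cities per region, the peak location depends on the kernel bandwidth, so I would treat it as a rough guide and not as a precise estimate.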

Transportation Affordability and Ease of Commute in Center Cities by Region

Here, we see that more affordable transportation in Center Cities is associated with an easier commute in every region except the Northeast, where the affordability of transportation appears to have no correlation with ease of commute. If I were able to collect additional data, I would be interested in further indicators, like the number of stops in the public transit network or the total area of the network’s catchment, to better understand what drives the difference between the Northeast and the other regions of the United States.