The economist’s data analysis skillset.
Some types of relationships in space
Are there fewer restaurants further from downtown Pittsburgh?
We’re going to use Census maps and openly available data on restaurant locations to answer this question.
Census Shapefiles and OpenStreetMap
Maps are (typically) plots on two axes
> a basic map of Pittsburgh
Maps can show any level of detail available in the data
> a map of Pittsburgh Zipcodes
We can add information: colors
> a map of Pittsburgh Zipcode populations
We can add information: colors
> a map of Pittsburgh Zipcode populations: interactive!
Maps can also show points
> some restaurants in Pittsburgh!
Did the historical trade of enslaved people impact modern economic development in Africa?
Method: Uses historical data and the distance from major ports
Findings: Areas more disrupted by enslavement have lower GDP today, due to:
Implication: Historical shocks can have persistent economic effects.
Does the party of your neighbors impact your decision to vote?
My dissertation involved measuring distances between voters
Are there fewer restaurants further from downtown Pittsburgh?
> let’s get back to our question!
Are there fewer restaurants further from downtown Pittsburgh?
Steps:
Subquestion 1: how many restaurants are in each Pittsburgh zipcode?
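A minimal sketch of Subquestion 1, assuming each restaurant record has already been matched to a zipcode (with point data, that match would typically come from a spatial join against the Census zipcode shapes). The records below are hypothetical placeholders, not the actual restaurant data.

```python
from collections import Counter

# Hypothetical restaurant records: (name, zipcode). Real records would come
# from the restaurant-location data after matching each point to a zipcode.
restaurants = [
    ("Restaurant A", "15222"),
    ("Restaurant B", "15222"),
    ("Restaurant C", "15213"),
    ("Restaurant D", "15217"),
    ("Restaurant E", "15213"),
    ("Restaurant F", "15222"),
]


def count_by_zipcode(records):
    """Return a Counter mapping zipcode -> number of restaurants."""
    return Counter(zipcode for _name, zipcode in records)


restaurant_counts = count_by_zipcode(restaurants)
for zipcode, n in sorted(restaurant_counts.items()):
    print(zipcode, n)
```

The result is one row per zipcode, which is the unit of analysis for the rest of the question.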

Subquestion 2: how far is each zipcode from downtown?
> measure from the center (centroid) of the zipcode
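One way to measure the distance from a centroid to downtown is the haversine (great-circle) formula. The sketch below assumes centroids are already available as (lat, lng) pairs; the downtown coordinates are approximate and the centroid values are hypothetical, so treat the numbers as illustration only.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometers between two (lat, lng) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate downtown Pittsburgh coordinates (an assumption for illustration).
DOWNTOWN = (40.4406, -79.9959)

# Hypothetical zipcode centroids: zipcode -> (lat, lng).
centroids = {
    "15222": (40.4490, -79.9930),
    "15213": (40.4440, -79.9550),
    "15217": (40.4310, -79.9230),
}

# Distance from each centroid to downtown, in kilometers.
distance_km = {
    zipcode: haversine_km(lat, lng, *DOWNTOWN)
    for zipcode, (lat, lng) in centroids.items()
}
for zipcode, d in sorted(distance_km.items(), key=lambda item: item[1]):
    print(f"{zipcode}: {d:.1f} km")
```

Over a single city, a projected coordinate system would give nearly the same answer; haversine is used here only because it needs no mapping library.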
Subquestion 2: how far is each zipcode from downtown?
> what’s the distribution?
Subquestion 2: how far is each zipcode from downtown?
> we now have enough to answer our main question!
Are there fewer restaurants in areas further from downtown Pittsburgh?
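With one restaurant count and one distance per zipcode, a quick numeric check of the relationship is the correlation between the two. This is a sketch under stated assumptions: the values are hypothetical, and a correlation summarizes only a linear relationship, so it complements a scatterplot rather than replacing it.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-zipcode values, for illustration only.
distance_from_downtown_km = [1.0, 3.5, 6.0, 8.5, 11.0]
restaurant_count = [120, 60, 35, 20, 10]

r = pearson_r(distance_from_downtown_km, restaurant_count)
print(f"correlation: {r:.2f}")
```

A negative value would be consistent with fewer restaurants further from downtown; a value near zero would suggest no linear relationship.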

We’re going to use data on the location (lat, lng), population, and average temperature (avg_temp) of US cities to map temperature and examine whether latitude (north/south) is related to temperature.
US_Cities.csv and Eastern_Cities.csv
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on the employment status of individuals.
| id | status |
|---|---|
| 1 | Employed |
| 2 | Unemployed |
| 3 | Employed |
| 4 | Employed |
| 5 | Unemployed |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Barplot effectively visualizes this dataset.
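The heights of a barplot are just category counts. A minimal sketch using the five status values from the employment table, with a text bar standing in for the plotted one (the text rendering is an illustration, not a plotting method from the course):

```python
from collections import Counter

# The status column from the employment table.
statuses = ["Employed", "Unemployed", "Employed", "Employed", "Unemployed"]

# Each category's count is the height of its bar.
status_counts = Counter(statuses)

for status, n in status_counts.most_common():
    print(f"{status:<12} {'#' * n} ({n})")
```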
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on the employment sector of individuals.
| sector |
|---|
| Tech |
| Healthcare |
| Finance |
| Tech |
| Manufacturing |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Barplot effectively visualizes this dataset.
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on the educational attainment of individuals.
| level |
|---|
| High School |
| Bachelor's |
| Master's |
| High School |
| Bachelor's |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Barplot (or histogram) effectively visualizes this dataset.
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on annual individual income.
| income |
|---|
| 45000 |
| 52000 |
| 38000 |
| 65000 |
| 41000 |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Histogram (or boxplot) effectively visualizes this dataset.
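A histogram groups a numerical variable into bins and counts the values in each. The sketch below bins the five incomes from the table; the bin width of 10,000 is an assumption chosen for illustration, and a different width would change the shape.

```python
from collections import Counter

# The income column from the table.
incomes = [45000, 52000, 38000, 65000, 41000]

BIN_WIDTH = 10_000  # assumed bin width, for illustration


def histogram_counts(values, bin_width):
    """Map each bin's lower edge to the count of values in [edge, edge + bin_width)."""
    return Counter((v // bin_width) * bin_width for v in values)


income_bins = histogram_counts(incomes, BIN_WIDTH)
for lower in sorted(income_bins):
    print(f"{lower}-{lower + BIN_WIDTH}: {'#' * income_bins[lower]}")
```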
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on employment by education level.
| education | employed |
|---|---|
| High School | Yes |
| Bachelor's | Yes |
| Master's | No |
| High School | No |
| Bachelor's | Yes |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
We didn’t cover how to visualize categorical by categorical bivariate data.
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on annual individual income by education.
| education | income |
|---|---|
| High School | 35000 |
| Bachelor's | 52000 |
| Master's | 68000 |
| High School | 38000 |
| Bachelor's | 55000 |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Multi-Boxplot (or multi-linegraph-histogram) effectively visualizes this dataset.
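A multi-boxplot draws one box per category, so the underlying computation is a numerical summary within each group. A minimal sketch using the five rows from the table; with only one or two rows per group, quartiles are not meaningful, so this sketch reports just the minimum, median, and maximum.

```python
from collections import defaultdict
from statistics import median

# The (education, income) rows from the table.
rows = [
    ("High School", 35000),
    ("Bachelor's", 52000),
    ("Master's", 68000),
    ("High School", 38000),
    ("Bachelor's", 55000),
]

# Group incomes by education level.
income_by_education = defaultdict(list)
for education, income in rows:
    income_by_education[education].append(income)

# One summary per group: the values a box for that group would be built from.
summary = {
    education: {"min": min(values), "median": median(values), "max": max(values)}
    for education, values in income_by_education.items()
}
for education, stats in summary.items():
    print(education, stats)
```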
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on annual individual income by age.
| age | income |
|---|---|
| 25 | 35000 |
| 32 | 52000 |
| 28 | 42000 |
| 45 | 68000 |
| 38 | 58000 |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Scatterplot effectively visualizes this dataset.
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on US GDP between 2015 and 2019.
| year | gdp |
|---|---|
| 2015 | 18.0 |
| 2016 | 18.6 |
| 2017 | 19.5 |
| 2018 | 20.5 |
| 2019 | 21.4 |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Linegraph effectively visualizes this dataset.
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on inflation and unemployment after 2015.
| year | unemployment | inflation |
|---|---|---|
| 2015 | 5.3 | 0.1 |
| 2016 | 4.9 | 1.3 |
| 2017 | 4.4 | 2.1 |
| 2018 | 3.9 | 2.4 |
| 2019 | 3.7 | 1.8 |
Data Dimensions:
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Scatterplot (or sometimes a multilinegraph) effectively visualizes this dataset.
What are 1) the dimensions of this dataset and 2) an effective visualization?
A dataset on GDP growth by country after 2020.
| country | year | gdp_growth |
|---|---|---|
| USA | 2020 | 2.2 |
| USA | 2021 | 5.7 |
| USA | 2022 | 2.1 |
| USA | 2023 | 2.9 |
| Germany | 2020 | -4.6 |
What are 1) the dimensions of this dataset and 2) an effective visualization?
A Multi-Linegraph (or sometimes a scatterplot) effectively visualizes this dataset.