The economist’s data analysis skillset.

Dr. Taylor Weidman
taylorjweidman@pitt.edu | 4702 Posvar Hall
How economists do data analysis.

The data analysis done by economist fathers :)
The data analysis done by economists :)
Economists use data to build models and inform decisions.
Developing the data analysis pipeline used by economists.
Skillset 1: Summarize data (tables and figures).
Skillset 2: Build and interpret models (general linear model).
Skillset 3: Communicate findings (writing and presentations).
Goal: I want you to be able to build appropriate statistical models for new problems and interpret their results.
The course is divided into six parts.
Part 1: Exploring Variables
Part 2: Exploring Relationships
Part 3: Univariate General Linear Model
Part 4: Bivariate General Linear Model
Part 5: Multivariate General Linear Model
Part 6: Communicating with Data
Focus: Understanding single variables through summarization (e.g., tables and figures).
Example: Analyzing a dataset of wages.
| Wage | EduYrs |
|---|---|
| 12 | 8 |
| 13 | 10 |
| 14 | 10 |
| 14 | 11 |
| 15 | 12 |
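A minimal sketch of what "summarizing a variable" could look like in Python, using only the five wages in the table above. The variable names are illustrative and not part of the course materials.

```python
from collections import Counter
from statistics import mean, median

# Wage column from the example table above.
wages = [12, 13, 14, 14, 15]

# A few standard summaries of a single numerical variable.
summary = {
    "n": len(wages),
    "mean": mean(wages),
    "median": median(wages),
    "min": min(wages),
    "max": max(wages),
}

# Frequency table: how often each wage value appears.
frequency = Counter(wages)

for name, value in summary.items():
    print(f"{name}: {value}")
for value, count in sorted(frequency.items()):
    print(f"wage {value}: {count} observation(s)")
```

The same numbers could be produced in Excel; the point is only that a table of summaries and a frequency count are the raw material for the figures in Part 1.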

Focus: Understanding relationships between variables (e.g., scatterplot).
Example: Exploring a relationship - education and wages.
| | Wage | EduYrs |
|---|---|---|
| 0 | 14 | 10 |
| 1 | 15 | 12 |
| 2 | 16 | 12 |
| 3 | 18 | 13 |
| 4 | 18 | 14 |
| 5 | 20 | 14 |
| 6 | 22 | 15 |
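One way to put a number on the relationship in the table above is a correlation and a least-squares slope. This is a sketch that assumes the first column is a row index and the next two are Wage and EduYrs; the helper name `average` is illustrative.

```python
from math import sqrt

# Columns from the example table above (row index dropped).
edu_years = [10, 12, 12, 13, 14, 14, 15]
wage = [14, 15, 16, 18, 18, 20, 22]


def average(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


x_bar = average(edu_years)
y_bar = average(wage)

# Sums of squared deviations and cross-deviations.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(edu_years, wage))
s_xx = sum((x - x_bar) ** 2 for x in edu_years)
s_yy = sum((y - y_bar) ** 2 for y in wage)

correlation = s_xy / sqrt(s_xx * s_yy)  # strength of the linear relationship
slope = s_xy / s_xx                     # change in wage per extra year of education

print(f"correlation: {correlation:.3f}")
print(f"slope: {slope:.3f}")
```

A positive slope here only describes these seven rows. Whether it reflects a real pattern is the question Parts 3 to 5 take up.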

Focus: Sampling variation, Central Limit Theorem, and basic testing.
Example: Is the difference from $25 a real pattern or just noise?
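A sketch of the "real pattern or just noise" question as a one-sample t-statistic. The sample below is made up for illustration (the slides do not give the data); only the $25 benchmark comes from the example.

```python
from math import sqrt
from statistics import mean, stdev

benchmark = 25  # the reference value from the example

# HYPOTHETICAL sample of hourly wages, invented for this illustration.
sample = [27, 24, 30, 26, 29, 25, 28, 31]

n = len(sample)
sample_mean = mean(sample)

# Standard error: how much the sample mean would vary across repeated samples.
standard_error = stdev(sample) / sqrt(n)

# t-statistic: distance from the benchmark, measured in standard errors.
t_stat = (sample_mean - benchmark) / standard_error

print(f"sample mean: {sample_mean:.2f}")
print(f"standard error: {standard_error:.3f}")
print(f"t-statistic: {t_stat:.3f}")
```

The larger the t-statistic in absolute value, the harder it is to attribute the gap from $25 to sampling variation alone.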

Focus: Regression and residual analysis.
Example: Is the positive slope a real pattern or just noise?

Focus: Fixed effects, control variables, interactions.
Example: Do different groups have different relationships?

Focus: Clear narratives, effective visualization, presentation skills.
Examples: Some student work from last semester!
Software: Excel & Python
Website: ECON_0150
Optional Textbooks:
Exercises (10%)
Homework (10%)
MiniExams (1 × 20% + 1 × 15% + 1 × 10% + 1 × 5% + 1 × 0% = 50%)
Final Project (30%)
Attendance (1% extra)
Email Policy:
AI Policy:
Academic Conduct: Adhere to the Academic Integrity Code.
Let's measure hometown using distance
> quite a few people come internationally :)
> let's zoom in a bit to see more details about closer distances
Let's measure students' hometowns using distance
> many from Pittsburgh and the Philly area
Let's use birth year
> the most common birth year was 2005
Sorry if you’re not on the list or have multiple :)
> most are Econ and many not on my limited list
It’s a prereq for the class
> most generally liked it; some did not
I would suspect a positive relationship
I would expect it is
Again I would expect it is
Why does economic data analysis matter?
A case study in minimum wage policy
A state raises its minimum wage.


A natural experiment.
| Date | NJ | PA |
|---|---|---|
| March 1992 | $4.25 | $4.25 |
| April 1, 1992 | $5.05 | $4.25 |

Supply and demand.
Equilibrium.
A binding price ceiling creates a shortage.
A binding price floor creates a surplus.
The labor market.
A minimum wage is a price floor in the labor market.
Two economists decided to find out.


Surveying fast-food restaurants before and after.
| | Wave 1 | Wave 2 |
|---|---|---|
| Timing | Feb-Mar 1992 | Nov-Dec 1992 |
| Policy | Before increase | After increase |
| Stores | 410 | 399 |
Burger King, KFC, Wendy’s, Roy Rogers in NJ and eastern PA
Before the minimum wage increase.
| | New Jersey | Pennsylvania |
|---|---|---|
| Stores surveyed | 331 | 79 |
| FTE employment | 20.4 | 23.3 |
| Starting wage | $4.61 | $4.63 |
| Wage = $4.25 | 30.5% | 32.9% |
Wages shifted to the new minimum.
Difference-in-differences.
Employment did not fall in New Jersey.
| FTE Employment | Pennsylvania | New Jersey | Difference |
|---|---|---|---|
| Before | 23.33 | 20.44 | -2.89 |
| After | 21.17 | 21.03 | -0.14 |
| Change | -2.16 | +0.59 | +2.76 |
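The difference-in-differences arithmetic in the table can be reproduced in a few lines. This sketch uses only the four rounded means shown above; the dictionary layout and function name are illustrative.

```python
# Mean FTE employment per store, copied from the table above.
fte = {
    "PA": {"before": 23.33, "after": 21.17},
    "NJ": {"before": 20.44, "after": 21.03},
}


def change(state):
    """Within-state change: after minus before."""
    return fte[state]["after"] - fte[state]["before"]


change_pa = change("PA")  # control state: no minimum wage increase
change_nj = change("NJ")  # treated state: minimum wage rose to $5.05

# Difference-in-differences: NJ's change relative to PA's change.
did = change_nj - change_pa

print(f"PA change: {change_pa:+.2f}")
print(f"NJ change: {change_nj:+.2f}")
print(f"Difference-in-differences: {did:+.2f}")
```

Working from the rounded means gives a value one hundredth below the +2.76 in the table, presumably because the table's figure was computed before rounding. Either way, the sign is the point: relative to Pennsylvania, employment in New Jersey rose rather than fell.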
The data told a different story.
Theory prediction
Data showed
“Contrary to the central prediction of the textbook model… we find no evidence that the rise in New Jersey’s minimum wage reduced employment.”
— Card and Krueger (1994)
Let’s learn how to do this kind of analysis.
Data is a sample drawn from a process we want to understand.
\[x \sim F\]
Two skills:
Notation helps keep things organized.
We write \(x_{it}\) where \(x\) is the value we measure, \(i\) indexes the entity, and \(t\) indexes time.
This distinguishes substantive variables (what we measure) from index variables (how we organize).
Data structure; variable type; number of variables.
Data Structure (\(i\),\(t\))
Variable Type (\(x\))
Number of Variables (\(n\))
Which indices (\(i\), \(t\)) are active?
| Structure | What varies | Example |
|---|---|---|
| Cross-section | Entity (i) | Household incomes in 2024 |
| Time series | Time (t) | Average US income from 1950–2024 |
| Panel | Both (i and t) | Income across households, 1950–2024 |
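The rule in the table can be written as a small check: count how many distinct entities and time periods appear in the data. This is a sketch. The function name, column names, and example rows are all hypothetical.

```python
def classify_structure(rows, entity_col=None, time_col=None):
    """Classify a dataset by which indices (i, t) actually vary.

    rows: list of dicts, one per observation.
    entity_col / time_col: name of the index column, or None if absent.
    """
    n_entities = len({row[entity_col] for row in rows}) if entity_col else 1
    n_periods = len({row[time_col] for row in rows}) if time_col else 1

    if n_entities > 1 and n_periods > 1:
        return "Panel"
    if n_periods > 1:
        return "Time series"
    return "Cross-section"


# HYPOTHETICAL rows, invented to show each case.
cross_section = [{"hh": "A", "income": 50}, {"hh": "B", "income": 62}]
time_series = [{"year": 2023, "income": 58}, {"year": 2024, "income": 60}]
panel = [
    {"hh": "A", "year": 2023, "income": 50},
    {"hh": "A", "year": 2024, "income": 53},
    {"hh": "B", "year": 2023, "income": 62},
]

print(classify_structure(cross_section, entity_col="hh"))
print(classify_structure(time_series, time_col="year"))
print(classify_structure(panel, entity_col="hh", time_col="year"))
```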
What are the values (\(x\))?
| Type | Subtype | Definition | Example |
|---|---|---|---|
| Categorical | Binary | Two categories | Employed (YES/NO) |
| Categorical | Nominal | Unordered | Blood type (A, B, AB, O) |
| Categorical | Ordinal | Ordered | Education (HS, BA, MA, PhD) |
| Numerical | Discrete | Countable | Number of children |
| Numerical | Continuous | Real numbers | Household income |
Three steps, every time.
Building complexity along two axes.
| Part | Focus | Key question |
|---|---|---|
| 0 | Framework | What tools do we need? |
| 1 | Variables | What does this variable look like? |
| 2 | Relationships | How do these variables relate? |
| 3 | Univariate GLM | What can we infer about the population? |
| 4 | Bivariate GLM | How does y change with x? |
| 5 | Multivariate GLM | How does y change with x, controlling for z? |
A consistent rhythm.
| When | What | Purpose |
|---|---|---|
| Before class | Concept video | Learn core ideas at your pace |
| Start of class | Quiz | Confirm your understanding |
| During class | Exercise | Guided practice with support |
| After class | Homework | Independent practice |
Exercises are homework prep.
Let’s begin.
Q1. Identify the data structure, variable type, indices, and number of variables for dataset1.csv.
| Year | Real_GDP |
|---|---|
| 1970 | 5.316 |
| 1971 | 5.491 |
| 1972 | 5.780 |
| 1973 | 6.106 |
| … | … |
Index: \(t\) (Year)
Structure: Time series
Variable: Continuous
N Variables: Univariate
Q1. Visualize dataset1.csv.
Q2. Identify the data structure, variable type, indices, and number of variables for dataset2.csv.
| Household ID | Employment Status |
|---|---|
| D001 | Unemployed |
| D002 | Employed |
| D003 | Employed |
| D004 | Employed |
| … | … |
Index: \(i\) (Household)
Structure: Cross-sectional
Variable: Binary
N Variables: Univariate
Q2. Visualize dataset2.csv.
Q3. Identify the data structure, variable type, indices, and number of variables for dataset3.csv.
| Household ID | Year | Income | Savings |
|---|---|---|---|
| H001 | 2010 | 34,610 | 6,157 |
| H001 | 2011 | 45,560 | 2,506 |
| H001 | 2012 | 83,698 | 8,789 |
| H002 | 2010 | 52,341 | 4,123 |
| … | … | … | … |
Index: \(i\) (Household), \(t\) (Year)
Structure: Panel
Variable: Continuous
N Variables: Bivariate
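To make the \(x_{it}\) notation concrete, the four rows shown above can be stored with the pair (household, year) as the key. This is a sketch using only the visible rows; the variable names are illustrative.

```python
# The four visible rows of dataset3.csv: (household, year, income, savings).
rows = [
    ("H001", 2010, 34610, 6157),
    ("H001", 2011, 45560, 2506),
    ("H001", 2012, 83698, 8789),
    ("H002", 2010, 52341, 4123),
]

# Two substantive variables, each indexed by (i, t).
income = {(i, t): inc for i, t, inc, _ in rows}
savings = {(i, t): sav for i, t, _, sav in rows}

# The index variables: which entities and which periods appear.
households = sorted({i for i, _ in income})
years = sorted({t for _, t in income})

print("households (i):", households)
print("years (t):", years)
print("income of H001 in 2011:", income[("H001", 2011)])
```

Both indices vary, so the structure is a panel. There are two measured variables (income and savings), so the data is bivariate.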
Q3. Visualize dataset3.csv.
Q4. Identify the data structure, variable type, indices, and number of variables for dataset4.csv.
| ID | Economic Optimism |
|---|---|
| B001 | Somewhat Pessimistic |
| B002 | Very Pessimistic |
| B003 | Somewhat Optimistic |
| B004 | Very Optimistic |
| … | … |
Index: \(i\) (Person)
Structure: Cross-sectional
Variable: Ordinal
N Variables: Univariate
Q4. Visualize dataset4.csv.
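For an ordinal variable, the counts behind a bar chart should keep the categories in their natural order rather than alphabetical order. This sketch uses only the four visible responses and assumes the scale has exactly the four levels shown; the full dataset may include others.

```python
from collections import Counter

# ASSUMED ordering of the scale, from most pessimistic to most optimistic.
LEVELS = [
    "Very Pessimistic",
    "Somewhat Pessimistic",
    "Somewhat Optimistic",
    "Very Optimistic",
]

# The four visible rows of dataset4.csv.
responses = [
    "Somewhat Pessimistic",
    "Very Pessimistic",
    "Somewhat Optimistic",
    "Very Optimistic",
]

counts = Counter(responses)

# Report counts in scale order, including levels with zero responses.
ordered_counts = [(level, counts[level]) for level in LEVELS]

for level, count in ordered_counts:
    print(f"{level}: {count}")
```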
Q5. Identify the data structure, variable type, indices, and number of variables for dataset5.csv.
| ID | Sector |
|---|---|
| A001 | Services |
| A002 | Agriculture |
| A003 | Unemployed |
| A004 | Manufacturing |
| … | … |
Index: \(i\) (Person)
Structure: Cross-sectional
Variable: Nominal
N Variables: Univariate
Q5. Visualize dataset5.csv.