Part 1 | Univariate EDA
> summarize single variables
Before exploring relationships, we start by understanding what each variable contains on its own. Part 1 introduces the tools for visualizing and summarizing single variables. You'll learn to diagram your data and choose the appropriate summarization tool based on the data structure and variable type.
Part 1.1 | Cross-Sectional (Categorical) Data
Use bar charts to show frequencies of categorical variables. Use pie charts sparingly and generally only for binary variables.
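As a rough illustration of what a bar chart of a categorical variable displays, the sketch below builds the underlying frequency table with pandas. The `shops` table and its `city` column are made-up examples, not the course dataset.

```python
import pandas as pd

# Hypothetical data: one row per shop, with the city it is located in.
shops = pd.DataFrame(
    {"city": ["Austin", "Dallas", "Austin", "Houston", "Austin", "Dallas"]}
)

# Frequency table: number of rows in each category, largest first.
counts = shops["city"].value_counts()
print(counts)

# Each bar in a bar chart has one of these counts as its height.
# With matplotlib installed, counts.plot.bar() would draw the chart.
```

Sorting the categories by frequency, as `value_counts()` does by default, usually makes the chart easier to read than alphabetical order.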
Livestream 1.1
Class recording on January 20, 2026
Concept 1.1 // Categorical Variables
Summarizing categorical variables with bar charts and pie charts
Exercise 1.1 // Coffee Shop Locations
Visualize the locations of a small coffee shop chain
Homework 1.1
Due Friday, January 23 at 5PM on Gradescope
Part 1.2 | Cross-Sectional (Numerical) Data
Use histograms to show distributions. Use boxplots with stripplots to show quartiles and individual values.
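The numbers behind a histogram and a boxplot can be computed directly, which may help when reading either chart. This is a minimal sketch with invented customer ages; the bin edges are an arbitrary choice for illustration.

```python
import pandas as pd

# Hypothetical customer ages.
ages = pd.Series([22, 25, 31, 34, 38, 41, 47, 52, 60])

# Quartiles: the three lines that make up the box in a boxplot.
q1, median, q3 = ages.quantile([0.25, 0.5, 0.75])
iqr = q3 - q1  # interquartile range, the height of the box

# Histogram counts: how many values fall in each bin (right edge included).
bin_counts = pd.cut(ages, bins=[20, 30, 40, 50, 60]).value_counts().sort_index()

print(q1, median, q3, iqr)
print(bin_counts)
```

A stripplot drawn over the boxplot shows the individual values in `ages`, which the box alone hides.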
Livestream 1.2
Class recording on January 22, 2026
Concept 1.2 // Numerical Variables
Summarizing numerical variables with histograms and boxplots
Exercise 1.2 // Customer Profiles
Create histograms and boxplots of customer data
Homework 1.2
Due Friday, January 30 (not Jan 23!) at 5PM on Gradescope
Part 1.3 | Timeseries Data
Use line plots to show how a variable changes over time. Identify trends and seasonality. Apply transformations like log, inflation adjustment, per capita, and differencing.
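The four transformations named above can each be written in one line of pandas. The sketch below assumes a yearly table with hypothetical columns `price`, `cpi`, `sales`, and `population`; all values are invented.

```python
import numpy as np
import pandas as pd

# Hypothetical yearly data, indexed by year.
ts = pd.DataFrame(
    {
        "price": [2.00, 2.20, 2.42],       # nominal price
        "cpi": [100.0, 105.0, 110.0],      # price index, base year = 100
        "sales": [1000.0, 1100.0, 1210.0],
        "population": [500.0, 500.0, 550.0],
    },
    index=[2021, 2022, 2023],
)

# Log transform: equal vertical distances mean equal percentage changes.
ts["log_price"] = np.log(ts["price"])

# Inflation adjustment: express prices in base-year units.
ts["real_price"] = ts["price"] / ts["cpi"] * 100

# Per capita: divide a total by the population it is spread over.
ts["sales_per_capita"] = ts["sales"] / ts["population"]

# Differencing: change from the previous period (first row has no predecessor).
ts["price_change"] = ts["price"].diff()

print(ts)
```

Calling `ts["price"].plot()` (with matplotlib installed) would give the basic line plot; plotting a transformed column instead often makes the trend easier to interpret.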
Livestream 1.3
Class recording from Fall 2025
Exercise 1.3 // Coffee Prices Over Time
Create line plots of coffee price data
Homework 1.3
Due Friday, January 30 at 5PM on Gradescope
Part 1.4 | Panel Data (Long Format)
Work with long format panel data where each row is an observation. Use groupby to create summary tables. Visualize with multi-line plots and facets.
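A small sketch of the groupby step on long format data follows. The `tx` table and its column names (`shop`, `month`, `amount`) are hypothetical stand-ins for transaction data.

```python
import pandas as pd

# Hypothetical long format data: each row is one transaction.
tx = pd.DataFrame(
    {
        "shop": ["A", "A", "B", "A", "B", "B"],
        "month": [1, 1, 1, 2, 2, 2],
        "amount": [4.0, 6.0, 5.0, 7.0, 3.0, 2.0],
    }
)

# Summary table: total sales for each shop in each month.
summary = tx.groupby(["shop", "month"], as_index=False)["amount"].sum()
print(summary)
```

The resulting table has one row per shop and month, which is the shape a multi-line plot (one line per shop) or a faceted plot (one panel per shop) expects.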
Livestream 1.4a
Class recording on January 29, 2026
Livestream 1.4b
Class recording on February 3, 2026
Concept 1.4 // Panel Data (Long Format)
Groupby, multi-line plots, and faceting
Exercise 1.4 // Coffee Shop Transactions
Use groupby and multi-line plots to analyze transaction data
Part 1.5 | Panel Data (Wide Format)
Work with wide format panel data where each time period is a column. Use multi-boxplots to compare distributions across years. Use scatterplots with 45° lines to track individual changes. Reshape between formats with melt() and pivot().
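Reshaping between the two formats might look like the sketch below. The country names are real but the consumption figures are invented for illustration, and the column names `year` and `kg_per_capita` are assumptions.

```python
import pandas as pd

# Hypothetical wide format data: one row per country, one column per year.
wide = pd.DataFrame(
    {
        "country": ["Finland", "Brazil"],
        "2020": [12.0, 6.0],
        "2021": [11.5, 6.2],
    }
)

# Wide to long: the year columns become rows.
long = wide.melt(id_vars="country", var_name="year", value_name="kg_per_capita")

# Long back to wide: the years become columns again.
back = long.pivot(index="country", columns="year", values="kg_per_capita")

# On a scatterplot of 2021 against 2020, points above the 45° line
# are the countries whose value increased.
increased = wide["2021"] > wide["2020"]

print(long)
print(back)
```

Note that `pivot()` returns the identifier as the index rather than as a column; `back.reset_index()` would restore it as a regular column.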
Livestream 1.5
Class recording on February 5, 2026
Concept 1.5 // Panel Data (Wide Format)
Multi-boxplots, scatterplots with 45° lines, and reshaping
Exercise 1.5 // Coffee Consumption Per Capita
Compare coffee consumption across years with boxplots and scatterplots
MiniExam
MiniExam 1 covers everything in Part 1. MiniExams focus on practical application of the concepts we've developed. If you understand the concepts and feel comfortable with the exercises and homework, you'll be well prepared.
MiniExam 1 Demo (video from Fall 2025)
Practice exam covering Part 1 material. The video was recorded in Fall 2025, so its questions are similar to those in the current Demo below but cover slightly different material. The current questions and solutions are available through the download links below.