Part 1 | Univariate EDA
// Start by summarizing your variables //
Making decisions with data starts with understanding what your variables actually contain. Before exploring relationships, you need to know whether you're working with categories, counts, or measurements, and what patterns, outliers, and limitations exist in your data. Data is easiest to understand when visualized appropriately, and the appropriate visualization depends on the data type. Part 1 introduces approaches for summarizing data with figures and tables using Python and spreadsheets, skills that will save you countless hours.
Part 1.0 ~ Data Types
Classify variables as categorical (binary, nominal, ordinal) or numerical (discrete, continuous). Choose appropriate summary methods and visualizations based on variable type.
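As a rough illustration of these five variable types, the sketch below builds a tiny synthetic table and labels each column. It assumes pandas is available. The column names, the data, and the `variable_type` helper are all hypothetical, and the helper is only a heuristic: real data usually needs human judgment to classify.

```python
import pandas as pd

# Hypothetical synthetic data; column names are illustrative only.
df = pd.DataFrame({
    "is_member": [True, False, True, True],         # binary categorical
    "drink": ["latte", "mocha", "latte", "tea"],     # nominal categorical
    "size": ["small", "large", "medium", "small"],   # ordinal categorical
    "items": [1, 3, 2, 1],                           # discrete numerical
    "spend": [4.25, 12.10, 7.80, 3.95],              # continuous numerical
})

# Tell pandas the order of the ordinal levels.
df["size"] = pd.Categorical(
    df["size"], categories=["small", "medium", "large"], ordered=True
)

def variable_type(series):
    """Heuristic guess at a variable's type (hypothetical helper)."""
    if series.nunique() == 2:
        return "binary"
    if isinstance(series.dtype, pd.CategoricalDtype):
        return "ordinal" if series.dtype.ordered else "nominal"
    if pd.api.types.is_integer_dtype(series):
        return "discrete"
    if pd.api.types.is_float_dtype(series):
        return "continuous"
    return "nominal"  # fall back to nominal for text columns

types = {col: variable_type(df[col]) for col in df.columns}
print(types)
```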
Exercise 1.0 // Categorical & Numerical Variables
Synthetic data and code to summarize categorical and numerical variables
Part 1.1 ~ Categorical Variables
Use bar charts to show nominal and ordinal categorical variables. Reserve pie charts for binary categorical variables.
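A bar chart of a categorical variable is a picture of its frequency table. The sketch below, assuming pandas and using made-up city names, computes the counts and shares that would become the bar heights.

```python
import pandas as pd

# Hypothetical shop locations, one entry per shop.
cities = pd.Series(
    ["Austin", "Denver", "Austin", "Boise", "Austin", "Denver"], name="city"
)

# Frequency table: one row per category, sorted from most to least common.
counts = cities.value_counts()

# The same table expressed as proportions of the total.
shares = cities.value_counts(normalize=True)

print(counts)
print(shares)
```

If matplotlib is installed, `counts.plot.bar()` would draw the chart from this table.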
Exercise 1.1 // Coffee Shop Locations
The locations of a small (fictional) coffee shop chain
Part 1.2 ~ Numerical Variables
Use histograms to show distributions. Use box plots to compare quartiles and identify outliers. Calculate means, medians, and standard deviations for numerical summaries.
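The numerical summaries named above can be sketched as follows, assuming pandas and a made-up list of customer ages. The outlier check uses the 1.5 × IQR rule, a common box plot convention that may differ from the one used in the exercise.

```python
import pandas as pd

# Hypothetical customer ages; the last value is unusually large.
ages = pd.Series([22, 25, 27, 30, 31, 35, 41, 78])

mean = ages.mean()
median = ages.median()
std = ages.std()  # sample standard deviation (divides by n - 1)

# Quartiles and interquartile range, as drawn in a box plot.
q1 = ages.quantile(0.25)
q3 = ages.quantile(0.75)
iqr = q3 - q1

# Flag values beyond 1.5 * IQR from the quartiles (a common convention).
outliers = ages[(ages < q1 - 1.5 * iqr) | (ages > q3 + 1.5 * iqr)]

print(mean, median, std)
print(outliers.tolist())
```

Here the single large age pulls the mean above the median, which is one reason to report both. With matplotlib installed, `ages.plot.hist()` and `ages.plot.box()` would draw the corresponding figures.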

Exercise 1.2 // Starbucks Customers
Data on Starbucks customers from Kaggle
Homework 1.2
Due on Friday September 5 at 5PM on Gradescope
Part 1.3 ~ Data Structures
Match the visualization choice to the data structure.
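One way to tell the structures apart is to count how many units and how many time periods a table contains. The sketch below assumes pandas; the `data_structure` helper and the `unit` and `time` column names are hypothetical.

```python
import pandas as pd

def data_structure(df, unit="unit", time="time"):
    """Label a table by how many units and time periods it holds."""
    n_units = df[unit].nunique()
    n_times = df[time].nunique()
    if n_units > 1 and n_times > 1:
        return "panel"            # many units, many periods
    if n_times > 1:
        return "time-series"      # one unit, many periods
    return "cross-sectional"      # many units, one period

# Hypothetical examples of each structure.
cross = pd.DataFrame({"unit": ["A", "B", "C"], "time": [2024, 2024, 2024]})
series = pd.DataFrame({"unit": ["A", "A", "A"], "time": [2022, 2023, 2024]})
panel = pd.DataFrame({"unit": ["A", "A", "B", "B"],
                      "time": [2023, 2024, 2023, 2024]})

labels = [data_structure(d) for d in (cross, series, panel)]
print(labels)
```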
Exercise 1.3 // Data Structures
Three univariate data structures: cross-sectional, time-series, and panel
Homework 1.3
Due on Friday September 5 at 5PM on Gradescope
Part 1.4 ~ Numerical Variables by Category
Create grouped box plots to compare distributions across categories.
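A grouped box plot draws one box per category from that category's quartiles. The sketch below, assuming pandas and made-up transaction amounts, computes those per-group statistics as a table.

```python
import pandas as pd

# Hypothetical transaction amounts at two shop locations.
df = pd.DataFrame({
    "location": ["Downtown"] * 5 + ["Campus"] * 5,
    "amount": [3, 4, 5, 6, 20, 2, 3, 3, 4, 5],
})

# One row per location: count, mean, std, min, quartiles, and max.
summary = df.groupby("location")["amount"].describe()

print(summary[["min", "25%", "50%", "75%", "max"]])
```

With matplotlib installed, `df.boxplot(column="amount", by="location")` would draw the boxes side by side.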

Exercise 1.4 // Transactions by Shop Location
Coffee shop transaction data by location from Kaggle
Homework 1.4
Due on Friday September 5 at 5PM on Gradescope
Part 1.5 ~ Filtering Data
Use logical operators (AND, OR, NOT) and comparison operators (>, <, ==) to create subsets. Filter rows based on conditions to focus analysis on specific groups.
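In pandas, row filters are written with `&` (AND), `|` (OR), and `~` (NOT) rather than the Python keywords, and each comparison needs its own parentheses. The sketch below uses a made-up table of store hours; the column names are hypothetical.

```python
import pandas as pd

# Hypothetical store data.
stores = pd.DataFrame({
    "city": ["Seattle", "Seattle", "Portland", "Boise"],
    "open_hour": [5, 7, 6, 8],
    "drive_thru": [True, False, True, False],
})

# AND: Seattle stores that open before 6.
early_seattle = stores[(stores["city"] == "Seattle") & (stores["open_hour"] < 6)]

# OR: stores that are in Seattle or have a drive-thru.
seattle_or_drive_thru = stores[(stores["city"] == "Seattle") | stores["drive_thru"]]

# NOT: every store outside Seattle.
not_seattle = stores[~(stores["city"] == "Seattle")]

print(early_seattle)
print(seattle_or_drive_thru)
print(not_seattle)
```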

Exercise 1.5 // Starbucks Hours
Filtering Starbucks location hours data
Homework 1.5
Due on Friday September 12 at 5PM on Gradescope
Part 1.6 ~ Transforming Data
Apply mathematical transformations (log, square root) to make variables more informative.
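As a minimal sketch of both transformations, assuming pandas and NumPy, the example below adds transformed columns to a made-up table of store counts that spans several orders of magnitude.

```python
import numpy as np
import pandas as pd

# Hypothetical store counts with a very skewed range.
stores = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "store_count": [1, 10, 100, 10000],
})

# Log transform: each step of 1 means a tenfold increase.
stores["log10_count"] = np.log10(stores["store_count"])

# Square root: a milder compression of large values.
stores["sqrt_count"] = np.sqrt(stores["store_count"])

print(stores)
```

The log is only defined for positive values. For data containing zeros, `np.log1p`, which computes log(1 + x), is one common workaround.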
Exercise 1.6 // Starbucks Hours Globally
Transforming global Starbucks hours data
Homework 1.6
Due on Friday September 19 at 5PM on Gradescope
Part 1.7 ~ Grouping Data
Use group-by operations with aggregation functions (mean, sum, count) to calculate statistics by category. Create summary tables that show patterns across groups.
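A group-by summary table can be sketched as follows, assuming pandas. The promotion names and sales figures are made up for illustration.

```python
import pandas as pd

# Hypothetical sales records tagged by promotion type.
df = pd.DataFrame({
    "promo": ["BOGO", "BOGO", "Discount", "Discount", "Discount", "NoPromo"],
    "sales": [10, 14, 8, 9, 13, 5],
})

# One row per promotion, one column per aggregation function.
summary = df.groupby("promo")["sales"].agg(["mean", "sum", "count"])

print(summary)
```

Reporting the count next to the mean shows how many rows stand behind each average, which matters when a group is small.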
Exercise 1.7 // Starbucks Promotions
Analyzing promotional data by grouping
Homework 1.7
Due on Friday September 19 at 5PM on Gradescope
MiniExam
MiniExam 1 covers everything in Part 1. If you understand the concepts and do the work in the exercises and homework, you will be in good shape on the MiniExam.
MiniExams focus on practical application of the concepts we've developed. Practice with the Homework and Exercises to prepare effectively.
MiniExam 1 Demo
The MiniExam 1 Demo covers all material from Parts 1.0 through 1.7 and tests your understanding of univariate EDA concepts.