The economist’s data analysis skillset.
Q. Is there a relationship between GDP and coffee production?
> maybe, but it’s hard to see
> let's use a two-dimensional graph
Q. Is there a relationship between GDP and coffee production?
> two dimensions is nice, but the points have no meaningful relationships
Q. Is there a relationship between GDP and coffee production?
> a scatterplot effectively visualizes cross-sectional data with two dimensions
Which countries have a GDP above $2 trillion?
> look at the horizontal axis and select all that are greater than 2
Which countries have a production above ½ billion kg?
> and we can use either axis
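Reading a threshold off either axis is the same as filtering on one column. A minimal sketch, assuming made-up rows (the labels and numbers below are invented for illustration and are not taken from Beans_GDP_2019.csv):

```python
# Hypothetical rows: (country label, GDP in trillions of USD,
# coffee production in billions of kg). Values are invented.
rows = [
    ("A", 0.3, 0.70),
    ("B", 2.5, 0.05),
    ("C", 1.8, 3.00),
    ("D", 21.0, 0.00),
]

# Horizontal-axis question: GDP above $2 trillion.
high_gdp = [name for name, gdp, _ in rows if gdp > 2]

# Vertical-axis question: production above 0.5 billion kg.
high_output = [name for name, _, kg in rows if kg > 0.5]

print(high_gdp)     # ['B', 'D']
print(high_output)  # ['A', 'C']
```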
Which countries produce less coffee per dollar than Brazil?
> we can also compare BETWEEN data points
Which countries produce less coffee per dollar than Brazil?
> separating lines can help make comparisons between ratios
Which countries produce more coffee per dollar than Brazil?
> separating lines can help make comparisons between ratios
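One way to read the separating line: a line from the origin through the reference country has slope equal to that country's production-to-GDP ratio, so points above it have a higher ratio and points below it a lower one. A rough sketch with invented values ("Ref" stands in for the reference country; none of these numbers come from the slides' data):

```python
# Hypothetical data: label -> (GDP in trillions of USD, production in billions of kg).
rows = {
    "Ref": (1.8, 3.0),
    "A": (0.3, 0.7),
    "B": (2.5, 0.05),
    "C": (0.1, 0.1),
}

ref_gdp, ref_kg = rows["Ref"]

more_per_dollar = []  # points above the line through the origin and Ref
less_per_dollar = []  # points below that line

for name, (gdp, kg) in rows.items():
    if name == "Ref":
        continue
    # Compare kg/gdp with ref_kg/ref_gdp by cross-multiplying,
    # which avoids dividing by a GDP of zero.
    if kg * ref_gdp > ref_kg * gdp:
        more_per_dollar.append(name)
    elif kg * ref_gdp < ref_kg * gdp:
        less_per_dollar.append(name)

print(more_per_dollar)  # ['A']
print(less_per_dollar)  # ['B', 'C']
```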
Do the GDPs of the upper or lower pair differ by a larger amount?
> use distances along the horizontal axis to measure differences
Which is larger: the ratio of GDPs of the upper or lower pair?
> this question is difficult to answer with this scale
Which is larger: the ratio of GDPs of the upper or lower pair?
> a log scale makes RATIOS easier to visualize: each tick is 10x larger than the one before
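The reason a log axis shows ratios is that log10(a) - log10(b) = log10(a / b), so equal ratios sit at equal distances. A small sketch with two invented pairs of GDP values (hypothetical numbers, not read from the chart):

```python
import math

# Hypothetical GDP pairs, in the same units.
upper_pair = (2.0, 20.0)     # difference 18, ratio 10
lower_pair = (100.0, 150.0)  # difference 50, ratio 1.5

def linear_gap(pair):
    """Distance between the two points on a linear axis."""
    return abs(pair[1] - pair[0])

def log_gap(pair):
    """Distance between the two points on a base-10 log axis."""
    return abs(math.log10(pair[1]) - math.log10(pair[0]))

# On a linear axis the lower pair looks further apart...
print(linear_gap(upper_pair) < linear_gap(lower_pair))  # True

# ...but on a log axis the upper pair does, because its ratio is larger.
print(log_gap(upper_pair) > log_gap(lower_pair))        # True
```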
Which country produces the second highest output of coffee?
> a log scale also makes it easier to see SCALING
Which country produces the second highest output of coffee?
> scaling the vertical axis in logs clarifies both small and large variation
How does GDP relate to coffee production in the Americas?
> let's use a filter with this data
How does GDP relate to coffee production in the Americas?
> looks positive, but we’ll formally test this in Part 4
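A filter keeps only the rows that match a condition before plotting or comparing. A minimal sketch, assuming a hypothetical region field and invented values (the real column names in Beans_GDP_2019.csv are not shown in the slides):

```python
# Hypothetical rows: (label, region, GDP in trillions of USD,
# production in billions of kg). Values are invented.
rows = [
    ("A", "Americas", 1.8, 3.0),
    ("B", "Americas", 0.3, 0.7),
    ("C", "Asia", 0.25, 1.6),
    ("D", "Africa", 0.1, 0.4),
    ("E", "Americas", 21.0, 0.0),
]

# Keep only the Americas.
americas = [row for row in rows if row[1] == "Americas"]
americas_names = [row[0] for row in americas]

# Within the filtered rows, the top producer is the highest point
# on the vertical axis.
top_producer = max(americas, key=lambda row: row[3])[0]

print(americas_names)  # ['A', 'B', 'E']
print(top_producer)    # A
```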
Which country in the Americas produces the most coffee?
Which country’s GDP is closest to Brazil’s GDP?
> we can use the horizontal axis
Which country’s GDP is closest to Brazil’s GDP?
> we can use a vertical line here to find the closest on the horizontal axis
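Finding the point nearest a vertical line amounts to minimizing the absolute difference on the horizontal axis. A sketch with invented GDP values ("Ref" is a stand-in for the reference country; the numbers are hypothetical):

```python
# Hypothetical GDPs in trillions of USD.
gdp = {"Ref": 1.8, "A": 0.3, "B": 2.5, "C": 1.6, "D": 21.0}

# Horizontal distance of every other country from the vertical line at Ref.
distance = {
    name: abs(value - gdp["Ref"])
    for name, value in gdp.items()
    if name != "Ref"
}

# The closest country is the one with the smallest distance.
closest = min(distance, key=distance.get)

print(closest)  # C
```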
Visualizing GDP and Coffee Production Relationships
We’re going to use a scatterplot to visually examine the relationship between coffee production and GDP.
Beans_GDP_2019.csv
How do the two commodity prices relate to each other?
> difficult to tell because of the axis scale
How do the two commodity prices relate to each other?
In which years did oil and coffee prices move in opposite directions?
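On a multi-line plot, two prices move in opposite directions in a year when one line slopes up and the other slopes down. A rough sketch of that check with invented annual prices (hypothetical values, not from Coffee_Oil.csv):

```python
# Hypothetical annual prices; the units do not matter for the sign check.
years = [2015, 2016, 2017, 2018, 2019]
oil = [50.0, 42.0, 53.0, 68.0, 61.0]
coffee = [3.2, 3.5, 3.8, 3.4, 3.1]

opposite_years = []
for i in range(1, len(years)):
    oil_change = oil[i] - oil[i - 1]
    coffee_change = coffee[i] - coffee[i - 1]
    # A negative product means one price rose while the other fell.
    if oil_change * coffee_change < 0:
        opposite_years.append(years[i])

print(opposite_years)  # [2016, 2018]
```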
But are the two prices positively or negatively related to each other?
> this is difficult to see with just a Multi-Lineplot…
But are the two prices positively or negatively related to each other?
> a Scatterplot can show the relationship between two variables through time
Does the price of oil determine the price of coffee?
> a Scatterplot can only show associations, not causation :(
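The direction of the association seen in a scatterplot can be summarized by the sign of the Pearson correlation coefficient. A minimal sketch, using invented prices (hypothetical values, not from Coffee_Oil.csv); like the plot, the coefficient describes association only and says nothing about what causes what:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical annual prices.
oil = [50.0, 42.0, 53.0, 68.0, 61.0]
coffee = [3.2, 3.5, 3.8, 3.4, 3.1]

r = pearson(oil, coffee)
direction = "positive" if r > 0 else "negative" if r < 0 else "none"
print(direction)  # negative
```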
Visualizing Coffee Prices and Oil Prices
We’re going to use a scatterplot to visually examine the relationship between coffee prices and oil prices.
Coffee_Oil.csv