1.4 Spotting BS in Data

Time: ~25 minutes

What You'll Learn

The most common ways data gets weaponized (intentionally or not)
How cherry-picking works and how to spot it
Survivorship bias and why it's everywhere
Correlation vs causation -- one of the most misunderstood concepts in data
Simpson's Paradox and other surprises hiding in aggregated data

Key Concepts

Most bad data isn't fake -- it's real data presented in misleading ways. Sometimes this is intentional manipulation. More often, the people presenting it don't realize they're doing it.

This lesson is the capstone of your critical thinking toolkit:

Cherry-picking -- Selecting only the data points that support your argument. Claude will show you how the same dataset can "prove" opposite conclusions.
Survivorship bias -- Drawing conclusions from successes while ignoring failures. The classic example: pointing to famous CEOs who dropped out of college as proof that dropping out works, while ignoring the millions of dropouts who never became CEOs.
Correlation vs causation -- Just because two things move together doesn't mean one causes the other. Ice cream sales and drowning deaths both rise in summer, but ice cream doesn't cause drowning.
Simpson's Paradox -- A trend that appears in each of several groups can reverse when the groups are combined. This one will change how you look at any aggregated data.
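
Simpson's Paradox is easier to believe once you see the arithmetic. The sketch below is a minimal illustration, not part of the lesson itself: the two options ("A" and "B"), the "easy" and "hard" groups, and every count are hypothetical numbers invented for this example. Option A has the higher success rate within each group, yet option B looks better once the groups are combined, because A was mostly tried on hard cases and B mostly on easy ones.

```python
# Hypothetical counts invented for illustration: (successes, attempts).
results = {
    "A": {"easy": (18, 20), "hard": (48, 80)},
    "B": {"easy": (64, 80), "hard": (10, 20)},
}

def group_rates(option):
    """Success rate within each group for one option."""
    return {group: s / n for group, (s, n) in results[option].items()}

def overall_rate(option):
    """Success rate after pooling all groups for one option."""
    successes = sum(s for s, _ in results[option].values())
    attempts = sum(n for _, n in results[option].values())
    return successes / attempts

for option in results:
    rates = group_rates(option)
    print(
        f"{option}: easy={rates['easy']:.2f} "
        f"hard={rates['hard']:.2f} "
        f"overall={overall_rate(option):.2f}"
    )
# A: easy=0.90 hard=0.60 overall=0.66
# B: easy=0.80 hard=0.50 overall=0.74
```

The question to ask of any aggregated number is how the groups behind it were mixed: here the overall figure mostly reflects which option got the harder cases.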

Claude will walk you through real-world examples and have you practice identifying these patterns in sample datasets.
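
As a taste of that kind of practice, here is a minimal sketch of cherry-picking. The `monthly_sales` figures and the `percent_change` helper are hypothetical, made up for this example. The same eight numbers support "sales are booming" or "sales are collapsing" depending on which start and end months you choose to report.

```python
# Hypothetical monthly sales figures, invented for this sketch.
monthly_sales = [100, 120, 90, 95, 110, 105, 130, 98]

def percent_change(series, start, end):
    """Percent change from series[start] to series[end]."""
    return (series[end] - series[start]) / series[start] * 100

# The honest view: first month to last month.
full_period = percent_change(monthly_sales, 0, len(monthly_sales) - 1)
# Cherry-picked window that starts at a low point and ends at the peak.
flattering = percent_change(monthly_sales, 2, 6)
# Cherry-picked window that starts at a high point and ends at the low.
alarming = percent_change(monthly_sales, 1, 2)

print(f"Full period: {full_period:+.1f}%")  # Full period: -2.0%
print(f"Flattering:  {flattering:+.1f}%")   # Flattering:  +44.4%
print(f"Alarming:    {alarming:+.1f}%")     # Alarming:    -25.0%
```

When a chart or claim covers only part of a time range, asking what the rest of the range looks like is often enough to expose the trick.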

How to Start

Open Claude Desktop and say:

start lesson 1.4

Skills You'll Use Later

Bias detection (critical for evaluating AI-generated analysis in Module 2)
Questioning aggregated data (helps when breaking down reports with AI)
The "what's being left out?" framework (used in every data conversation)
