Module 10

Data Cleaning and Transformation

Fix dates that won't sort, names with stray spaces, half-empty columns, and every other mess that real data throws at you.

Toolkit · 7 lessons · ~65 min

Lessons in this module

  1. Handling missing values · 8 min
    NaN, dropna, fillna — what's missing and what to do about it.
  2. Finding and removing duplicates · 6 min
    duplicated, drop_duplicates, and how to decide which copy to keep.
  3. String cleanup at scale · 8 min
    str.strip, str.lower, str.replace, regex — apply to a whole column.
  4. Working with dates in pandas · 9 min
    to_datetime, .dt accessor, date arithmetic, resampling.
  5. Numeric cleanup — currencies, percentages, outliers · 7 min
    Strip $ and %, clip outliers, fix obvious typos.
  6. Reshape — melt and pivot · 8 min
    Turn wide to long and back. The shape-shifting move you'll thank yourself for.
  7. Categoricals and ordered types · 6 min
    Save memory and enable ordered comparisons.
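
As a rough preview of how the first five lessons fit together, here is a minimal sketch. The sample data, column names, and the order of the steps are assumptions made up for illustration; the lessons themselves may approach each step differently.

```python
import pandas as pd

# Hypothetical sample data: names, dates, and amounts are invented for this sketch.
raw = pd.DataFrame({
    "name": ["  Ada ", "ada", "Grace", None],
    "signup": ["2024-01-05", "2024-01-05", "2024-02-10", "2024-03-01"],
    "spend": ["$1,200.50", "$1,200.50", "$80", None],
})

df = raw.copy()

# Missing values: drop rows that have no name at all.
df = df.dropna(subset=["name"])

# String cleanup: trim stray spaces and normalise case across the whole column.
df["name"] = df["name"].str.strip().str.lower()

# Duplicates: after cleanup the two "ada" rows are identical, so keep the first.
df = df.drop_duplicates(keep="first")

# Dates: parse the strings, then use the .dt accessor to pull out the month.
df["signup"] = pd.to_datetime(df["signup"])
df["signup_month"] = df["signup"].dt.month

# Numeric cleanup: strip "$" and thousands separators, then convert to float.
df["spend"] = df["spend"].str.replace(r"[$,]", "", regex=True).astype(float)

df = df.reset_index(drop=True)
print(df)
```

Cleaning the strings before calling drop_duplicates matters here: "  Ada " and "ada" only count as the same row once whitespace and case have been normalised.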