Data Cleaning and Transformation
Fix dates that won't sort, names with stray spaces, half-empty columns, and every other mess that real data throws at you.
Toolkit · 7 lessons · ~65 min
Lessons in this module
- Handling missing values · 8 min
NaN, dropna, fillna: what's missing and what to do about it.
- Finding and removing duplicates · 6 min
duplicated, drop_duplicates, and how to decide which copy to keep.
- String cleanup at scale · 8 min
str.strip, str.lower, str.replace, regex: apply to a whole column.
- Working with dates in pandas · 9 min
to_datetime, .dt accessor, date arithmetic, resampling.
- Numeric cleanup: currencies, percentages, outliers · 7 min
Strip $ and %, clip outliers, fix obvious typos.
- Reshape: melt and pivot · 8 min
Turn wide to long and back. The shape-shifting move you'll thank yourself for.
- Categoricals and ordered types · 6 min
Save memory and enable ordered comparisons.
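
For a rough preview of how these lessons fit together, here is a minimal sketch that runs most of the calls named above over one small table. The data, the column names, and the choice to fill missing revenue with 0.0 are all assumptions made up for illustration. The lessons themselves may use different examples and different policies.

```python
import pandas as pd

# Hypothetical messy data, invented purely for illustration.
raw = pd.DataFrame({
    "name": [" Alice ", "BOB", "BOB", "carol"],
    "signup": ["2024-01-05", "2024-02-10", "2024-02-10", "2024-03-15"],
    "revenue": ["$1,200.50", "$300", "$300", None],
    "size": ["small", "large", "large", "medium"],
})
df = raw.copy()

# String cleanup: trim stray spaces and normalize case for the whole column.
df["name"] = df["name"].str.strip().str.lower()

# Duplicates: the two "BOB" rows are identical, so keep only the first copy.
df = df.drop_duplicates(keep="first").reset_index(drop=True)

# Dates: parse text into real datetimes so they sort and expose the .dt accessor.
df["signup"] = pd.to_datetime(df["signup"], format="%Y-%m-%d")
df["signup_month"] = df["signup"].dt.month

# Numeric cleanup: strip "$" and "," and convert the result to numbers.
df["revenue"] = pd.to_numeric(df["revenue"].str.replace(r"[$,]", "", regex=True))

# Missing values: fill the gap with 0.0 (an assumed policy, one of several options).
df["revenue"] = df["revenue"].fillna(0.0)

# Categoricals: an ordered type makes comparisons like size > "small" meaningful.
df["size"] = pd.Categorical(
    df["size"], categories=["small", "medium", "large"], ordered=True
)

# Reshape: melt wide to long, then pivot back to wide.
long = df.melt(
    id_vars="name",
    value_vars=["revenue", "signup_month"],
    var_name="metric",
    value_name="value",
)
wide = long.pivot(index="name", columns="metric", values="value")
```

The order of the steps matters here: the names are normalized before deduplicating so that copies differing only in case or spacing would also be caught, and the currency symbols are stripped before filling missing values so that the column is already numeric.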