
Reading and writing CSV files

Module 07 · Working with Files · 9 min read · Beginner

What you'll learn

  • Read a CSV into a list of dicts (standard library)
  • Read a CSV into a DataFrame (pandas)
  • Write a DataFrame back out

Two ways: csv module and pandas

The csv module is built into Python. pandas is the third-party library we'll lean on heavily. For 95% of work, pandas is what you want. For tiny scripts or constrained environments, the csv module is fine.

The pandas one-liner

import pandas as pd
df = pd.read_csv("sales.csv")
print(df.head())

Done. You have a DataFrame.

Writing a CSV

df.to_csv("clean.csv", index=False)

index=False stops pandas from writing the DataFrame's index (the row labels, 0, 1, 2, ... by default) as an extra first column, which is usually what you want.
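To see what index=False changes, here is a minimal sketch using a small made-up DataFrame (the column names and values are invented for illustration). Calling to_csv() without a path returns the CSV text as a string instead of writing a file, which makes the difference easy to print.

```python
import pandas as pd

# Hypothetical two-row DataFrame, invented for illustration.
df = pd.DataFrame({"date": ["2026-01-05", "2026-01-06"], "amount": [120, 80]})

# With no path argument, to_csv() returns the CSV text instead of writing a file.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

print(with_index.splitlines()[0])     # header starts with an empty index column
print(without_index.splitlines()[0])  # header holds only the real columns
```

If you forget index=False and later re-read the file, the saved index typically shows up as a column named "Unnamed: 0".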

Useful read_csv options

# You rarely need all of these at once; pick the ones your file calls for.
pd.read_csv("sales.csv",
            sep=",",                        # delimiter
            encoding="utf-8",               # "latin-1" for some legacy exports
            skiprows=2,                     # skip the first two lines of the file
            header=0,                       # header row, counted after skipped lines
            usecols=["id","date","amount"], # read only these columns
            parse_dates=["date"],           # turn date column into real dates
            dtype={"id": str},              # force this column to be string
            nrows=1000,                     # read only first 1000 data rows
           )
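You can try these options without a file on disk by passing read_csv an in-memory buffer. The sketch below assumes a hypothetical export with two junk lines above the header; the tool name, columns, and values are all invented for illustration.

```python
import io
import pandas as pd

# Hypothetical export: two junk lines, then the header row, then data.
raw = io.StringIO(
    "Exported by ExampleTool\n"
    "Generated 2026-01-31\n"
    "id,date,amount,notes\n"
    "007,2026-01-05,120.50,first\n"
    "008,2026-01-06,80.00,second\n"
)

df = pd.read_csv(
    raw,
    skiprows=2,                        # drop the two junk lines
    usecols=["id", "date", "amount"],  # "notes" is never loaded
    parse_dates=["date"],              # date becomes a datetime column
    dtype={"id": str},                 # keep the leading zeros in id
)

print(df.dtypes)
print(df)
```

Without dtype={"id": str}, pandas would read the id column as integers and the leading zeros would be lost.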

The csv module (for when you can't use pandas)

import csv

with open("sales.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    rows = list(reader)

print(rows[0])      # {'date': '2026-01-05', 'amount': '120', ...}

Every value comes back as a string, so you convert numbers and dates yourself as you go.
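A minimal sketch of that conversion, plus writing the rows back out with csv.DictWriter. It uses an in-memory buffer as a stand-in for sales.csv, and the two rows are invented for illustration.

```python
import csv
import io

# In-memory stand-in for sales.csv; the rows are invented for illustration.
source = io.StringIO("date,amount\n2026-01-05,120\n2026-01-06,80\n")

rows = []
for row in csv.DictReader(source):
    row["amount"] = float(row["amount"])  # convert the string to a number
    rows.append(row)

total = sum(row["amount"] for row in rows)
print(total)

# Write the converted rows back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["date", "amount"])
writer.writeheader()
writer.writerows(rows)
print(out.getvalue())
```

With a real file, replace the StringIO objects with open(path, newline="", encoding="utf-8") for reading and the same call with mode "w" for writing.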

Common CSV headaches and fixes

Problem | Fix
UnicodeDecodeError | Try encoding="latin-1" or encoding="cp1252"
Numbers come in as strings (with commas) | df["amount"] = df["amount"].str.replace(",","").astype(float)
Dates won't sort | Use parse_dates=["col"] at read time
Wrong delimiter (semicolons or tabs) | sep=";" or sep="\t"
Leading/trailing spaces in headers | df.columns = df.columns.str.strip()
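Several of these fixes often show up together. The sketch below applies three of them to a hypothetical messy export (semicolon-delimited, padded headers, thousands separators in the numbers); the data is invented for illustration.

```python
import io
import pandas as pd

# Hypothetical messy export: semicolon-delimited, padded headers,
# and a thousands separator inside one of the numbers.
raw = io.StringIO(
    " date ; amount \n"
    "2026-01-05;1,200.50\n"
    "2026-01-06;80.00\n"
)

df = pd.read_csv(raw, sep=";")                # wrong-delimiter fix
df.columns = df.columns.str.strip()           # header whitespace fix
df["amount"] = df["amount"].str.replace(",", "").astype(float)  # string-number fix
df["date"] = pd.to_datetime(df["date"])       # real dates, so sorting works

print(df.dtypes)
print(df)
```

As an alternative to the str.replace step, read_csv also accepts thousands=",", which handles the separators at read time.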

Walkthrough: end-to-end CSV cleanup

Read

import pandas as pd
df = pd.read_csv("sales_raw.csv", parse_dates=["date"])

Inspect

df.head()
df.info()
df.describe()

Clean

df.columns = df.columns.str.strip().str.lower()
df = df.dropna(subset=["customer"])
df["amount"] = df["amount"].astype(float)

Save

df.to_csv("sales_clean.csv", index=False)

Key takeaways

  • pd.read_csv() is the one-liner you'll use 95% of the time.
  • The standard csv module is there if you can't use pandas.
  • Watch out for encoding, delimiters, and number columns that came in as strings.

CSV inspector

Write a small script that takes a CSV path and prints: number of rows, number of columns, the column names, and the first three rows.
