HomeCourseModule 13 › User story: bank reconciliation

User story: bank reconciliation

Module 13 · Scenario: Finance & Accounting11 min readIntermediate

What you'll learn

  • Load two tables and align them
  • Use merge with indicator to find unmatched rows
  • Handle near-matches (same amount, different date)

The story

Priya gets two files at month-end: gl_april.xlsx (the general ledger from the accounting system) and bank_april.csv (the bank statement). She has to find every transaction on one side that isn't on the other and investigate it.

In Excel: copy both into a workbook, build a helper column of date+amount, VLOOKUP across, scroll through #N/A results. Two hours.

In Python: 25 lines. Five minutes. Below.

The script

import pandas as pd
from pathlib import Path

gl   = pd.read_excel("gl_april.xlsx")
bank = pd.read_csv("bank_april.csv")

# Normalise both sides to the same key
def keyify(df, date_col, amount_col):
    df = df.copy()
    df["date"] = pd.to_datetime(df[date_col]).dt.date
    df["amount"] = df[amount_col].round(2)
    df["key"] = df["date"].astype(str) + "_" + df["amount"].astype(str)
    return df

gl_k   = keyify(gl,   "TxnDate", "Amount")
bank_k = keyify(bank, "PostDate", "Amount")

merged = gl_k.merge(bank_k, on="key", how="outer",
                    suffixes=("_gl", "_bank"), indicator=True)

unmatched_gl   = merged[merged["_merge"] == "left_only"]
unmatched_bank = merged[merged["_merge"] == "right_only"]

with pd.ExcelWriter("reconciliation_april.xlsx") as w:
    unmatched_gl.to_excel(w,   sheet_name="GL only",   index=False)
    unmatched_bank.to_excel(w, sheet_name="Bank only", index=False)
    merged[merged["_merge"]=="both"].to_excel(w, sheet_name="Matched", index=False)

print(f"GL only:   {len(unmatched_gl)}")
print(f"Bank only: {len(unmatched_bank)}")

Handling near-matches

If amounts can be off by a small fee, allow a tolerance:

def near_match(amt_a, amt_b, tol=0.50):
    return abs(amt_a - amt_b) <= tol

# For each unmatched GL row, search bank rows within ±3 days for a near-amount match.
for _, gl_row in unmatched_gl.iterrows():
    candidates = bank[
        (bank["PostDate"] >= gl_row["TxnDate"] - pd.Timedelta(days=3)) &
        (bank["PostDate"] <= gl_row["TxnDate"] + pd.Timedelta(days=3)) &
        (bank["Amount"].apply(lambda a: near_match(a, gl_row["Amount"])))
    ]
    if len(candidates) == 1:
        print(f"Possible fuzzy match for GL {gl_row['TxnId']} ↔ Bank {candidates.iloc[0]['BankRef']}")

Video walkthrough script (for recording)

Show both raw files in Excel

"Here's the GL. Here's the bank. Look — same month, but different columns and different date formats. Reconciling these manually is a nightmare."

Open Jupyter, load both

"pd.read_excel for the GL, pd.read_csv for the bank. Two lines."

Build the match key

"Both sides get a date and a 2-decimal amount, joined as a string. That's the merge key."

Outer merge with indicator

"This is the magic line. how='outer' keeps everything from both sides; indicator=True tells us which side each row came from."

Split into three sheets and save

"Matched, GL only, Bank only — each gets its own sheet. The file lands ready to email to the controller."

Key takeaways

  • The reconciliation skeleton: build a merge key, outer-merge with indicator, split by _merge.
  • Tolerance match handles bank fees and tiny rounding differences.
  • One script, multiple sheets in the output — ready for the controller.

Your own recon

Build a recon between two of your own CSVs. The match key can be whatever uniquely identifies a transaction in both systems (txn ID, date+amount, invoice number).

📹 Video walkthrough
A video walkthrough of this lesson will be embedded here. Until then, the written walkthrough above mirrors what the video will cover step-for-step.