Daily sessions on the website have a clear weekly rhythm. Marcus needs to spot days that are unusually high or low after accounting for that rhythm.
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
df = pd.read_csv("daily_sessions.csv", parse_dates=["date"]).set_index("date").sort_index()
decomp = seasonal_decompose(df["sessions"], model="additive", period=7)
decomp.plot()
plt.tight_layout()
plt.savefig("decomposition.png", dpi=150)
# Residual = actual - trend - seasonal
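# The centered 7-day trend is NaN for the first and last 3 days, so resid is too
resid = decomp.resid.dropna()
std = resid.std()
df["expected"] = decomp.trend + decomp.seasonal
# Flag residuals beyond 2 standard deviations; edge days without a residual become False
df["anomaly"] = (resid.abs() > 2 * std).reindex(df.index, fill_value=False)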
anomalies = df[df["anomaly"]].sort_values("sessions", ascending=False)
print(anomalies)
fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(df.index, df["sessions"], lw=0.8, label="actual")
ax.plot(df.index, df["expected"], lw=1.2, color="#217346", label="expected")
mask = df["anomaly"].fillna(False)
ax.scatter(df.index[mask], df["sessions"][mask], color="red", s=20, zorder=3, label="anomaly")
ax.legend()
plt.tight_layout()
plt.savefig("anomalies.png", dpi=150)
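seasonal_decompose splits a series into trend, seasonal, and residual components. With model="additive", the residual is what remains after subtracting the trend and the weekly pattern from the actual value, so a day is flagged when its residual lies more than two standard deviations from zero.

Take any daily metric you track (web sessions, support tickets, orders). Decompose it, then list the anomalous days from the past 90 days.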
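
To make the three components concrete, here is a minimal sketch of the exercise that builds an additive decomposition by hand with pandas and then keeps only the last 90 days. It is an approximation of the classical approach, not statsmodels' own code. The synthetic series, the injected outliers, and every name in it (weekly, spike_day, dip_day, recent_anomalies) are assumptions made for illustration; swap in your own metric in place of the generated data.

```python
import numpy as np
import pandas as pd

# Synthetic daily metric (assumed for illustration): linear trend,
# a fixed weekly pattern, Gaussian noise, and two injected outliers.
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=180, freq="D")
weekly = np.array([120, 100, 90, 95, 80, -220, -265])  # Mon..Sun offsets, sum to 0
dow = dates.dayofweek.to_numpy()  # 0 = Monday ... 6 = Sunday
sessions = pd.Series(
    1000 + 2.0 * np.arange(180) + weekly[dow] + rng.normal(0, 15, 180),
    index=dates,
    name="sessions",
)
spike_day, dip_day = dates[150], dates[120]
sessions.loc[spike_day] += 400
sessions.loc[dip_day] -= 400

# Trend: centered 7-day moving average (NaN for the first and last 3 days).
trend = sessions.rolling(window=7, center=True).mean()

# Seasonal: average detrended value per weekday, re-centered to sum to zero.
detrended = sessions - trend
seasonal_by_dow = detrended.groupby(dow).mean()
seasonal_by_dow -= seasonal_by_dow.mean()
seasonal = pd.Series(seasonal_by_dow.to_numpy()[dow], index=sessions.index)

# Residual: whatever trend and weekly pattern do not explain.
resid = sessions - trend - seasonal

# Flag residuals beyond 2 standard deviations; NaN residuals compare as False.
anomaly = resid.abs() > 2 * resid.std()

# Keep only the most recent 90 days, then list the flagged ones.
result = pd.DataFrame(
    {"sessions": sessions, "expected": trend + seasonal, "resid": resid, "anomaly": anomaly}
)
cutoff = result.index.max() - pd.Timedelta(days=89)
recent = result.loc[cutoff:]
recent_anomalies = recent[recent["anomaly"]].sort_values("resid")
print(recent_anomalies)
```

Two caveats apply to this sketch and to seasonal_decompose with its default two-sided filter. The last 3 days have no trend value, so they can never be flagged. Also, the moving average absorbs part of a large spike, which pushes the residuals of neighboring days in the opposite direction and can occasionally flag them too.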