df["name"] # one column → Series
df[["name", "city"]] # multi-column → DataFrame
df[df["age"] > 30]
df[df["city"] == "NYC"]
df[df["name"].str.startswith("A")]
df[(df["age"] > 30) & (df["city"] == "NYC")] # AND — use &
df[(df["age"] < 30) | (df["city"] == "LA")] # OR — use |
df[~(df["city"] == "Chicago")] # NOT — use ~
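Important: combine conditions with &, | and ~, not Python's and/or/not, and wrap each condition in parentheses, because & and | bind more tightly than comparisons such as > and ==.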
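To see why this matters, here is a minimal sketch on a small made-up DataFrame (the `people` data is hypothetical and not part of the examples above). It runs the parenthesized `&` form, then shows that swapping in `and` fails, since `and` asks a whole Series for a single True/False value.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
people = pd.DataFrame({
    "name": ["Ann", "Bob", "Cara", "Dev"],
    "age": [34, 28, 45, 31],
    "city": ["NYC", "LA", "NYC", "Chicago"],
})

# Correct: each condition in parentheses, combined element-wise with &.
over_30_nyc = people[(people["age"] > 30) & (people["city"] == "NYC")]
print(over_30_nyc["name"].tolist())  # ['Ann', 'Cara']

# `and` needs one truth value per operand, which a Series cannot provide.
try:
    people[(people["age"] > 30) and (people["city"] == "NYC")]
    and_error = None
except ValueError as exc:
    and_error = exc

print(type(and_error).__name__)  # ValueError
```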
df.loc[0] # row whose index label is 0 (the first row under the default index)
df.loc[0:3, "name"] # rows 0..3 (inclusive!), column 'name'
df.loc[df["age"] > 30, "name"] # name column for the over-30s
df.iloc[0] # first row by position
df.iloc[:5, 0:2] # first 5 rows, first 2 columns
df.iloc[-1] # last row
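The label/position distinction is easiest to see when the index is not the default 0, 1, 2, ... The sketch below uses a hypothetical `df_demo` with labels 10 to 40 (invented for illustration) to show that `.loc` slices include the end label, while `.iloc` slices exclude the end position.

```python
import pandas as pd

# Hypothetical data with a non-default integer index.
df_demo = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cara", "Dev"], "age": [34, 28, 45, 31]},
    index=[10, 20, 30, 40],
)

by_label = df_demo.loc[10:30, "name"]   # labels 10 through 30, end label included
by_position = df_demo.iloc[0:2, 0]      # positions 0 and 1, end position excluded

print(by_label.tolist())     # ['Ann', 'Bob', 'Cara']
print(by_position.tolist())  # ['Ann', 'Bob']

# No row is labelled 0 here, so df_demo.loc[0] would raise KeyError;
# .iloc[0] still returns the first row by position.
first_row = df_demo.iloc[0]
print(first_row["name"])     # Ann
```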
df[df["city"].isin(["NYC", "LA"])]
df[~df["city"].isin(["NYC", "LA"])] # everything not in the list
df = pd.read_csv("orders.csv")
df.columns
# Index(['order_id','date','region','customer','amount','status'])
big_north = df[(df["region"] == "North") & (df["amount"] > 1000)]
big_north[["date", "amount"]]
df.loc[(df["region"] == "North") & (df["amount"] > 1000),
       ["date", "amount"]] # same rows and columns in a single .loc call
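As a quick check that the two-step filter and the single `.loc` call select the same data, here is a sketch on an in-memory stand-in for orders.csv. The `orders` values are invented; only the column names come from the listing above.

```python
import pandas as pd

# Hypothetical stand-in for orders.csv; the values are invented.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "date": ["2024-01-05", "2024-01-06", "2024-01-07", "2024-01-08"],
    "region": ["North", "South", "North", "North"],
    "customer": ["Acme", "Birch", "Cobalt", "Delta"],
    "amount": [1500, 2200, 400, 3100],
    "status": ["shipped", "shipped", "pending", "shipped"],
})

# Name the mask once so both forms filter on exactly the same rows.
mask = (orders["region"] == "North") & (orders["amount"] > 1000)

two_step = orders[mask][["date", "amount"]]      # filter rows, then pick columns
one_step = orders.loc[mask, ["date", "amount"]]  # rows and columns together

print(one_step)
print(two_step.equals(one_step))  # True
```

Both forms read the same data. The `.loc` form is usually the safer habit if the selection will later be assigned to, since it avoids chained indexing.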
df[bool_mask] is the everyday filter pattern.
&, |, ~ (with parens) when combining.
.loc[rows, cols] for labels; .iloc[rows, cols] for positions.
.isin([list]) for membership filters.

From a DataFrame of employees: select everyone in Sales with a salary over $80k, returning only the name and salary columns.