Viz S3 – Section 3: Data Structures and Pandas Fundamentals

🗄️ Pandas Series

1-D labeled array, like a single spreadsheet column
Created from lists, dicts, or NumPy arrays
Index enables fast lookup and data alignment
Vectorized arithmetic and aggregation: s * 1.1, s.mean()
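
A minimal sketch of the Series points above: it builds a Series from a dict, looks up a value by label, applies a vectorized operation, and shows index alignment. The product names and prices are made-up illustration data, not from the course dataset.

```python
import pandas as pd

# Hypothetical prices keyed by product name (illustration data only)
prices = pd.Series({"apple": 2.0, "bread": 4.0, "milk": 3.0})

# Label-based lookup through the index
bread_price = prices["bread"]

# Vectorized arithmetic: every element is scaled, no explicit loop
raised = prices * 1.1

# Aggregation over the whole Series
average = prices.mean()

# Alignment: values are matched by label, not by position;
# a label present in only one Series yields NaN in the result
discounts = pd.Series({"milk": 0.5, "apple": 1.0})
net = prices - discounts
```

Note that "bread" has no discount, so its entry in net is NaN rather than the original price; alignment never guesses a missing value.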

📊 DataFrames

2-D table: rows × columns, like a spreadsheet
Each column is a Series sharing a common index
df.shape, df.dtypes, df.columns
Created from dicts of lists or loaded from files
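
The DataFrame attributes listed above can be tried on a small table. This is a sketch using a made-up sales table; the column names region and sales are assumptions chosen to match the later examples in these notes.

```python
import pandas as pd

# Hypothetical sales table built from a dict of lists (illustration data only)
df = pd.DataFrame({
    "region": ["East", "West", "East"],
    "sales": [1200, 800, 1500],
})

n_rows, n_cols = df.shape        # (rows, columns)
column_names = list(df.columns)  # column labels
column_types = df.dtypes         # one dtype per column

# Each column is a Series that shares the DataFrame's index
sales = df["sales"]
same_index = sales.index.equals(df.index)
```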

📂 Loading Data

pd.read_csv() — CSV and TSV files
pd.read_excel() — Excel workbooks
Common pd.read_csv() parameters: sep, header, usecols, nrows
Pass a URL string in place of a file path to load remote data directly
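
A short sketch of the read_csv parameters named above. To keep it self-contained, the tab-separated text is held in memory with io.StringIO; in practice a file path or URL string would go in its place. The data values are invented for illustration.

```python
import io
import pandas as pd

# In-memory stand-in for a tab-separated file (illustration data only);
# a file path or URL string could be passed to read_csv instead
tsv_text = (
    "region\tsales\tunits\n"
    "East\t1200\t10\n"
    "West\t800\t7\n"
    "East\t1500\t12\n"
)

df = pd.read_csv(
    io.StringIO(tsv_text),
    sep="\t",                     # tab delimiter instead of the default comma
    header=0,                     # first line holds the column names
    usecols=["region", "sales"],  # keep only these columns
    nrows=2,                      # read only the first two data rows
)
```

Limiting usecols and nrows is a cheap way to preview a large file before loading it in full.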

🔍 Inspecting Data

df.head() / df.tail() — preview rows
df.info() — column types and non-null counts
df.describe() — summary statistics
df["col"].value_counts() — frequency table for one column
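
The inspection calls above, sketched on a small made-up table with one missing value so the counts have something to show. Column names and values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical table with one missing sales value (illustration data only)
df = pd.DataFrame({
    "region": ["East", "West", "East", "North"],
    "sales": [1200.0, 800.0, None, 950.0],
})

first_rows = df.head(2)   # first two rows
last_rows = df.tail(2)    # last two rows

df.info()                 # prints dtypes and non-null counts per column

summary = df.describe()   # count, mean, std, min, quartiles, max (numeric columns)

# Frequency of each distinct value in one column
region_counts = df["region"].value_counts()
```

describe() counts only non-missing values, so its count row is a quick check for nulls in numeric columns.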

⚡ Selection & Aggregation

Column select: df["col"] or df[["a","b"]]
Row filter: df[df["sales"] > 1000]
df.groupby("region")["sales"].sum()
df.sort_values("sales"), df.reset_index()
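
The selection and aggregation patterns above, put together in one sketch. The table is invented for illustration; region and sales follow the names used in the bullets, and units is an extra hypothetical column.

```python
import pandas as pd

# Hypothetical sales table (illustration data only)
df = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "sales": [1200, 800, 1500, 2000],
    "units": [10, 7, 12, 15],
})

one_column = df["sales"]               # single name -> Series
two_columns = df[["region", "sales"]]  # list of names -> DataFrame

# Boolean mask keeps the rows where the condition is True
big_sales = df[df["sales"] > 1000]

# Total sales per region
totals = df.groupby("region")["sales"].sum()

# Sort descending, then rebuild a clean 0..n-1 index
ranked = df.sort_values("sales", ascending=False).reset_index(drop=True)
```

Passing drop=True to reset_index discards the old index; without it, the old index is kept as a new column.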

🧪 Lab 3 — Pandas Data Exploration

Load a business CSV and inspect shape, dtypes, nulls
Select columns and filter rows by conditions
Compute grouped aggregations (sum, mean, count)
Sort and rank results for visualization readiness
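
One possible shape for the lab workflow, sketched end to end. The CSV content is an invented stand-in held in memory; the actual lab file, its column names, and the filter threshold are assumptions, so adapt them to the dataset provided.

```python
import io
import pandas as pd

# Invented stand-in for the lab's business CSV (one sales value is missing)
csv_text = """region,product,sales
East,A,1200
West,B,800
East,B,
North,A,950
West,A,2000
"""
df = pd.read_csv(io.StringIO(csv_text))

# 1. Inspect shape, dtypes, nulls
shape = df.shape
dtypes = df.dtypes
null_counts = df.isna().sum()

# 2. Select columns and filter rows (threshold of 900 is arbitrary)
subset = df[["region", "sales"]]
large = subset[subset["sales"] > 900]

# 3. Grouped aggregations; missing values are skipped by sum, mean, count
by_region = df.groupby("region")["sales"].agg(["sum", "mean", "count"])

# 4. Sort and rank so the table is ready to plot
result = by_region.sort_values("sum", ascending=False).reset_index()
result["rank"] = result["sum"].rank(ascending=False, method="dense").astype(int)
```

The final result table has one row per region, ordered by total sales, which is the form most bar-chart functions expect.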