1. Business understanding: define the questions
2. Data collection and loading into pandas
3. Data cleaning and feature engineering
4. EDA → visualization → insights → action
- Audit: shape, dtypes, missing values, duplicates
- Fix: fillna / dropna, astype, string normalization
- Engineer: derived columns, date features, bins
- Validate: describe() before and after each step
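
The audit, fix, engineer, and validate steps above can be sketched on a tiny table. This is a minimal illustration, not a prescribed pipeline: the DataFrame, its column names (`order_id`, `region`, `amount`, `order_date`), the median fill, and the bin edges are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sales extract; every column name and value is invented for illustration.
raw = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "region": [" North", "south", "south", "NORTH ", None],
    "amount": ["120.5", "80", "80", None, "42.25"],
    "order_date": ["2024-01-15", "2024-02-03", "2024-02-03",
                   "2024-02-20", "2024-03-09"],
})

# Audit: shape, dtypes, missing values, duplicates
print(raw.shape)
print(raw.dtypes)
print(raw.isna().sum())
print("duplicate rows:", raw.duplicated().sum())

# Validate (before): amount is still text, so describe() reports count/unique/top/freq
print(raw["amount"].describe())

# Fix: drop exact duplicates, convert types, fill gaps, normalize strings
df = raw.drop_duplicates().copy()
df["amount"] = pd.to_numeric(df["amount"], errors="coerce")
df["amount"] = df["amount"].fillna(df["amount"].median())  # assumed fill strategy
df["order_date"] = pd.to_datetime(df["order_date"])
df["region"] = df["region"].str.strip().str.title().fillna("Unknown")
df["region"] = df["region"].astype("category")

# Engineer: a date feature and a binned column (bin edges are arbitrary here)
df["order_month"] = df["order_date"].dt.month
df["amount_band"] = pd.cut(
    df["amount"],
    bins=[0, 50, 100, float("inf")],
    labels=["low", "mid", "high"],
)

# Validate (after): amount is numeric now, so describe() reports mean, std, quartiles
print(df["amount"].describe())
```

Comparing the two `describe()` outputs makes it obvious when a step changed more than intended, for example if a type conversion silently turned valid values into NaN.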
- Univariate: histograms, KDE, boxplots per variable
- Bivariate: scatter, regplot, correlation heatmap
- Multivariate: pair plots, faceted charts, bubble charts
- Document every insight with a one-sentence finding
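
One possible way to move from univariate to bivariate views and end with a one-sentence finding is sketched below. The data is synthetic and the column names (`ad_spend`, `revenue`, `region`) are hypothetical. Plain matplotlib stands in for the seaborn calls (`regplot`, `heatmap`, `pairplot`) to keep the example dependency-light.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic data, invented for illustration only
rng = np.random.default_rng(0)
n = 200
ad_spend = rng.uniform(1, 10, n)
df = pd.DataFrame({
    "ad_spend": ad_spend,
    "revenue": 3 * ad_spend + rng.normal(0, 2, n),
    "region": rng.choice(["North", "South"], n),
})

fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

# Univariate: distribution of a single variable
axes[0].hist(df["revenue"], bins=20)
axes[0].set_title("Revenue distribution")

# Bivariate, with region as a third variable encoded by color
for region, group in df.groupby("region"):
    axes[1].scatter(group["ad_spend"], group["revenue"], s=10, label=region)
axes[1].set_xlabel("ad_spend")
axes[1].set_ylabel("revenue")
axes[1].legend(title="region")
axes[1].set_title("Revenue vs. ad spend")

# Correlation heatmap of the numeric columns
corr = df[["ad_spend", "revenue"]].corr()
im = axes[2].imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
axes[2].set_xticks(range(len(corr.columns)))
axes[2].set_xticklabels(corr.columns)
axes[2].set_yticks(range(len(corr.index)))
axes[2].set_yticklabels(corr.index)
axes[2].set_title("Correlation")
fig.colorbar(im, ax=axes[2])
fig.tight_layout()

# Document the insight as a one-sentence finding
r = corr.loc["ad_spend", "revenue"]
finding = f"Revenue rises with ad spend (Pearson r = {r:.2f})."
print(finding)
plt.close(fig)
```

Calling `fig.savefig(...)` before `plt.close(fig)` would write the panel to disk; keeping the finding next to the chart that supports it makes later storytelling easier.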
- pandas — data wrangling and aggregation
- matplotlib + seaborn — static statistical charts
- plotly — interactive exploration and dashboards
- Choose the right tool for each audience and context
- Structure: context → finding → implication → action
- Lead with the insight, not the methodology
- Consistent visual style and color palette throughout
- One key message per chart — remove all distractions
- End-to-end analysis of a real business dataset
- Clean, explore, and visualize with pandas, matplotlib/seaborn, and plotly
- Build an interactive Plotly dashboard of key findings
- Present insights with a narrative storytelling structure