Viz S12 – Section 4: Heatmaps and Correlation Analysis

🌡️ Correlation Matrix

df.corr(): Pearson by default; pass method="spearman" for rank-based correlation
Values range from −1 (perfect negative) to +1 (perfect positive); 0 means no linear relationship
Diagonal is always 1.0 (self-correlation)
The matrix is symmetric, so mask the upper triangle to avoid redundant information
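A minimal sketch of the points above, using synthetic data invented purely for illustration (the column names x, y, z are hypothetical). It computes Pearson and Spearman matrices and builds a boolean mask for the upper triangle.

```python
import numpy as np
import pandas as pd

# Synthetic data, made up for this example only.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
df = pd.DataFrame({
    "x": x,
    "y": 2 * x + rng.normal(scale=0.5, size=200),  # strongly related to x
    "z": rng.normal(size=200),                     # unrelated noise
})

pearson = df.corr()                    # Pearson is the default method
spearman = df.corr(method="spearman")  # rank-based alternative

# True strictly above the main diagonal (k=1 leaves the diagonal visible).
upper_mask = np.triu(np.ones_like(pearson, dtype=bool), k=1)

# Replace the redundant upper triangle with NaN for display.
lower_only = pearson.mask(upper_mask)
print(lower_only.round(2))
```

The same upper_mask array can be passed to a heatmap's mask argument, so the mask only needs to be built once.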

🔥 Heatmaps

sns.heatmap(df.corr(), annot=True, fmt=".2f")
Colormaps: coolwarm and RdBu_r (diverging), viridis (sequential)
center=0 anchors a diverging palette at zero; vmin=-1, vmax=1 fix the color scale for correlations
mask parameter hides upper or lower triangle
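The heatmap options above can be combined roughly as follows. This is a sketch under stated assumptions: the data is synthetic, and the column names (ad_spend, revenue, returns, tenure) are hypothetical placeholders, not a real dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic business-style metrics, invented for illustration.
rng = np.random.default_rng(1)
base = rng.normal(size=300)
df = pd.DataFrame({
    "ad_spend": base + rng.normal(scale=0.3, size=300),
    "revenue": base + rng.normal(scale=0.5, size=300),
    "returns": -base + rng.normal(scale=0.8, size=300),
    "tenure": rng.normal(size=300),
})

corr = df.corr()
# Hide cells strictly above the diagonal; they mirror the lower triangle.
mask = np.triu(np.ones_like(corr, dtype=bool), k=1)

fig, ax = plt.subplots(figsize=(5, 4))
sns.heatmap(
    corr,
    mask=mask,
    annot=True,       # write the coefficient in each visible cell
    fmt=".2f",
    cmap="coolwarm",  # diverging palette
    center=0,         # zero maps to the neutral midpoint color
    vmin=-1, vmax=1,  # fixed scale so plots are comparable
    square=True,
    ax=ax,
)
ax.set_title("Correlation heatmap (synthetic data)")
fig.tight_layout()
```

Fixing vmin and vmax keeps colors comparable across different heatmaps; without them the scale adapts to whatever range the data happens to cover.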

🧩 Cluster Maps

sns.clustermap(df.corr()): hierarchical clustering of the rows and columns
Reorders the matrix so similar variables sit together; pass the raw (scaled) data instead to cluster observations as well
Reveals natural structure hidden in flat heatmaps
Useful for customer segmentation and feature grouping
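A small sketch of a cluster map on a correlation matrix. The six variables are synthetic and hypothetical: a1 to a3 share one hidden factor and b1 to b3 share another, with the columns deliberately interleaved so the reordering is visible. Note that with default settings seaborn clusters on Euclidean distance between the rows of the matrix it is given, which is an assumption worth revisiting for real data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(2)
n = 300
latent_a = rng.normal(size=n)  # hidden factor behind a1..a3
latent_b = rng.normal(size=n)  # hidden factor behind b1..b3

# Columns are interleaved on purpose so a flat heatmap looks scattered.
df = pd.DataFrame({
    "a1": latent_a + rng.normal(scale=0.3, size=n),
    "b1": latent_b + rng.normal(scale=0.3, size=n),
    "a2": latent_a + rng.normal(scale=0.3, size=n),
    "b2": latent_b + rng.normal(scale=0.3, size=n),
    "a3": latent_a + rng.normal(scale=0.3, size=n),
    "b3": latent_b + rng.normal(scale=0.3, size=n),
})
corr = df.corr()

# Requires scipy; rows and columns are both variables here.
g = sns.clustermap(corr, cmap="coolwarm", center=0, vmin=-1, vmax=1,
                   figsize=(6, 6))

# Column order after clustering: related variables end up adjacent.
ordered = [corr.columns[i] for i in g.dendrogram_col.reordered_ind]
print(ordered)
```

To segment observations (for example customers) rather than variables, the raw data would be passed instead of df.corr(), usually after scaling each column.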

📋 Pivot Table Heatmaps

df.pivot_table(values, index, columns, aggfunc)
Visualize performance across two categorical dimensions
E.g., revenue by region × product category
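Highlight cells with a diverging colormap centered on a meaningful midpoint (e.g., the average or a target); use a sequential colormap for raw magnitudes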
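As an illustration of the pivot-then-plot pattern, the sketch below aggregates revenue by region and product category. The sales records are fabricated for the example, and centering the palette on the mean cell value is one possible choice of midpoint, not the only one.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up transaction records, for illustration only.
sales = pd.DataFrame({
    "region":   ["North", "North", "North", "South",
                 "South", "South", "West", "West"],
    "category": ["Hardware", "Software", "Hardware", "Hardware",
                 "Software", "Software", "Hardware", "Software"],
    "revenue":  [120.0, 80.0, 30.0, 60.0, 90.0, 10.0, 200.0, 50.0],
})

# Rows = region, columns = category, cells = total revenue.
pivot = sales.pivot_table(values="revenue", index="region",
                          columns="category", aggfunc="sum")

fig, ax = plt.subplots(figsize=(4, 3))
sns.heatmap(
    pivot,
    annot=True,
    fmt=".0f",
    cmap="RdBu_r",
    center=pivot.values.mean(),  # above-average cells red, below blue
    ax=ax,
)
ax.set_title("Revenue by region and category")
fig.tight_layout()
print(pivot)
```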

💡 Interpreting Correlations

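Rule of thumb: |r| > 0.7 = strong; 0.4–0.7 = moderate; < 0.4 = weak (thresholds vary by field)
Multicollinearity: highly correlated predictors make regression coefficients unstable and hard to interpret
Confounding variables can inflate or mask apparent correlations; correlation does not imply causation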
Always validate with domain knowledge and scatter plots
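The rule-of-thumb thresholds and the multicollinearity check can be sketched as two small helpers. Both function names (strength, correlated_pairs) and the synthetic columns are hypothetical, and the 0.7 cutoff simply mirrors the rule of thumb above rather than any formal test.

```python
import numpy as np
import pandas as pd

def strength(r):
    """Label a coefficient using the rule-of-thumb thresholds."""
    a = abs(r)
    if a > 0.7:
        return "strong"
    if a >= 0.4:
        return "moderate"
    return "weak"

def correlated_pairs(corr, threshold=0.7):
    """Return (col_a, col_b, r) for each pair with |r| above threshold."""
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):  # upper triangle only
            r = float(corr.iloc[i, j])
            if abs(r) > threshold:
                pairs.append((cols[i], cols[j], r))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: -abs(p[2]))

# Synthetic predictors, invented for illustration.
rng = np.random.default_rng(3)
sessions = rng.normal(size=250)
df = pd.DataFrame({
    "sessions": sessions,
    "page_views": 3 * sessions + rng.normal(size=250),  # near-duplicate signal
    "support_tickets": rng.normal(size=250),            # independent
})

flagged = correlated_pairs(df.corr())
for a, b, r in flagged:
    print(f"{a} vs {b}: r = {r:.2f} ({strength(r)})")
```

Flagged pairs are candidates for a scatter plot and a domain check before one of the predictors is dropped or combined.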

🧪 Lab 4 — Heatmap Analysis

Part A: Comprehensive correlation heatmap of business metrics
Part B: Revenue driver deep dive with scatter validation
Part C: Marketing performance pivot table heatmap
Annotate key correlations with business interpretations