Viz S10 – Section 2: Distribution and Relationship Visualizations

🔵 Scatter Plots in Seaborn

sns.scatterplot(data=df, x="a", y="b", hue="cat")
Encode extra dimensions: size, style, hue
Identify clusters and outliers in business data
Built-in legend from hue, size, and style parameters
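
A minimal sketch of the encodings listed above, using synthetic data. The column names (monthly_usage, seats, plan, region, revenue) are invented for illustration and are not from the course dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
n = 120

# Synthetic data; every column name here is made up for illustration
df = pd.DataFrame({
    "monthly_usage": rng.normal(50, 15, n),
    "seats": rng.integers(1, 50, n),
    "plan": rng.choice(["basic", "pro", "enterprise"], n),
    "region": rng.choice(["NA", "EU"], n),
})
df["revenue"] = 20 * df["monthly_usage"] + 30 * df["seats"] + rng.normal(0, 150, n)

ax = sns.scatterplot(
    data=df, x="monthly_usage", y="revenue",
    hue="plan",      # color encodes a category
    style="region",  # marker shape encodes a second category
    size="seats",    # marker size encodes a numeric value
)
ax.set_title("Revenue vs. usage")
legend = ax.get_legend()  # built automatically from hue, size, and style
```

Three extra encodings on one chart is usually the practical limit; past that, faceting tends to read better.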

📏 Regression Plots

sns.regplot(data=df, x="a", y="b"): scatter + linear fit + CI band
sns.lmplot(): regplot on a facet grid (hue, col, row)
Band is a bootstrapped 95% confidence interval by default (ci=95); a wider band means a less certain fit
order=2 for polynomial regression curves
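
The sketch below compares a linear and a quadratic fit side by side, then facets the same relationship with lmplot. The data is synthetic, and the names response_hours, csat, and tier are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(1)
n = 150

# Synthetic data with a mildly curved relationship (illustrative names)
df = pd.DataFrame({
    "response_hours": rng.uniform(0.5, 24, n),
    "tier": rng.choice(["standard", "premium"], n),
})
df["csat"] = (
    9 - 0.25 * df["response_hours"]
    + 0.005 * df["response_hours"] ** 2
    + rng.normal(0, 0.6, n)
)

# regplot is axes-level: it draws onto the Axes passed via ax=
fig, (ax_lin, ax_quad) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)
sns.regplot(data=df, x="response_hours", y="csat", ax=ax_lin)
sns.regplot(data=df, x="response_hours", y="csat", order=2, ax=ax_quad)
ax_lin.set_title("order=1 (linear)")
ax_quad.set_title("order=2 (quadratic)")

# lmplot is figure-level: it creates its own grid, one panel per tier
g = sns.lmplot(data=df, x="response_hours", y="csat", col="tier")
```

Because lmplot builds its own figure, it takes no ax= argument; use regplot when the plot must sit inside an existing subplot layout.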

🔲 Pair Plots

sns.pairplot(df, hue="segment")
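All pairwise scatter plots + per-variable distributions on the diagonal (histograms by default, KDE curves when hue is set)
Quick visual scan for pairs that move together, before computing any correlations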
Use vars=["a","b","c"] to limit columns shown

🔀 Joint Plots

sns.jointplot(data=df, x="a", y="b", kind="hex")
Marginal distributions on both x and y axes
Kinds: scatter, kde, hist, hex, reg, resid
Best for deep bivariate analysis of two variables

📐 Correlation Analysis

Pearson r: linear relationship strength (−1 to +1)
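df.corr(numeric_only=True) computes the full Pearson correlation matrix for numeric columns (pandas 2.x raises an error on text columns without numeric_only=True)
Rule of thumb: |r| > 0.7 = strong; 0.4–0.7 = moderate; < 0.4 = weak (cutoffs vary by field)
Always visualize: r misses nonlinear patterns and is sensitive to outliers, and correlation ≠ causation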
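
The sketch below computes a correlation matrix, draws it as a heatmap, and then shows why plotting matters: a perfect quadratic relationship has a Pearson r of essentially zero. The data is synthetic, the column names are illustrative, and numeric_only assumes pandas 1.5 or later.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(4)
n = 200

# Synthetic data; column names are assumptions for this sketch
ad_spend = rng.gamma(2.0, 500, n)
visits = 3 * ad_spend + rng.normal(0, 400, n)
df = pd.DataFrame({
    "ad_spend": ad_spend,
    "visits": visits,
    "revenue": 0.5 * visits + rng.normal(0, 300, n),
    "segment": rng.choice(["SMB", "Mid-market"], n),  # non-numeric column
})

# numeric_only=True skips the text column instead of raising an error
corr = df.corr(numeric_only=True)

# Fixing vmin/vmax at -1/+1 keeps the color scale comparable across charts
ax = sns.heatmap(corr, annot=True, fmt=".2f", vmin=-1, vmax=1, cmap="coolwarm")

# Pearson r only measures linear association: y = x^2 on a symmetric
# range is fully determined by x, yet r comes out at (numerically) zero
x = pd.Series(np.linspace(-3, 3, 101))
r_curve = x.corr(x ** 2)
```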

🧪 Lab 2 — Relationship Visualizations

Part A: Customer success metrics scatter exploration
Part B: Support response time impact regression plot
Part C: Revenue drivers pair plot and correlation matrix
Annotate key insights directly on each chart
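
One possible way to annotate an insight on a chart is Matplotlib's Axes.annotate, sketched here on synthetic data. The column names and the label text are placeholders, not part of the lab dataset.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line in a notebook
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(5)
n = 80

# Synthetic data; column names are made up for illustration
df = pd.DataFrame({"monthly_usage": rng.normal(50, 15, n)})
df["revenue"] = 20 * df["monthly_usage"] + rng.normal(0, 150, n)

ax = sns.scatterplot(data=df, x="monthly_usage", y="revenue")

# Point the reader at the single highest-revenue observation
top = df.loc[df["revenue"].idxmax()]
ax.annotate(
    "Highest-revenue account",
    xy=(top["monthly_usage"], top["revenue"]),  # the point being described
    xytext=(0.05, 0.92),                        # label position on the axes
    textcoords="axes fraction",
    arrowprops=dict(arrowstyle="->"),
)
```

Placing the label in axes-fraction coordinates keeps it in the same corner even if the data range changes.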