Stats 1 Week 4: Dispersion & Correlation

0. Prerequisites

NOTE

What you need to know:

  • Square Roots: $\sqrt{x}$.
  • Graphing: Plotting points.
  • Linear Equation: (Slope and Intercept).

Quick Refresher

  • Dispersion: Spread of data. (Is it clustered or scattered?).
  • Correlation ($r$): Strength and direction of the linear relationship between two variables.
  • Variance ($\sigma^2$): Average squared distance from the mean.
  • Standard Deviation ($\sigma$): Square root of the variance.

1. Core Concepts

1.1 Measures of Dispersion

  1. Range: Max - Min. (Very sensitive to outliers).
  2. Inter-Quartile Range (IQR): $Q_3 - Q_1$. (Robust against outliers).
    • Outlier Rule: Any value outside $[Q_1 - 1.5 \times IQR,\ Q_3 + 1.5 \times IQR]$.
  3. Standard Deviation (SD):
    • Population SD ($\sigma$): Divide by $N$.
    • Sample SD ($s$): Divide by $n - 1$ (Bessel’s Correction).
    • Why $n - 1$? To make the sample variance an “unbiased estimator” (better guess) of the population variance.
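
To see the difference between dividing by $N$ and by $n - 1$, here is a minimal sketch using Python's standard `statistics` module. The dataset is hypothetical and chosen only so the arithmetic is easy to follow.

```python
import statistics

# Hypothetical sample data, for illustration only (mean is 5).
data = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: Max - Min.
data_range = max(data) - min(data)

# Population SD: squared deviations are divided by N.
pop_sd = statistics.pstdev(data)

# Sample SD: squared deviations are divided by n - 1 (Bessel's correction),
# so it comes out slightly larger than the population SD.
sample_sd = statistics.stdev(data)

print(data_range, pop_sd, sample_sd)
```

The gap between the two SD values shrinks as the dataset grows, because dividing by $n - 1$ and by $N$ becomes nearly the same thing.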

1.2 Correlation ($r$)

  • Range: $-1 \le r \le 1$.
  • Interpretation:
    • $r = 1$: Perfect Positive (Line going up).
    • $r = -1$: Perfect Negative (Line going down).
    • $r = 0$: No Linear relationship (Could still be curved!).
  • Effect of Scale:
    • Adding constant ($x + c$): No change in $r$.
    • Multiplying by a positive constant ($ax$, $a > 0$): No change in $r$.
    • Multiplying one variable by a negative constant ($ax$, $a < 0$): Sign flips (becomes $-r$).
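
The scale rules above can be checked numerically. This is a small sketch, not a reference implementation: `pearson_r` is a hand-written helper (a hypothetical name) and the data is made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (denominator of r).
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

# Hypothetical data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

r = pearson_r(x, y)
r_shift = pearson_r([v + 10 for v in x], y)   # add a constant to x
r_scale = pearson_r([3 * v for v in x], y)    # multiply x by a positive constant
r_flip = pearson_r([-3 * v for v in x], y)    # multiply x by a negative constant

print(r, r_shift, r_scale, r_flip)
```

Up to floating-point rounding, the first three values agree and the last has the opposite sign.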

2. Pattern Analysis & Goated Solutions

Pattern 1: Effect of Operations on Variance/SD

Context: “Variance of data is 25. If we multiply every number by 4 and add 7, what is new Variance?”

TIP

Mental Algorithm:

  1. Addition (+7): IGNORE. Shifting data doesn’t change spread.
  2. Multiplication ($\times k$):
    • New SD = Old SD $\times |k|$.
    • New Variance = Old Variance $\times k^2$.

Example (Detailed Solution)

Problem: Var = 25. Op: $y = 4x + 7$. Solution:

  1. Ignore +7.
  2. Scale Factor: 4.
  3. New Var: $25 \times 4^2 = 25 \times 16 = 400$. Answer: 400.
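
As a quick sanity check of this result, the sketch below applies the same operation (multiply by 4, add 7) to a hypothetical two-point dataset whose population variance happens to be exactly 25. The dataset is an assumption for illustration; the original problem does not give the raw data.

```python
import statistics

# Hypothetical dataset with population variance 25:
# mean is 5, both deviations are 5, so the average squared deviation is 25.
data = [0, 10]

# Apply the operation from the problem: multiply by 4, then add 7.
transformed = [4 * x + 7 for x in data]

old_var = statistics.pvariance(data)
new_var = statistics.pvariance(transformed)

print(old_var, new_var)
```

The shift by 7 moves the mean but leaves the deviations alone, while the factor 4 scales every deviation by 4 and every squared deviation by 16.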

Pattern 2: Finding Outliers

Context: “Data: 8, 17, 15, 19, 21, 25, 23, 35. Find outliers.”

TIP

Mental Algorithm:

  1. Sort: 8, 15, 17, 19, 21, 23, 25, 35.
  2. Find Q1, Q3:
    • $n = 8$. Q1 is avg of 2nd/3rd (15, 17) → 16.
    • Q3 is avg of 6th/7th (23, 25) → 24.
  3. Calculate IQR: $24 - 16 = 8$.
  4. Calculate Fences:
    • Lower: $16 - 1.5 \times 8 = 4$.
    • Upper: $24 + 1.5 \times 8 = 36$.
  5. Check: Any value $< 4$ or $> 36$?
    • Min is 8 (Safe). Max is 35 (Safe).
    • Result: 0 Outliers.
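
The steps above can be sketched in Python. The helper name `iqr_outliers` is hypothetical, and it assumes the same quartile convention as the worked steps (Q1 and Q3 are the medians of the lower and upper halves of the sorted data). Other conventions, including the default of `statistics.quantiles`, can give slightly different quartiles and fences.

```python
from statistics import median

def iqr_outliers(values):
    """Return (q1, q3, lower_fence, upper_fence, outliers).

    Quartiles are the medians of the lower and upper halves of the sorted
    data; the middle value is left out of both halves when the count is odd.
    Assumes at least two values.
    """
    data = sorted(values)
    half = len(data) // 2
    q1 = median(data[:half])     # median of the lower half
    q3 = median(data[-half:])    # median of the upper half
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower or v > upper]
    return q1, q3, lower, upper, outliers

result = iqr_outliers([8, 17, 15, 19, 21, 25, 23, 35])
print(result)
```

For the dataset in this pattern the returned outlier list is empty, matching the result above.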

Pattern 3: Correlation Changes

Context: “$r_{xy} = 0.45$. New variables: $x' = x + 0.1$, $y' = 2y$. Find $r_{x'y'}$.”

TIP

Mental Algorithm:

  1. Check Signs: Are multipliers positive or negative?
    • $x$ mult by 1 (Pos). $y$ mult by 2 (Pos).
  2. Result: Signs match → $r$ stays same.
    • If exactly one was negative → $r$ flips sign.

Example (Detailed Solution)

Problem: $r_{xy} = 0.45$. $x' = x + 0.1$, $y' = 2y$. Solution:

  1. Shift (+0.1): Ignore.
  2. Scale (2): Positive.
  3. Result: $r$ remains 0.45. Answer: 0.45.
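
The sign check in this pattern can be written as a tiny helper. This is only a sketch of the rule: `transformed_r` and its parameter names are hypothetical, with `a` and `c` standing for the multipliers applied to the two variables.

```python
def transformed_r(r, a, c):
    """Correlation of (a*x + b) and (c*y + d), given r for (x, y).

    The shifts b and d never matter, so they are not parameters.
    Only the signs of the multipliers a and c affect the result.
    """
    if a == 0 or c == 0:
        # A zero multiplier turns the variable into a constant,
        # for which correlation is undefined.
        raise ValueError("multipliers must be non-zero")
    # Same signs: r unchanged. Opposite signs: r flips.
    return r if a * c > 0 else -r

print(transformed_r(0.45, 1, 2))    # both multipliers positive
print(transformed_r(0.45, 1, -2))   # exactly one multiplier negative
```

Note that two negative multipliers cancel out, so `transformed_r(0.45, -1, -2)` returns the original value.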

3. Practice Exercises

  1. Variance: If SD is 5, what is Variance?
    • Hint: $\sigma^2 = 5^2 = 25$.
  2. Scaling: SD is 10. Multiply data by -3. New SD?
    • Hint: SD is always positive. $10 \times |-3| = 30$.
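  3. Correlation: $r_{xy}$ is known. Find the correlation after multiplying one variable by a negative constant.
    • Hint: flipped sign. The new correlation is $-r_{xy}$.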

🧠 Level Up: Advanced Practice

Question 1: Correlation under Transformation

Problem: Salespersons A and B, whose sales figures are related by an exact linear equation. Correlation coefficient ($r$)? Logic:

  1. Linear Relation: $y = ax + b$.
  2. Property: If $a > 0$, $r = 1$. If $a < 0$, $r = -1$.
  3. Here: $a > 0$ (Positive). Answer: $r = 1$. (Perfect positive correlation).
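
Why an exact linear relation forces $r = \pm 1$ can be seen from a short derivation. This is a sketch that assumes $y = ax + b$ with $a \neq 0$ and uses $\operatorname{Cov}$ for covariance, a symbol these notes do not otherwise introduce.

$$
r_{xy} = \frac{\operatorname{Cov}(x,\, ax + b)}{\sigma_x \, \sigma_{ax+b}}
       = \frac{a\,\sigma_x^2}{\sigma_x \cdot |a|\,\sigma_x}
       = \frac{a}{|a|}
       = \begin{cases} +1 & a > 0 \\ -1 & a < 0 \end{cases}
$$

The shift $b$ drops out of both the covariance and the standard deviation, which is why only the sign of the slope matters.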

Question 2: Variance Shift

Problem: Variance of dataset is 50. If we add 4 to every number, what is new variance? Logic:

  1. Shift Property: Adding a constant ($x + c$) shifts the distribution but does not change the spread.
  2. Result: Variance remains same. Answer: 50. Note: If we multiplied by 2, variance would become $50 \times 2^2 = 200$.

Question 3: Interpreting Scatter Plots

Problem: Daily screen time vs Sleep duration. Negative trend. Interpretation:

  • As screen time increases, sleep duration decreases.
  • Correlation: Negative ($r < 0$).
  • Causation?: Correlation does not imply causation, but there is an association.