Stats 1 Week 4: Dispersion & Correlation
0. Prerequisites
NOTE
What you need to know:
- Square Roots: $\sqrt{x}$.
- Graphing: Plotting points.
- Linear Equation: $y = mx + c$ (Slope and Intercept).
Quick Refresher
- Dispersion: Spread of data. (Is it clustered or scattered?).
- Correlation ($r$): Strength of linear relationship between two variables.
- Variance ($\sigma^2$): Average squared distance from Mean.
- Standard Deviation ($\sigma$): Square root of Variance.
1. Core Concepts
1.1 Measures of Dispersion
- Range: Max - Min. (Very sensitive to outliers).
- Inter-Quartile Range (IQR): $Q_3 - Q_1$. (Robust against outliers).
- Outlier Rule: Any value outside $[Q_1 - 1.5 \times \text{IQR},\ Q_3 + 1.5 \times \text{IQR}]$.
- Standard Deviation (SD):
- Population SD ($\sigma$): Divide by $N$.
- Sample SD ($s$): Divide by $n - 1$ (Bessel’s Correction).
- Why $n - 1$? Dividing by $n - 1$ makes the sample variance an “unbiased estimator” (better guess) of the population variance.
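The two divisors can be compared side by side. The following is a minimal sketch, not part of the original notes: the helper `standard_deviation` and the data values are illustrative assumptions, and the result is cross-checked against Python's standard `statistics` module.

```python
import math
import statistics


def standard_deviation(values, sample=False):
    """SD computed by hand; sample=True applies Bessel's correction."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((v - mean) ** 2 for v in values)
    divisor = n - 1 if sample else n  # n - 1 for a sample, N for a population
    return math.sqrt(squared_deviations / divisor)


data = [8, 15, 17, 19, 21, 23, 25, 35]  # illustrative values (assumption)

data_range = max(data) - min(data)                 # Range: Max - Min
population_sd = standard_deviation(data)           # divide by N
sample_sd = standard_deviation(data, sample=True)  # divide by n - 1

# The hand-rolled results agree with the standard library.
assert math.isclose(population_sd, statistics.pstdev(data))
assert math.isclose(sample_sd, statistics.stdev(data))

print(f"range = {data_range}")
print(f"population SD = {population_sd:.3f}, sample SD = {sample_sd:.3f}")
```

The sample SD comes out slightly larger than the population SD because the same sum of squared deviations is divided by a smaller number.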
1.2 Correlation ($r$)
- Range: $-1 \le r \le 1$.
- Interpretation:
- $r = 1$: Perfect Positive (Line going up).
- $r = -1$: Perfect Negative (Line going down).
- $r = 0$: No Linear relationship (Could still be curved!).
- Effect of Scale:
- Adding constant ($x + c$): No change in $r$.
- Multiplying positive constant ($ax$, $a > 0$): No change in $r$.
- Multiplying negative constant ($ax$, $a < 0$): Sign flips (becomes $-r$).
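These scale rules can be checked numerically. The sketch below is an illustration under assumptions: `pearson_r` is a hand-written helper using the usual Pearson formula, and the five data points are made up for the demonstration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation: covariance sum over the product of deviation norms."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [1, 2, 3, 4, 5]  # made-up data (assumption)
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
r_shifted = pearson_r([x + 10 for x in xs], ys)   # add a constant
r_scaled = pearson_r([3 * x for x in xs], ys)     # multiply by a positive constant
r_negated = pearson_r([-3 * x for x in xs], ys)   # multiply by a negative constant

print(r, r_shifted, r_scaled, r_negated)
```

The first three values coincide, while the last has the same magnitude with the opposite sign.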
2. Pattern Analysis & Goated Solutions
Pattern 1: Effect of Operations on Variance/SD
Context: “Variance of data is 25. If we multiply every number by 4 and add 7, what is new Variance?”
TIP
Mental Algorithm:
- Addition (+7): IGNORE. Shifting data doesn’t change spread.
- Multiplication ($\times 4$):
- New SD = Old SD $\times 4$.
- New Variance = Old Variance $\times 4^2$.
Example (Detailed Solution)
Problem: Var = 25. Op: $y = 4x + 7$. Solution:
- Ignore +7.
- Scale Factor: 4.
- New Var: $25 \times 4^2 = 25 \times 16 = 400$. Answer: 400.
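Why the shift is ignored and the multiplier gets squared can be seen in a short derivation. The symbols $a$, $b$, $\bar{x}$ and $N$ are generic notation introduced here for illustration, with $a = 4$ and $b = 7$ in this problem.

```latex
\begin{aligned}
y_i &= a x_i + b, \qquad \bar{y} = a\bar{x} + b \\
\operatorname{Var}(y)
  &= \frac{1}{N}\sum_{i=1}^{N} (y_i - \bar{y})^2
   = \frac{1}{N}\sum_{i=1}^{N} \bigl(a x_i + b - a\bar{x} - b\bigr)^2
   = a^2 \cdot \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^2
   = a^2 \operatorname{Var}(x) \\
\operatorname{SD}(y) &= |a| \operatorname{SD}(x)
\end{aligned}
```

The constant $b$ cancels inside the bracket, which is why adding 7 has no effect, and $a$ comes out squared, giving $4^2 \times 25 = 400$.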
Pattern 2: Finding Outliers
Context: “Data: 8, 17, 15, 19, 21, 25, 23, 35. Find outliers.”
TIP
Mental Algorithm:
- Sort: 8, 15, 17, 19, 21, 23, 25, 35.
- Find Q1, Q3:
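- $n = 8$. Q1 is avg of 2nd/3rd (15, 17) $\rightarrow$ 16.
- Q3 is avg of 6th/7th (23, 25) $\rightarrow$ 24.
- Calculate IQR: $24 - 16 = 8$.
- Calculate Fences:
- Lower: $Q_1 - 1.5 \times \text{IQR} = 16 - 12 = 4$.
- Upper: $Q_3 + 1.5 \times \text{IQR} = 24 + 12 = 36$.
- Check: Any value $< 4$ or $> 36$?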
- Min is 8 (Safe). Max is 35 (Safe).
- Result: 0 Outliers.
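The steps above can be sketched in code. This is one possible implementation, assuming the median-of-halves quartile convention used in the worked steps; the function name `iqr_outliers` is hypothetical, and other quartile conventions (for example, interpolation-based percentiles) can give slightly different fences.

```python
import statistics


def iqr_outliers(values):
    """Return (q1, q3, lower_fence, upper_fence, outliers) via the 1.5 * IQR rule."""
    data = sorted(values)
    half = len(data) // 2
    q1 = statistics.median(data[:half])    # median of the lower half
    q3 = statistics.median(data[-half:])   # median of the upper half
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return q1, q3, lower_fence, upper_fence, outliers


q1, q3, lower_fence, upper_fence, outliers = iqr_outliers(
    [8, 17, 15, 19, 21, 25, 23, 35]
)
print(q1, q3, lower_fence, upper_fence, outliers)
```

For an odd number of values, this sketch leaves the middle value out of both halves, which is one common classroom convention.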
Pattern 3: Correlation Changes
Context: “$r_{xy} = 0.45$. New variables: $u = x$, $v = 2y + 0.1$. Find $r_{uv}$.”
TIP
Mental Algorithm:
- Check Signs: Are multipliers positive or negative?
- $x$ mult by 1 (Pos). $y$ mult by 2 (Pos).
- Result: Signs match $\rightarrow$ $r$ stays same.
- If one was negative $\rightarrow$ $r$ flips sign.
Example (Detailed Solution)
Problem: $r_{xy} = 0.45$. $u = x$, $v = 2y + 0.1$. Solution:
- Shift (+0.1): Ignore.
- Scale (2): Positive.
- Result: $r$ remains 0.45. Answer: 0.45.
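The sign rule follows from how covariance and SD respond to a linear change of variables. A short derivation, where the symbols $u$, $v$, $a$, $b$, $c$, $d$ are generic notation assumed here for illustration:

```latex
\begin{aligned}
u &= ax + b, \qquad v = cy + d \qquad (a, c \neq 0) \\
\operatorname{Cov}(u, v) &= ac \operatorname{Cov}(x, y), \qquad
\operatorname{SD}(u) = |a| \operatorname{SD}(x), \qquad
\operatorname{SD}(v) = |c| \operatorname{SD}(y) \\
r_{uv} &= \frac{ac \operatorname{Cov}(x, y)}{|a|\,|c| \operatorname{SD}(x) \operatorname{SD}(y)}
        = \operatorname{sign}(ac)\, r_{xy}
\end{aligned}
```

With the multipliers 1 and 2 from the example, $ac = 2 > 0$, so the correlation stays at 0.45; the shift of 0.1 never enters the formula.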
3. Practice Exercises
- Variance: If SD is 5, what is Variance?
- Hint: $\text{Variance} = \text{SD}^2$.
- Scaling: SD is 10. Multiply data by -3. New SD?
- Hint: SD is always positive. $10 \times |-3| = 30$.
- Correlation: . Find .
- Hint: flipped sign. .
🧠 Level Up: Advanced Practice
Question 1: Correlation under Transformation
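Problem: Salespersons A and B, with B's sales an exact linear function of A's sales. Correlation coefficient ($r$)? Logic:
- Linear Relation: $B = aA + b$.
- Property: If $a > 0$, $r = 1$. If $a < 0$, $r = -1$.
- Here: $a > 0$ (Positive). Answer: $r = 1$. (Perfect positive correlation).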
Question 2: Variance Shift
Problem: Variance of dataset is 50. If we add 4 to every number, what is new variance? Logic:
- Shift Property: Adding a constant ($x + c$) shifts the distribution but does not change the spread.
- Result: Variance remains same. Answer: 50. Note: If we multiplied by 2, variance would become $50 \times 2^2 = 200$.
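A quick numeric check of both claims, as a sketch: the four-value dataset is a made-up example chosen so that its population variance is exactly 50, and `statistics.pvariance` is the standard-library population variance.

```python
import statistics

data = [0, 10, 10, 20]  # hypothetical dataset with population variance 50

original_var = statistics.pvariance(data)
shifted_var = statistics.pvariance([x + 4 for x in data])  # add 4 to every number
scaled_var = statistics.pvariance([2 * x for x in data])   # multiply every number by 2

print(original_var, shifted_var, scaled_var)
```

The shifted data keeps the original variance, while doubling every value multiplies the variance by four.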
Question 3: Interpreting Scatter Plots
Problem: Daily screen time vs Sleep duration. Negative trend. Interpretation:
- As screen time increases, sleep duration decreases.
- Correlation: Negative ($r < 0$).
- Causation?: Correlation does not imply causation, but there is an association.