Consolidated Question Patterns & Abstractions: Statistics I (Weeks 1-4)
This document synthesizes the core problem types and mental algorithms for the first four weeks of Statistics I. Use this to rapidly identify what a question is asking and which tools you need to solve it.
Table of Contents
- Week 1: Introduction to Statistics & Data
- Week 2: Describing Categorical Data
- Week 3: Describing Numerical Data
- Week 4: Association Between Two Variables
Week 1: Introduction to Statistics & Data
- Core Idea: Learning the fundamental vocabulary to classify and understand the nature of data.
| Pattern # | Pattern Name | Frequency | Difficulty | Core Skill & Abstraction |
|---|---|---|---|---|
| 1.1 | Population vs. Sample Identification | High | Easy | Abstract: Is it the entire group of interest (Population) or the subset you have data for (Sample)? Keywords: "all", "every" vs. "selected", "a group of". |
| 1.2 | Inferential vs. Descriptive Logic | High | Easy | Abstract: Is the statement just describing the sample, or is it inferring a conclusion about the whole population? Keywords: "the sample had..." vs. "we conclude that all...". |
| 1.3 | Classifying Variable Types | High | Medium | Abstract: Apply the tests: 1. Math? (Can I average it? Yes=Numerical, No=Categorical). 2. Gaps? (For numerical variables only: can it take decimal values between two observations? Yes=Continuous, No=Discrete). |
| 1.4 | Identifying the Scale of Measurement | High | Medium | Abstract: Use the NOIR framework: Nominal (names), Ordinal (order), Interval (equal intervals, no true zero), Ratio (true zero). |
Week 1 Mental Algorithm: The Classification Checklist
When asked to classify a variable (e.g., "Education Level"):
- Triage: It's a classification problem.
- Abstract & Act:
  - Math Test: Can I average "High School" and "PhD"? No. It's Categorical.
  - Order Test: Is there a natural ranking? Yes, PhD > High School. It's Ordinal.
- Final Answer: Categorical, Ordinal Scale.
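
The checklist above can be sketched as a small decision function. This is only an illustration: the function name and its yes/no parameters are hypothetical, and answering each test for a real variable is still a judgment call.

```python
def classify_variable(can_average: bool, has_order: bool = False,
                      can_be_decimal: bool = False, true_zero: bool = False) -> str:
    """Apply the Math, Order, Gaps, and true-zero tests in sequence."""
    if not can_average:
        # Math Test failed: categorical, so only the Order Test matters.
        scale = "Ordinal" if has_order else "Nominal"
        return f"Categorical, {scale} Scale"
    # Math Test passed: numerical, so apply the Gaps test and the true-zero test.
    kind = "Continuous" if can_be_decimal else "Discrete"
    scale = "Ratio" if true_zero else "Interval"
    return f"Numerical ({kind}), {scale} Scale"


# "Education Level": cannot be averaged, but has a natural ranking.
print(classify_variable(can_average=False, has_order=True))
```

For "Education Level" this prints `Categorical, Ordinal Scale`, matching the final answer in the checklist.
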
Week 2: Describing Categorical Data
- Core Idea: Summarizing and visualizing data that falls into non-numerical groups using counts, proportions, and charts.
| Pattern # | Pattern Name | Frequency | Difficulty | Core Skill & Abstraction |
|---|---|---|---|---|
| 2.1 | Calculating Frequencies & Proportions | High | Easy | Abstract: Use the core relationship: Part = Whole × Percentage. Find the piece you're missing. |
| 2.2 | Identifying Measures of Central Tendency | High | Easy | Abstract: For nominal data, only the Mode (most frequent) is defined. Mean and Median require order/numbers and are not applicable. |
| 2.3 | Choosing the Appropriate Graph | High | Medium | Abstract: Bar Chart to compare counts. Pie Chart to show percentages of a whole. Pareto Chart to identify the most important categories (a sorted bar chart). |
| 2.4 | Interpreting Graphical Representations | Medium | Easy | Abstract: Read the values directly from the chart. For pie charts, convert percentages to counts if a total is given. For bar charts, read the axis labels carefully. |
Week 2 Mental Algorithm: The Categorical Toolkit
When you see categorical data (e.g., a list of academies and the number of players in each):
- Triage: It's a categorical description problem.
- Abstract & Act:
  - "What is the most common?" Find the Mode. Look for the highest bar in a bar chart or the biggest slice in a pie chart.
  - "What share/proportion...?" Calculate Relative Frequency: (Frequency of Category) / (Total).
  - "How to best visualize...?" If comparing counts, use a Bar Chart. If showing parts of a whole, a Pie Chart is an option. If prioritizing, a Pareto Chart.
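
A minimal sketch of the toolkit above, using Python's standard `collections.Counter`. The academy names and player counts are made up for illustration.

```python
from collections import Counter

# Hypothetical data: number of players in each academy.
counts = Counter({"Academy A": 12, "Academy B": 30, "Academy C": 18})

total = sum(counts.values())  # the "whole": 60 players

# Mode: the category with the highest frequency.
mode_category, mode_count = counts.most_common(1)[0]

# Relative frequency: (Frequency of Category) / (Total).
relative_freq = {name: n / total for name, n in counts.items()}

# Pareto ordering: categories sorted from most to least frequent.
pareto_order = [name for name, _ in counts.most_common()]

print(mode_category, relative_freq, pareto_order)
```

Here the mode is Academy B (30 of 60 players, a relative frequency of 0.5), and the Pareto order is B, C, A.
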
Week 3: Describing Numerical Data
- Core Idea: Calculating statistics that measure the "center" (central tendency) and "spread" (dispersion) of numerical datasets.
| Pattern # | Pattern Name | Frequency | Difficulty | Core Skill & Abstraction |
|---|---|---|---|---|
| 3.1 | Calculating Center & Spread | High | Medium | Abstract: Calculate Mean (average), Median (sorted middle), Mode (most frequent), Sample Variance ($s^2$), and Sample Standard Deviation ($s$). |
| 3.2 | Correcting Mean and Variance | High | Medium | Abstract: Work backward from the wrong statistic to find the wrong Sum or Sum of Squares. Correct the sum (Sum_correct = Sum_wrong - wrong_val + correct_val), then recalculate. |
| 3.3 | Calculating Percentiles and Quartiles | High | Medium | Abstract: 1. Sort the data. 2. Find the median (Q2). 3. Find the median of the lower half (Q1). 4. Find the median of the upper half (Q3). 5. Calculate IQR = Q3 - Q1. |
| 3.4 | Identifying Outliers | Medium | Medium | Abstract: Calculate the fences: Lower = Q1 - 1.5 × IQR and Upper = Q3 + 1.5 × IQR. Any data point outside this range is an outlier. |
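
Pattern 3.2 is easiest to see with numbers. The sketch below assumes the reported variance is the sample variance (divisor n - 1); the function name and the example values are hypothetical.

```python
def correct_mean_and_variance(n, wrong_mean, wrong_var, wrong_val, correct_val):
    """Recover the sums from the wrong statistics, fix them, and recompute."""
    # Work backward: mean = Sum / n, and s^2 = (SumSq - n * mean^2) / (n - 1).
    sum_wrong = n * wrong_mean
    sum_sq_wrong = (n - 1) * wrong_var + n * wrong_mean ** 2

    # Swap the wrong value for the correct one in both sums.
    sum_correct = sum_wrong - wrong_val + correct_val
    sum_sq_correct = sum_sq_wrong - wrong_val ** 2 + correct_val ** 2

    # Recalculate the statistics from the corrected sums.
    mean_correct = sum_correct / n
    var_correct = (sum_sq_correct - n * mean_correct ** 2) / (n - 1)
    return mean_correct, var_correct


# Hypothetical case: 5 observations, mean 10, sample variance 2.5,
# where the value 17 was wrongly recorded as 12.
print(correct_mean_and_variance(5, 10, 2.5, wrong_val=12, correct_val=17))
```

With these numbers the sum goes from 50 to 55 and the sum of squares from 510 to 655, giving a corrected mean of 11 and a corrected sample variance of 12.5.
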
Week 3 Mental Algorithm: The Numerical Description Flow
When given a list of numbers:
- Triage: It's a numerical description problem.
- Abstract & Act:
  - First step is always to SORT the data.
  - "Find the center": Calculate Mean and Median. If they are very different, it hints at skewness or outliers.
  - "Find the spread": Calculate IQR (resistant to outliers) and Standard Deviation (sensitive to outliers).
  - "Check for outliers": Use the IQR and the 1.5 × IQR fence rule.
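
The flow above can be sketched with Python's standard `statistics` module. Quartiles here use the median-of-halves method from pattern 3.3, with the middle value left out of both halves when n is odd. That is one common convention and an assumption here, since other textbooks and software use different quartile rules. The function name and data are illustrative.

```python
from statistics import mean, median, stdev


def describe(data):
    """Sort, then compute center, spread, fences, and outliers."""
    xs = sorted(data)                         # first step: SORT
    n = len(xs)
    half = n // 2
    lower, upper = xs[:half], xs[n - half:]   # middle value excluded when n is odd
    q1, q3 = median(lower), median(upper)
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": mean(xs), "median": median(xs),
        "stdev": stdev(xs),                   # sample standard deviation (n - 1)
        "q1": q1, "q3": q3, "iqr": iqr,
        "fences": (lo_fence, hi_fence),
        "outliers": [x for x in xs if x < lo_fence or x > hi_fence],
    }


summary = describe([2, 4, 4, 5, 6, 7, 8, 30])
print(summary)
```

For this made-up dataset the mean (8.25) sits well above the median (5.5), which hints at right skew. The fences are -1.25 and 12.75, so 30 is flagged as an outlier.
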
Week 4: Association Between Two Variables
- Core Idea: Moving from describing one variable at a time to describing the relationship between two variables.
| Pattern # | Pattern Name | Frequency | Difficulty | Core Skill & Abstraction |
|---|---|---|---|---|
| 4.1 | Calculating Covariance and Correlation | High | Medium | Abstract: A procedural calculation, best done with a table. Find means, then deviations from the mean for both variables, then products of deviations. Sum these products and divide by n - 1 to find the sample covariance, then divide by the product of the two standard deviations to find correlation r. |
| 4.2 | Interpreting the Correlation Coefficient r | High | Easy | Abstract: Look at the sign for direction (positive/negative) and the magnitude for strength (close to 1 or -1 is strong; close to 0 is weak). |
| 4.3 | Analyzing Contingency Tables | High | Medium | Abstract: Differentiate between Marginal (uses grand totals in the denominator) and Conditional (uses a row or column total in the denominator) proportions. Read the question carefully to find the correct "whole". |
| 4.4 | Conceptual Understanding of Correlation | Medium | Easy | Abstract: Remember the key rules: Correlation ≠ Causation. Correlation only measures linear relationships. A perfect linear relationship means $r = 1$ or $r = -1$. |
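
The marginal versus conditional distinction in pattern 4.3 comes down to which total goes in the denominator. A small sketch with a made-up 2×2 table (the sections and pass/fail counts are hypothetical):

```python
# Hypothetical contingency table: rows = class section, columns = exam result.
table = {
    "Section A": {"Pass": 30, "Fail": 20},
    "Section B": {"Pass": 40, "Fail": 10},
}

grand_total = sum(sum(row.values()) for row in table.values())  # 100 students

# Marginal proportion: share of ALL students who passed.
# Denominator = grand total.
marginal_pass = sum(row["Pass"] for row in table.values()) / grand_total

# Conditional proportion: share of Section B students who passed.
# Denominator = the Section B row total.
section_b_total = sum(table["Section B"].values())
pass_given_b = table["Section B"]["Pass"] / section_b_total

print(marginal_pass, pass_given_b)
```

The same cell count (40) appears in both questions, but the marginal pass rate is 70/100 = 0.7 while the pass rate given Section B is 40/50 = 0.8.
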
Week 4 Mental Algorithm: The Relationship Analysis Flow
When given a dataset with two variables, X and Y:
- Triage: It's a relationship/association problem.
- Abstract & Act:
  - Are X and Y both categorical? Build a Contingency Table. Analyze it by calculating conditional proportions.
  - Are X and Y both numerical? Your goal is to find the Correlation Coefficient r.
    - Visualize with a Scatterplot in your mind. Does it seem to go up or down?
    - Calculate Covariance. The sign will confirm your visual check.
    - Calculate Standard Deviations for both X and Y.
    - Calculate $r = \dfrac{\text{Cov}(X, Y)}{s_X \, s_Y}$.
    - Interpret r: State the direction (positive/negative) and strength (weak/moderate/strong).
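
The numerical branch of this flow can be sketched in plain Python, mirroring the table method from pattern 4.1. It assumes the sample versions of covariance and standard deviation (divisor n - 1); the function name and the data are illustrative.

```python
from math import sqrt


def covariance_and_correlation(xs, ys):
    """Return (sample covariance, correlation r) for paired data."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n

    # Deviations from the mean for each variable.
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]

    # Sum of products of deviations, divided by n - 1.
    cov = sum(dx * dy for dx, dy in zip(dev_x, dev_y)) / (n - 1)

    # Sample standard deviations of X and Y.
    s_x = sqrt(sum(dx ** 2 for dx in dev_x) / (n - 1))
    s_y = sqrt(sum(dy ** 2 for dy in dev_y) / (n - 1))

    # Standardize the covariance to get r.
    return cov, cov / (s_x * s_y)


cov, r = covariance_and_correlation([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(cov, r)
```

For this made-up data the products of deviations sum to 6, so the covariance is 6 / 4 = 1.5 and r is roughly 0.77, a fairly strong positive linear association.
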