Box-and-Whisker Plots
Display data using the five-number summary, and identify quartiles and outliers.
What is a Box-and-Whisker Plot?
A box-and-whisker plot (or box plot) shows how data is distributed using five key numbers.
Parts:
- Minimum: smallest value
- Q1 (First Quartile): 25% mark
- Median (Q2): middle value (50% mark)
- Q3 (Third Quartile): 75% mark
- Maximum: largest value
Visual:
|----[=====|=====]----|
min Q1 median Q3 max
←whisker→ ←box→ ←whisker→
The Five-Number Summary
Steps to find it:
- Order data from least to greatest
- Find minimum and maximum
- Find median (middle)
- Find Q1 (median of lower half)
- Find Q3 (median of upper half)
Example: Find Five-Number Summary
Data: 3, 7, 8, 5, 12, 14, 21, 13, 18
Step 1: Order the data
- 3, 5, 7, 8, 12, 13, 14, 18, 21
Step 2: Find minimum and maximum
- Minimum: 3
- Maximum: 21
Step 3: Find median (Q2)
- 9 values → middle is 5th value
- Median: 12
Step 4: Find Q1 (median of lower half)
- Lower half (the four values below the median): 3, 5, 7, 8
- Median of 4 values: (5 + 7) ÷ 2 = 6
- Q1: 6
Step 5: Find Q3 (median of upper half)
- Upper half (the four values above the median): 13, 14, 18, 21
- Median: (14 + 18) ÷ 2 = 16
- Q3: 16
Five-Number Summary: 3, 6, 12, 16, 21
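The five steps above can be sketched in Python. This is a minimal sketch that assumes the same "median of each half" convention used in the example, where the median itself is left out of both halves when the number of values is odd. The function name `five_number_summary` is made up for illustration. Spreadsheets and statistics libraries may use other quartile conventions and can give slightly different values for Q1 and Q3.

```python
from statistics import median


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    if len(data) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(data)        # Step 1: order the data
    half = len(ordered) // 2      # size of each half (middle value excluded if odd)
    lower = ordered[:half]        # lower half
    upper = ordered[-half:]       # upper half
    return (
        ordered[0],               # Step 2: minimum
        median(lower),            # Step 4: Q1
        median(ordered),          # Step 3: median (Q2)
        median(upper),            # Step 5: Q3
        ordered[-1],              # Step 2: maximum
    )


print(five_number_summary([3, 7, 8, 5, 12, 14, 21, 13, 18]))
# prints (3, 6.0, 12, 16.0, 21)
```

Q1 and Q3 print as 6.0 and 16.0 because each is the average of two middle values.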
Drawing a Box-and-Whisker Plot
Example: Draw the plot for 3, 6, 12, 16, 21
Step 1: Draw a number line that includes the range
Step 2: Mark all five values above the line
Step 3: Draw a box from Q1 to Q3
Step 4: Draw a line inside the box at the median
Step 5: Draw whiskers from box to min and max
|----[==|==]----|
3 6 12 16 21
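As a rough illustration of the drawing steps, the sketch below renders a five-number summary as a one-line text plot drawn to scale. The function name `ascii_box_plot` and the `width` parameter are hypothetical, chosen only for this example.

```python
def ascii_box_plot(minimum, q1, median, q3, maximum, width=37):
    """Render a five-number summary as a one-line text box plot."""
    if not minimum <= q1 <= median <= q3 <= maximum or minimum == maximum:
        raise ValueError("values must be in order and span a nonzero range")
    span = maximum - minimum

    def col(value):
        # Map a data value to a column index between 0 and width - 1.
        return round((value - minimum) * (width - 1) / span)

    lo, a, m, b, hi = (col(v) for v in (minimum, q1, median, q3, maximum))
    chars = [" "] * width
    for i in range(lo, a):
        chars[i] = "-"            # left whisker, from min to Q1
    for i in range(a, b + 1):
        chars[i] = "="            # box, from Q1 to Q3
    for i in range(b + 1, hi + 1):
        chars[i] = "-"            # right whisker, from Q3 to max
    chars[lo] = "|"               # minimum
    chars[hi] = "|"               # maximum
    chars[a] = "["                # Q1
    chars[b] = "]"                # Q3
    chars[m] = "|"                # median line inside the box
    return "".join(chars)


print(ascii_box_plot(3, 6, 12, 16, 21))
# prints |-----[===========|=======]---------|
```

With a width of 37, each unit on the number line takes two columns, so the longer right side of the plot reflects the real distances between the values.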
Understanding Quartiles
Quartiles divide data into four equal parts:
- Q1: about 25% of the data is below this value
- Median (Q2): about 50% of the data is below this value
- Q3: about 75% of the data is below this value
Interquartile Range (IQR): Q3 − Q1
- Shows the spread of the middle 50% of data
Example: Find IQR
Five-number summary: 3, 6, 12, 16, 21
IQR = Q3 − Q1 = 16 − 6 = 10
The middle 50% of data spans 10 units.
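To see what "middle 50%" means with real numbers, this small sketch computes the IQR for the nine-value data set from the earlier example and counts how many values fall inside the box (between Q1 and Q3). With so few values the share comes out close to one half but not exactly; the quartile percentages are approximate for small data sets.

```python
data = [3, 5, 7, 8, 12, 13, 14, 18, 21]   # ordered data from the earlier example
q1, q3 = 6, 16                             # quartiles found in that example

iqr = q3 - q1
inside_box = [x for x in data if q1 <= x <= q3]   # values between Q1 and Q3
share = len(inside_box) / len(data)

print(iqr)               # prints 10
print(inside_box)        # prints [7, 8, 12, 13, 14]
print(round(share, 2))   # prints 0.56
```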
Reading Box-and-Whisker Plots
Example: Analyze this plot
|---[====|====]--------|
20 30 40 50 80
Five-number summary: 20, 30, 40, 50, 80
Observations:
- Range: 80 − 20 = 60
- IQR: 50 − 30 = 20
- Median: 40
- Right whisker is longer: data is more spread out on the high end
Identifying Outliers
Outlier: A value that is much higher or lower than the rest
Outlier rule: a value counts as an outlier if it is
- Below Q1 − 1.5 × IQR
- Above Q3 + 1.5 × IQR
Example: Check for Outliers
Data: 5, 6, 7, 8, 9, 10, 11, 25
Five-number summary: 5, 6.5, 8.5, 10.5, 25
IQR = 10.5 − 6.5 = 4
Lower boundary: 6.5 − 1.5(4) = 6.5 − 6 = 0.5
Upper boundary: 10.5 + 1.5(4) = 10.5 + 6 = 16.5
Check 25: 25 > 16.5 → Yes, 25 is an outlier!
Mark outliers with * and draw the right whisker to the largest non-outlier (11):
|---[==|==]---| *
5 6.5 8.5 10.5 11 25
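The outlier rule can be sketched as two short functions. The names `outlier_fences` and `split_outliers` are hypothetical. The quartiles are passed in rather than computed, so the sketch does not depend on any particular quartile convention.

```python
def outlier_fences(q1, q3):
    """Return the (lower, upper) boundaries from the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def split_outliers(data, q1, q3):
    """Separate data into (regular values, outliers) using the boundaries."""
    lower, upper = outlier_fences(q1, q3)
    regular = [x for x in data if lower <= x <= upper]
    outliers = [x for x in data if x < lower or x > upper]
    return regular, outliers


data = [5, 6, 7, 8, 9, 10, 11, 25]
print(outlier_fences(6.5, 10.5))    # prints (0.5, 16.5)

regular, outliers = split_outliers(data, 6.5, 10.5)
print(outliers)                     # prints [25]
print(max(regular))                 # prints 11, where the right whisker ends
```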
Comparing Data Sets
Box plots make it easy to compare multiple data sets!
Example: Test Scores
Class A: min=65, Q1=72, median=80, Q3=88, max=95
Class B: min=55, Q1=68, median=75, Q3=82, max=98
Class A: |---[====|====]--|
65 72 80 88 95
Class B: |----[====|===]----|
55 68 75 82 98
Observations:
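- Class A has the higher median (80 vs. 75)
- Class B has a slightly smaller IQR (82 − 68 = 14 vs. 88 − 72 = 16), so its middle 50% is a little more tightly grouped
- Class B has the wider range (98 − 55 = 43 vs. 95 − 65 = 30), so its scores are more spread out overall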
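A comparison is easier to trust when the numbers are computed rather than estimated by eye. This sketch derives the range and IQR for each class from the five-number summaries given in the example; the helper name `spread` is hypothetical.

```python
classes = {
    "Class A": (65, 72, 80, 88, 95),
    "Class B": (55, 68, 75, 82, 98),
}


def spread(summary):
    """Return the range and IQR for a (min, Q1, median, Q3, max) tuple."""
    minimum, q1, _median, q3, maximum = summary
    return {"range": maximum - minimum, "iqr": q3 - q1}


for name, summary in classes.items():
    print(name, "median:", summary[2], spread(summary))
# prints Class A median: 80 {'range': 30, 'iqr': 16}
# prints Class B median: 75 {'range': 43, 'iqr': 14}
```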
Advantages of Box Plots
Shows:
- Center (median)
- Spread (range and IQR)
- Skewness (symmetric vs. skewed)
- Outliers
Good for:
- Large data sets
- Comparing multiple groups
- Seeing distribution shape
Symmetric vs. Skewed Data
Symmetric: Median is centered in box, whiskers similar length
|----[==|==]----|
Right-skewed: Right whisker longer
|--[==|==]--------|
Left-skewed: Left whisker longer
|--------[==|==]--|
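The whisker comparison can be turned into a rough check. This sketch only compares whisker lengths and ignores where the median sits inside the box, so it is a simplification of the description above. The function name `describe_shape` and the `tolerance` value of 0.25 are arbitrary choices for illustration, not a standard rule. The left-skewed summary in the last line is made up for the demonstration.

```python
def describe_shape(summary, tolerance=0.25):
    """Classify a (min, Q1, median, Q3, max) summary by whisker length."""
    minimum, q1, _median, q3, maximum = summary
    left = q1 - minimum        # left whisker length
    right = maximum - q3       # right whisker length
    longest = max(left, right)
    # Whiskers count as "similar" when they differ by at most
    # tolerance x the longer whisker.
    if longest == 0 or abs(right - left) <= tolerance * longest:
        return "roughly symmetric"
    return "right-skewed" if right > left else "left-skewed"


print(describe_shape((65, 72, 80, 88, 95)))   # prints roughly symmetric
print(describe_shape((20, 30, 40, 50, 80)))   # prints right-skewed
print(describe_shape((20, 50, 60, 70, 80)))   # prints left-skewed
```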
Real-World Applications
Sports: Compare player statistics
- Salaries, ages, performance metrics
Education: Analyze test scores
- Compare classes, identify struggling students
Weather: Temperature distributions
- Daily highs/lows across seasons
Business: Sales data
- Identify best/worst performers, outliers
Practice
What is the five-number summary for: 2, 4, 6, 8, 10, 12, 14?
If Q1 = 20 and Q3 = 35, what is the IQR?
Data has Q1=10, Q3=20, IQR=10. What value would be an outlier?
If the right whisker is much longer than the left whisker, how is the data skewed?