11-3 Distributions of Data

In this Grade 9 lesson from California Reveal Math, Algebra 1 (Unit 11: Statistics), students learn to analyze the shapes of data distributions (symmetric, negatively skewed, and positively skewed) using histograms and box plots. Students also learn to select appropriate summary statistics, using the mean and standard deviation for symmetric data or the five-number summary for skewed data, and to identify outliers. The lesson connects graphical representations to real-world contexts such as movie ticket sales to build statistical reasoning.

Section 1

Shapes of Distributions (Symmetric vs. Skewed)

Property

The shape of a data distribution reveals its "personality." When viewing histograms or box plots, we classify the shape into three main categories:

  • Symmetric: Data is evenly spread around the center. On a box plot, the median sits near the middle of the box, and the whiskers are roughly equal in length.
  • Skewed Right (Positively Skewed): Most data clusters on the left, with a long "tail" stretching to the right. On a box plot, the median is pushed toward the left side of the box (closer to $Q_1$), and the right whisker is much longer.
  • Skewed Left (Negatively Skewed): Most data clusters on the right, with a long "tail" stretching to the left. On a box plot, the median is pushed toward the right side of the box (closer to $Q_3$), and the left whisker is much longer.

Note on Bin Width: When using technology to graph a histogram, a poorly chosen bin width can hide the true shape. Bins that are too wide can lump the peak and the tail together, so a skewed distribution may look deceptively symmetric. Bins that are too narrow can produce jagged bars and empty gaps that come from the binning, not from the data.
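The effect of bin width can be sketched with a few lines of Python. This is only an illustration: the `wait_times` data and the helper `histogram_counts` are made up for this example and are not part of the lesson.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Count the values in each bin [start + k*w, start + (k+1)*w).

    Assumes every value is at least `start`.
    """
    counts = Counter((v - start) // bin_width for v in values)
    n_bins = max(counts) + 1
    return [counts.get(k, 0) for k in range(n_bins)]

# Hypothetical right-skewed data (for example, wait times in minutes).
wait_times = [1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 8, 12]

narrow = histogram_counts(wait_times, bin_width=1)
medium = histogram_counts(wait_times, bin_width=4)
wide = histogram_counts(wait_times, bin_width=16)

print(narrow)  # [0, 2, 3, 4, 2, 1, 1, 0, 1, 0, 0, 0, 1]  jagged, with empty gaps
print(medium)  # [9, 4, 1, 1]  tall peak on the left, tail to the right
print(wide)    # [15]  a single bar, so the shape is hidden
```

With this data, a width of 4 shows the right skew clearly, while a width of 16 collapses everything into one bar and a width of 1 leaves several empty bins in the tail.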

Examples

  • Symmetric: A dot plot of daily temperatures shows values clustered evenly around 72°F. The left and right sides look like mirror images.
  • Skewed Right: A histogram of house prices shows most homes cost between $200k and $300k (a tall peak on the left), but a few $1M+ mansions create a long tail dragging to the right.
  • Skewed Left: A box plot of retirement ages has $Q_1 = 58$, Median = 65, and $Q_3 = 68$. The left side of the box (58 to 65) is much wider than the right side (65 to 68), and the left whisker stretches far out to early retirees at age 45.
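The box-plot reasoning in the retirement example can be written as a rough check that compares the two halves of the box. This is a sketch, not a rule from the lesson: the function name `box_shape` and the `tolerance` cutoff of 0.2 are assumptions chosen for illustration, and the check ignores whisker lengths.

```python
def box_shape(q1, median, q3, tolerance=0.2):
    """Guess the shape of a distribution from where the median sits in the box.

    `tolerance` is an arbitrary cutoff (an assumption for this sketch):
    the halves must differ by more than this fraction of the box width
    before the data is called skewed.
    """
    left_half = median - q1    # width of the box from Q1 to the median
    right_half = q3 - median   # width of the box from the median to Q3
    box_width = q3 - q1
    if box_width == 0:
        return "undetermined"
    difference = (right_half - left_half) / box_width
    if difference > tolerance:
        return "skewed right"
    if difference < -tolerance:
        return "skewed left"
    return "roughly symmetric"

# Retirement ages from the example: Q1 = 58, Median = 65, Q3 = 68.
print(box_shape(58, 65, 68))  # skewed left
```

The left half of the box (7 years) is wider than the right half (3 years), so the check reports a left skew, matching the example.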

Section 2

How Shape Affects the Mean and Median

Property

The shape of the distribution completely changes the relationship between the Mean and the Median:

  • Symmetric: Mean $\approx$ Median. Both sit near the center.
  • Skewed Right: Mean $>$ Median. Extreme high values pull the mean to the right.
  • Skewed Left: Mean $<$ Median. Extreme low values pull the mean to the left.

Examples

  • Skewed Right Example: In a neighborhood, most homes cost $200,000 (Median), but one mansion costs $2,000,000. This massive outlier pulls the average (Mean) up to $450,000. The Median ($200k) is a much more honest representation of the "typical" home.
  • Skewed Left Example: Most students score an 85 or 90 on a test (Median), but two students fall asleep and score a 10. These low scores pull the class average (Mean) down to a 72. The Median (85) better represents how the typical student performed.
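Both examples can be checked with Python's built-in `statistics` module. The lesson gives only the summary values, so the individual home prices and test scores below are hypothetical, chosen so that the mean and median match the numbers in the examples.

```python
from statistics import mean, median

# Hypothetical home prices in thousands of dollars: seven ordinary homes
# and one $2,000,000 mansion (values chosen to match the example).
home_prices_k = [200, 200, 200, 200, 200, 300, 300, 2000]

print(mean(home_prices_k))    # 450, pulled up by the mansion
print(median(home_prices_k))  # 200.0, the "typical" home

# Hypothetical test scores: eight students score 85 or 90, two score 10.
test_scores = [10, 10, 85, 85, 85, 85, 90, 90, 90, 90]

print(mean(test_scores))    # 72, pulled down by the two low scores
print(median(test_scores))  # 85.0, the typical student
```

In both lists the mean moves toward the extreme values while the median stays with the bulk of the data, which is why the median is the better summary for skewed data.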

Explanation

Section 3

Identifying Outliers Using the 1.5×IQR Rule

Property

Outliers are data values that lie unusually far from the rest of a dataset.
The IQR method identifies outliers by calculating boundary values from the interquartile range, $IQR = Q_3 - Q_1$:

Lower boundary: $Q_1 - 1.5 \times IQR$
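The boundary calculation can be sketched in Python as follows. The text above shows only the lower boundary; the upper boundary used here, $Q_3 + 1.5 \times IQR$, is the standard companion in the 1.5×IQR rule and is included as an assumption. The function names are hypothetical, and the quartiles are passed in directly because different tools compute quartiles in slightly different ways.

```python
def iqr_boundaries(q1, q3):
    """Return the (lower, upper) outlier boundaries for the 1.5 x IQR rule."""
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr  # standard upper boundary (assumed, not shown above)
    return lower, upper

def is_outlier(value, q1, q3):
    """True if the value falls below the lower or above the upper boundary."""
    lower, upper = iqr_boundaries(q1, q3)
    return value < lower or value > upper

# Retirement ages from Section 1: Q1 = 58 and Q3 = 68, so IQR = 10.
print(iqr_boundaries(58, 68))  # (43.0, 83.0)
print(is_outlier(45, 58, 68))  # False: 45 is above the lower boundary of 43
print(is_outlier(40, 58, 68))  # True: 40 is below the lower boundary of 43
```

Under these boundaries, the early retiree at age 45 stretches the left whisker but does not count as an outlier.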
