Lesson 5
Normal Distributions
5.1: Body Temperature (5 minutes)
Warm-up
The purpose of this warm-up is to introduce the term normal distribution, which will be useful when students interpret a normal curve in a later activity. By articulating things they notice about the histograms, students have an opportunity to attend to precision in the language they use to describe what they see (MP6). They might first propose less formal or imprecise language, and then restate their observation with more precise language in order to communicate more clearly.
Launch
Arrange students in groups of two. Tell students that they will write down what they think the elements of a definition of a normal distribution are. Give students 1 minute of quiet think time, 1 minute to record their thinking, and then 1 minute to discuss the elements they recorded with their partner, followed by a whole-class discussion.
Student Facing
Each histogram represents a group of 500 healthy people who had their temperature taken. Three histograms represent examples of data that approximate a normal distribution and three histograms represent non-examples, or data that do not approximate a normal distribution. What do you think are the elements of a definition of a normal distribution?
Examples:
Non-examples:
Activity Synthesis
Ask students to share what they think the elements of a definition of a normal distribution are. Record and display their responses for all to see. If possible, record the relevant reasoning on or near the images.
After all responses have been recorded without commentary or editing, ask students to discuss: “What are some of the properties of a normal distribution?” (It is a distribution that is bell-shaped, symmetric, and is never greater than 1.)
Display two of the example histograms with the normal curves that model the same data.
To clarify the last point, tell students that the total area between the \(x\)-axis and the curve should be 1. That is, if the entire area under the bell-shaped curve is shaded (rather than just the bars from the histogram that are sometimes under the curve and sometimes outside of it), the shaded region would have an area of 1.
Tell students, “A normal distribution is a specific distribution in statistics that is symmetric and bell-shaped, has an area of 1 between the \(x\)-axis and the curve, and has the \(x\)-axis as a horizontal asymptote.” Tell students that “distributions that are approximately bell-shaped and symmetric can often be modeled by a normal distribution in a similar way to how scatter plot data could be modeled with a line or other curve. A normal distribution is determined entirely by the mean and standard deviation.”
Refer to the displayed images and point out that the curve that represents the normal distribution with the same mean and standard deviation as the data from the histogram is shown on top of the relative frequency histogram for the actual data.
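For teachers who want to check the area property numerically, here is a minimal Python sketch. It assumes the standard formula for the normal curve and estimates the area with a trapezoid rule; the mean and standard deviation are hypothetical values, not taken from the lesson's histograms.

```python
import math

def normal_pdf(x, mean, sd):
    """Height of the normal curve with the given mean and standard deviation at x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

def area_under_curve(mean, sd, width_in_sds=8, steps=10_000):
    """Trapezoid-rule estimate of the area from mean - 8 sd to mean + 8 sd."""
    left = mean - width_in_sds * sd
    right = mean + width_in_sds * sd
    dx = (right - left) / steps
    heights = [normal_pdf(left + i * dx, mean, sd) for i in range(steps + 1)]
    return dx * (sum(heights) - 0.5 * (heights[0] + heights[-1]))

# Hypothetical parameters chosen for illustration only.
area = area_under_curve(mean=98.2, sd=0.7)
print(round(area, 6))
```

Changing `mean` or `sd` moves or reshapes the curve, but the estimated area should stay very close to 1.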
5.2: Playing a Piano (15 minutes)
Activity
The mathematical purpose of this activity is for students to collect, represent, and interpret data using a histogram. For most groups of high school students, the distribution of handspans should be approximately normal. If the distribution for your group is not approximately normal, display the distribution from the student response provided in these materials as the data from another class and ask students to comment on that distribution.
Making statistical technology available gives students an opportunity to choose appropriate tools strategically (MP5).
Launch
Tell students that they will measure the distance from their thumb to smallest finger when their hand is stretched, rounded to the nearest tenth of a centimeter, and then tell you the measurement.
Design Principle(s): Maximize meta-awareness; Support sense-making
Supports accessibility for: Language; Conceptual processing
Student Facing
On many piano keyboards, the distance from one white key to the next is 2.39 centimeters. How many of your classmates could reach two notes that are 9 keys apart (21.5 cm) on a keyboard using only one hand?
- Stretch your fingers apart as wide as you can and measure the farthest distance from your thumb to smallest finger. Round your measurement to the nearest tenth of a centimeter.
- Your teacher will collect the measurements from the class. Draw a dot plot or histogram from the class data.
- Describe the distribution you drew using terms such as: symmetric, approximately symmetric, skewed left, skewed right, approximately uniform, uniform, bell-shaped, or bimodal. Estimate the center of your distribution.
- How would you use your distribution to determine how many people in the class can reach the two notes 9 keys apart?
Activity Synthesis
The purpose of this discussion is for students to recognize an approximately symmetric and bell-shaped histogram in context. Display a histogram representing the class data for all to see. If possible, display the distributions from other classes and combine the measurements from all the classes to see a distribution with more data. Select students to share their descriptions of the distributions they drew. Ask each student selected to explain their reasoning for that shape.
Here are some questions for discussion:
- “What is the shape of the distribution?” (Approximately symmetric and bell-shaped.)
- “What is the center of the distribution? Explain your reasoning.” (It is about 21 cm because the two middle bars appear in the center of the data and the middle of them is approximately 21 cm.)
- “How did you estimate how many students could reach the keys?” (I found the total number of students with a measurement of 21.5 cm or more.)
- “Do you think having a measurement of 21.5 cm or above would allow you to play the notes on the keyboard? Explain your reasoning.” (I do not think that everyone with a measurement of 21.5 cm would be able to play the notes because they might not have the same stretch while they are actually playing.)
- “Does the data look like it could be modeled with a normal distribution?” (Yes. I think a normal distribution would model it well because the data is bell-shaped and approximately symmetric.)
Sketch the normal curve on the classroom data for all to see (either by hand or with technology). Ask, “Does the normal curve model the data well?” (Yes, it models the data well, but not exactly.) Note: for this discussion, the sample data provided should be sufficient. If you choose to include class data, you can do so by changing entries in Column A, but do not delete any.
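If the class data are available in a list, the counting strategy from the discussion can be sketched in a few lines of Python. The handspan values below are made up for illustration and should be replaced with the actual class measurements; only the key width and the number of keys come from the task.

```python
KEY_WIDTH_CM = 2.39   # from the task statement
KEYS_APART = 9        # from the task statement

# 9 * 2.39 = 21.51, which rounds to 21.5 cm as stated in the task.
reach_needed_cm = round(KEY_WIDTH_CM * KEYS_APART, 1)

# Hypothetical handspans in centimeters (not real class data).
handspans_cm = [18.4, 19.7, 20.2, 20.8, 21.0, 21.3, 21.5, 21.9, 22.4, 23.1]

# Count the measurements that are at least the required reach.
can_reach = [h for h in handspans_cm if h >= reach_needed_cm]
print(reach_needed_cm, len(can_reach), "of", len(handspans_cm))
```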
5.3: Relative Frequency Distribution (10 minutes)
Activity
The mathematical purpose of this activity is for students to use relative frequency histograms to estimate population percentages. Students convert a frequency table into a relative frequency table and a histogram to a relative frequency histogram to highlight the proportion of values in certain intervals. In later lessons, students will find areas under normal curves to estimate proportions of values in various regions as well.
Launch
Tell students that they are going to use the data represented in the histogram to calculate relative frequencies and then use the relative frequencies to create another histogram called a relative frequency histogram.
Making spreadsheet technology available gives students an opportunity to choose appropriate tools strategically (MP5).
Supports accessibility for: Conceptual processing
Student Facing
Manufacturers of butter make sticks of butter that weigh 110 grams on average. A manufacturer suspects the machine that forms the sticks of butter may have a problem, so they weigh each stick of butter the machine produces in an hour.
The weights are grouped into intervals of 0.5 grams and are summarized in a frequency table.
weight (grams) | frequency | relative frequency |
---|---|---|
107–107.5 | 5 | |
107.5–108 | 17 | |
108–108.5 | 52 | |
108.5–109 | 118 | |
109–109.5 | 172 | |
109.5–110 | 232 | |
110–110.5 | 219 | |
110.5–111 | 172 | |
111–111.5 | 95 | |
111.5–112 | 57 | |
112–112.5 | 23 | |
112.5–113 | 8 | |
113–113.5 | 1 | |
total | 1,171 | |
The same data are summarized in this histogram.
Although this information is useful, it might be more helpful to know the proportion of sticks of butter in each weight interval rather than the actual number of sticks in that weight interval.
- Complete the table by dividing each frequency value by the total number of sticks of butter in the data set. Round each value to 4 decimal places.
- A relative frequency histogram is a histogram in which the height of each bar is the relative frequency. Since the heights of the bars are found by dividing each frequency by the total number of sticks of butter, the shape of the distribution is the same as a regular histogram, but the labels on the \(y\)-axis are changed. Label the \(y\)-axis with the correct values for each mark.
- The manufacturer believes they should replace the machine if more than 25% of the sticks of butter are more than 1 gram away from the intended value of 110 grams.
- Indicate on the relative frequency histogram the bars that correspond to sticks of butter that are more than 1 gram away from the intended weight of a stick of butter.
- Should this machine be replaced? Explain or show your reasoning.
Anticipated Misconceptions
Some students may struggle to put the labels on the axis for the relative frequency histogram. Ask students how they found the relative frequencies for the table and whether they can use that same method for the marks on the relative frequency histogram.
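For checking student work, the relative frequencies and the replacement question can be computed directly from the frequency table. The Python sketch below is one possible approach; it assumes that a bar counts as "more than 1 gram away" when its whole interval lies below 109 grams or above 111 grams, which ignores any sticks weighing exactly 109 or 111 grams.

```python
# (lower bound of interval in grams, frequency), copied from the task's table.
# Each interval is 0.5 grams wide.
table = [
    (107.0, 5), (107.5, 17), (108.0, 52), (108.5, 118), (109.0, 172),
    (109.5, 232), (110.0, 219), (110.5, 172), (111.0, 95), (111.5, 57),
    (112.0, 23), (112.5, 8), (113.0, 1),
]
TARGET = 110.0
WIDTH = 0.5

total = sum(freq for _, freq in table)

# Relative frequency of each interval, rounded to 4 decimal places.
for low, freq in table:
    print(f"{low}-{low + WIDTH}: {freq / total:.4f}")

# Assumption: only bars entirely below 109 g or entirely above 111 g count.
far_count = sum(
    freq for low, freq in table
    if low + WIDTH <= TARGET - 1 or low >= TARGET + 1
)
far_share = far_count / total
print(total, far_count, round(far_share, 4), far_share > 0.25)
```

Under this assumption, 376 of the 1,171 sticks fall in the outer bars, which is more than the 25% threshold.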
Activity Synthesis
The purpose of this discussion is for students to understand how to use a relative frequency histogram to estimate population percentages. Here are some questions for discussion:
- “How much of the data is represented by the two tallest bars in the relative frequency histogram?” (About 38.5%)
- “What information does the relative frequency histogram provide that the original histogram did not?” (The relative frequency histogram tells you what percentage of the data is represented by each bar. You can figure that out from the original histogram, but you would have to divide by the total number of data values in the data set.)
- “On another day, the mean of the data set remains the same, but the standard deviation decreases. How would you expect the relative frequency histogram to change? Explain your reasoning.” (I think that the bars in the middle would get taller and the bars to the left and right of the middle would get shorter or disappear. Since the mean stays the same, I would not expect where the middle bars are located to change, but since the standard deviation decreases, I know that there is less variability in the data, so I would expect less spread in the histogram. In order for that to happen, I would expect the data in the middle to be more frequent.)
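The last response can be illustrated with a normal model. The sketch below is not part of the butter data: it assumes the weights are normally distributed and uses two hypothetical standard deviations to compare the share of values within 0.5 grams of the mean.

```python
import math

def share_within(distance, sd):
    """Proportion of a normal distribution within `distance` of its mean."""
    return math.erf(distance / (sd * math.sqrt(2)))

# Hypothetical standard deviations in grams; the lesson does not state one.
wider = share_within(0.5, sd=1.0)
narrower = share_within(0.5, sd=0.5)
print(round(wider, 3), round(narrower, 3))
```

With the smaller standard deviation, a larger share of the values lands in the middle bars, which matches the expectation that those bars get taller.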
5.4: The Normal Curve (5 minutes)
Activity
The mathematical purpose of this activity is to introduce students to curves representing normal distributions. Students compare normal curves with different means and standard deviations to determine how those statistics affect the curve. In particular, ensure students understand that normal curves are entirely defined by the mean and standard deviation, that the mean defines the center of the distribution, and the standard deviation affects the spread of the distribution.
Student Facing
These curves represent normal distributions with different means and standard deviations. What do you notice?
Student Facing
Are you ready for more?
The equation for the curve for a normal distribution is
\(f(x) = \frac{1}{\sigma \sqrt{2\pi}} \cdot e^{\frac{-(x-\mu)^2}{2\sigma^2}}\)
where \(\sigma\) is the standard deviation, \(\mu\) is the mean, and \(e\) is a particular number close to 2.718.
Notice the part of the function that is \((x - \mu)\). Using your understanding of transformations of functions, how does changing \(\mu\) affect the graph of the function?
Activity Synthesis
The important things for students to notice are:
- The curves are entirely defined by the mean and standard deviation.
- The curve is symmetric around the mean.
- The standard deviation affects the width and height of the curve. A curve with a larger standard deviation is wider and shorter. A curve with a smaller standard deviation is thinner and taller.
Here are some questions for discussion:
- “Why do you think the normal curve for a larger standard deviation is wider and shorter than one with a smaller standard deviation?” (We would expect a larger standard deviation to mean that data is more spread out, so it makes sense that it is wider. Since the area must remain 1 under the curve, it has to get shorter to get wider.)
- “What is the shape of the normal curve?” (Bell-shaped and symmetric)
- “Where is the line of symmetry for each curve?” (\(x = 10\), \(x = 8\), \(x = 12\), and \(x = 10\), respectively)
Tell students,
- “Although it is difficult to see in these graphs, the \(x\)-axis is a horizontal asymptote for normal curves. As you follow the curve to the right (or left), the curve gets closer and closer to the \(x\)-axis without ever actually reaching it.”
- “Normal curves are bell-shaped and are most reliably used to model bell-shaped or approximately bell-shaped distributions.”
- “The area under the normal curve (between the curve and the \(x\)-axis) is always 1. When the standard deviation increases, the curve spreads out to the right and left. To maintain the same area, though, the height of the peak must decrease.”
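These points can be checked numerically with the equation for the normal curve. The following Python sketch evaluates the curve for a few illustrative mean and standard deviation pairs (chosen for this example, not read from the graphs) and reports the peak height and whether the curve has equal heights at equal distances on either side of the mean.

```python
import math

def normal_pdf(x, mean, sd):
    """Height of the normal curve with the given mean and standard deviation at x."""
    return math.exp(-((x - mean) ** 2) / (2 * sd ** 2)) / (sd * math.sqrt(2 * math.pi))

# Illustrative (mean, standard deviation) pairs.
for mean, sd in [(10, 2), (10, 3), (12, 2)]:
    peak = normal_pdf(mean, mean, sd)           # the highest point is at the mean
    left = normal_pdf(mean - 1.5, mean, sd)     # 1.5 units left of the mean
    right = normal_pdf(mean + 1.5, mean, sd)    # 1.5 units right of the mean
    print(mean, sd, round(peak, 4), math.isclose(left, right))
```

The peak is lower for the larger standard deviation, and changing only the mean shifts the curve without changing its height.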
Lesson Synthesis
Here are some questions for discussion:
- “When is it appropriate to model data with a normal distribution?” (It is appropriate when the distribution is approximately symmetric and bell-shaped.)
- “How can you create a relative frequency histogram?” (You take the number of data values represented by each bar in a histogram and divide it by the total number of values represented in the histogram. Then you create a new histogram with a different vertical axis.)
- “Why could it be helpful to model data with a normal distribution?” (It could be helpful because you could make generalizations about different data sets that have similar shapes.)
- “What is the area under a normal curve?” (One)
- “Why is a normal curve representing a distribution with a mean of 10 and a standard deviation of 2 taller and skinnier than a normal curve representing a distribution with a mean of 10 and a standard deviation of 3?” (It is skinnier because the data is closer to the mean. It is taller because an area of 1 needs to be preserved.)
5.5: Cool-down - What’s Normal? (5 minutes)
Cool-Down
Student Lesson Summary
Student Facing
A histogram shows the number of items in the data set that fall into specified intervals. A relative frequency histogram shows the proportion of the entire data set that falls into specified intervals.
For example, a study measured the handspan of 1,000 adults in centimeters. A handspan is the distance from the thumb to the smallest finger when the fingers are stretched out as much as possible. A histogram shows the number of people whose handspans are in certain intervals. The height of each bar in a histogram represents the frequency for the corresponding interval. In this example, there are 132 adults in the study whose handspans are at least 20 centimeters, but less than 20.5 centimeters.
A histogram that uses the relative frequencies shows a distribution with the same shape, but the heights of the bars represent the relative frequency for the corresponding intervals. For example, out of all the adults in the study, 13.2% have handspans that are at least 20 centimeters, but less than 20.5 centimeters (since \(\frac{132}{1,000} = 0.132\)).
In a similar way to how we can model data in a scatter plot with a line or other curve so that additional information can be estimated or predicted, it can be useful to model an approximately symmetric and bell-shaped distribution with a particular distribution called the normal distribution. A normal distribution is symmetric and bell-shaped, has an area of 1 between the \(x\)-axis and the curve, and has the \(x\)-axis as a horizontal asymptote. A normal distribution is determined entirely by the mean and standard deviation.
For the handspan data, the mean is 20.9 cm and the standard deviation is 1.41 cm. The curve that represents the normal distribution with this same mean and standard deviation is shown on top of the relative frequency histogram for the actual data. Notice that the curve does a fairly good job of modeling the actual data in this situation, although it is not perfect.
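One way to see how closely the model follows the data is to compare a single bar with the matching area under the curve. The Python sketch below does this for the interval from 20 to 20.5 centimeters, using the mean and standard deviation stated above; the helper `normal_cdf` is written here for illustration and is built on the standard library's error function.

```python
import math

def normal_cdf(x, mean, sd):
    """Area under the normal curve to the left of x."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

MEAN_CM, SD_CM = 20.9, 1.41   # from the handspan summary

observed = 132 / 1000         # relative frequency of the 20 to 20.5 cm bar
modeled = normal_cdf(20.5, MEAN_CM, SD_CM) - normal_cdf(20.0, MEAN_CM, SD_CM)
print(round(observed, 3), round(modeled, 3))
```

The two values should be close but not identical, which is consistent with the curve modeling the data fairly well without being perfect.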