6.1: The Plot of the Story (5 minutes)
This warm-up reinforces students’ understanding of the relationship between the mean absolute deviation (MAD) and the spread of data. In the given scenarios, the number of people attending the two events and their mean age are the same, but the MADs are different. In the first question, students interpret these measures in the context of the situations. In the second, they draw a dot plot that could represent an age distribution with the same mean but a different MAD.
As students work and discuss, identify several students who drew dot plots that correctly meet the criteria in the second question. Ask students with different dot plots to share during whole-class discussion.
Students may need more time to make sense of how to generate their own dot plot for the second question. If it is not possible to give students additional time, consider presenting the second question at a different time.
Arrange students in groups of 2. Give students 1 minute of quiet think time for the first question, and then 2–3 minutes to work on the second question with a partner. Display the following questions for all to see. Ask students to think about and discuss them before drawing their dot plots:
- “How many data points should be on the dot plot?”
- “How would the mean help us place the data points?”
- “How would the MAD help us place the data points?”
- “How should our dot plot compare to the dot plots of data sets A and B?”
Here are two dot plots and two stories. Match each story with a dot plot that could represent it. Be prepared to explain your reasoning.
- Twenty people (high school students, teachers, and invited guests) attended a rehearsal for a high school musical. The mean age was 38.5 years and the MAD was 16.5 years.
- High school soccer team practice is usually watched by supporters of the players. One evening, twenty people watched the team practice. The mean age was 38.5 years and the MAD was 12.7 years.
- Another evening, twenty people watched the soccer team practice. The mean age was similar to that from the first evening, but the MAD was greater (about 20 years).
Make a dot plot that could illustrate the distribution of ages in this story.
Poll students on their response to the first question. Ask a student to explain how they matched one context to its dot plot and another student to explain the second matching context and dot plot. Record and display their responses for all to see. If possible, record their responses directly on the dot plots.
Ask selected students to share their dot plots for the second question and their reasoning. To involve more students in the conversation, consider asking some of the following questions:
- “What was the first piece of information you used to draw your dot plot? Why?”
- “How did you decide where to place your dots?”
- “How is your dot plot the same as or different from the dot plot for the first evening of soccer practice?”
- “Do you agree or disagree with this representation of the context? Why?”
- “Do you have any questions to ask the student who drew the dot plot?”
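One way to check whether a proposed dot plot meets the criteria in the second question is to compute the mean and MAD of its values directly. The sketch below is illustrative only. The list `second_evening_ages` is a hypothetical set of twenty ages invented for this example, not data from the lesson materials, and the helper names are assumptions.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def mean_absolute_deviation(values):
    """Return the average distance of the values from their mean."""
    center = mean(values)
    return sum(abs(v - center) for v in values) / len(values)


# Hypothetical ages for the second evening (invented for illustration):
# ten younger spectators and ten older spectators.
second_evening_ages = [16, 17, 17, 18, 18, 19, 19, 20, 20, 21,
                       54, 55, 56, 57, 58, 59, 60, 61, 62, 63]

print(len(second_evening_ages))                      # 20
print(mean(second_evening_ages))                     # 38.5
print(mean_absolute_deviation(second_evening_ages))  # 20.0
```

Many other answers are possible. Moving the ages farther from 38.5 while keeping their sum at 770 raises the MAD, and clustering them closer to 38.5 lowers it.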
6.2: Finding the Middle (15 minutes)
This activity introduces students to the term median. They learn that the median describes the middle value in an ordered list of data, and that it can capture what we consider typical for the data in some cases.
Students learn about the median through a kinesthetic activity. They line up in order of the number of letters in their name. Then, those at both ends of the line count off and sit down simultaneously until one or two people in the middle remain standing. If one person remains standing, that person has the median number of letters. If two people remain standing, the median is the mean or the average of their two values.
Students then practice identifying the median of other data sets, by analyzing both tables of values and dot plots.
Explain to students that, instead of using the mean, sometimes we use the “middle” value in an ordered data set as a measure of center. We call this the median. Guide students through the activity:
Give each student an index card. Ask them to write their first and last names on the card and record the total number of letters in their name. Display an example for all to see.
Ask students to stand up, holding their index cards in front of them, and arrange themselves in order based on the number of letters in their name. (Consider asking students to do so without speaking at all.) Look for the student whose name has the fewest letters and ask him or her to be the left end of the line. Ask the student with the longest full name to be the right end of the line. Students who have the same number of letters should stand side-by-side.
Tell students that, to find the median or the middle number, we will count off from both ends at the same time. Ask the students at the two ends of the line to say “1” at the same time and sit on the floor, the students next to them to say “2” and then sit down, and so on. Have students count off in this fashion until only one or two students are standing.
If the class has an odd number of students, one student will remain standing. Tell the class that this student’s number is the median. Give this student a sign that says “median.” If the class has an even number of students, two students will remain standing. The median will be the mean or average of their numbers. Ask both students to hold the sign that says “median.” Explain that the median is also called the “50th percentile,” because half of the data values are the same size or less than it and fall to the left of it on the number line, and half are the same size or greater than it and fall to the right.
Ask students to find the median a couple more times by changing the data set (e.g., asking a few students to leave the line or adding new people who are not members of the class with extremely long or short names). Make sure that students have a chance to work with both odd and even numbers of values.
Collect the index cards and save them; they will be used again in the lesson on box plots.
Ask students to complete the rest of the questions on the task statement.
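The counting-off procedure can also be sketched in code, which may help when checking the class result or preparing extra examples. This is a minimal sketch under stated assumptions. The function name `median_by_counting_off` and the name-length lists are hypothetical, invented for illustration.

```python
from statistics import median


def median_by_counting_off(values):
    """Find the median by removing one value from each end of the sorted list."""
    if not values:
        raise ValueError("need at least one value")
    line = sorted(values)          # students line up in order
    while len(line) > 2:           # both ends count off and sit down
        line = line[1:-1]
    return sum(line) / len(line)   # one value left, or the mean of two


# Hypothetical numbers of letters in students' full names.
odd_class = [9, 11, 12, 12, 14]
even_class = [9, 11, 12, 14]

print(median_by_counting_off(odd_class))    # 12.0
print(median_by_counting_off(even_class))   # 11.5

# The result agrees with the standard library's median.
print(median_by_counting_off(even_class) == median(even_class))  # True
```

Note that the two students with 12 letters are counted separately, just as every student in the line counts off.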
Supports accessibility for: Memory; Language
Your teacher will give you an index card. Write your first and last names on the card. Then record the total number of letters in your name. After that, pause for additional instructions from your teacher.
Here is the data set on numbers of siblings from an earlier activity.
- Sort the data from least to greatest, and then find the median.
- In this situation, do you think the median is a good measure of a typical number of siblings for this group? Explain your reasoning.
Here is the dot plot showing the travel time, in minutes, of Elena’s bus rides to school.
- Find the median travel time. Be prepared to explain your reasoning.
- What does the median tell us in this context?
When determining the median, students might group multiple data points that have the same value and treat them as a single point, instead of counting each one separately. Remind them that when they lined up to find the median number of letters in their names, every student counted off, even if their name had the same number of letters as their neighbors’.
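The effect of this error can be seen in a small example. The data below are hypothetical, invented for illustration. Merging the repeated values shifts the result away from the true median.

```python
from statistics import median

# Hypothetical data set in which the value 5 appears three times.
data = [5, 5, 5, 6, 8, 9, 10]

correct = median(data)                 # every data point counts separately
collapsed = median(sorted(set(data)))  # repeated values merged (the error)

print(correct)    # 6
print(collapsed)  # 8
```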
Select a few students to share their responses to the questions about number of siblings and Elena's travel times. Focus the discussion on the median as another measure of the center of a data set and whether it captures what students would estimate to be a typical value for each data set.
Emphasize to students that the median is a value and not an individual. For example, if the last person standing in the class has 5 letters in their name, the median is the number 5 and not the person standing. If another student also had 5 letters in their name, the two might have switched places when lining up initially. Although the person standing would have changed, the median would remain the same value of 5.
At this point, it is unnecessary to compare the mean and the median. Students will have many more opportunities to explore the median and think about how it differs from the mean in the upcoming activities.
Design Principle(s): Cultivate conversation; Maximize meta-awareness
6.3: Mean or Median? (15 minutes)
In the previous activity, students analyzed the effects of unusually high or low values on the mean and median. Here they study distributions (displayed using dot plots and a histogram) for which the mean and median can be the same, close, or far apart, and make conjectures about how the distributions affect the mean and median (MP7). Along the way, students recognize that the mean and median are equal or close when the distribution is roughly symmetrical and are farther apart when the distribution is non-symmetrical.
Arrange students in groups of 3–4. Provide each group with a cut-up set of cards from the blackline master. Give groups 4–5 minutes to take turns sorting the cards and completing the first two problems. Then, pause the activity to discuss the sorting decisions and observations of the class.
Ask a few groups how they sorted the cards. If not mentioned by students, highlight that in three of the distributions, the mean and median of the data are approximately equal. In the other three distributions, the mean and median are quite different. Discuss:
- “What do you notice about the shape and features of distributions that have roughly equal mean and median?” (They are roughly symmetrical and each have one peak in the middle, with roughly the same number of values to the left and right. They may have gaps, but the gaps are somewhat evenly spaced out.)
- “What about the shape and features of a distribution that has very different mean and median?” (They are not at all symmetrical. They may have one peak, but it is off to one side, or they don't really show any peaks. They may have gaps or data values that are unusually high or low. There is more variability in these data sets.)
- “In the second group, why might the mean and the median be so different?” (The mean is pulled toward the direction of unusually large or small values. The median simply tells us where the middle of the data lies when sorted, so it is not as affected by these values that are far from where most data points are.)
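The pull of unusual values on the mean can be illustrated with two small data sets. Both lists are hypothetical, invented for this sketch, and are not the data on the cards.

```python
from statistics import mean, median

# Hypothetical data sets, invented for illustration.
roughly_symmetric = [4, 5, 5, 6, 6, 6, 7, 7, 8]
with_high_values = [8, 8, 9, 9, 9, 10, 10, 40, 41]

for name, data in [("roughly symmetric", roughly_symmetric),
                   ("with unusually high values", with_high_values)]:
    print(name, "| mean:", mean(data), "| median:", median(data))
```

In the first list the mean and median are both 6. In the second, the two large values raise the mean to 16, while the median stays at 9, close to where most of the values lie.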
Afterwards, give students another 3–4 minutes to answer and discuss the remaining questions with their group.
Supports accessibility for: Language; Organization
Design Principle(s): Support sense-making
Your teacher will give you six cards. Each has either a dot plot or a histogram. Sort the cards into two piles based on the distributions shown. Be prepared to explain your reasoning.
Discuss your sorting decisions with another group. Did you have the same cards in each pile? If so, did you use the same sorting categories? If not, how are your categories different?
Pause here for a class discussion.
Use the information on the cards to answer the following questions.
- Card A: What is a typical age of the dogs being treated at the animal clinic?
- Card B: What is a typical number of people in the Irish households?
- Card C: What is a typical travel time for the New Zealand students?
- Card D: Would 15 years old be a good description of a typical age of the people who attended the birthday party?
- Card E: Is 15 minutes or 24 minutes a better description of a typical time it takes the students in South Africa to get to school?
- Card F: Would 21.3 years old be a good description of a typical age of the people who went on a field trip to Washington, D.C.?
- How did you decide which measure of center to use for the dot plots on Cards A–C? What about for those on Cards D–F?
Are you ready for more?
Most teachers use the mean to calculate a student’s final grade, based on that student’s scores on tests, quizzes, homework, projects, and other graded assignments.
Diego thinks that the median might be a better way to measure how well a student did in a course. Do you agree with Diego? Explain your reasoning.
Use the whole-class discussion to reinforce the idea that the distribution of a data set can tell us which measure of center best summarizes what is typical for the data set. Briefly review the answers to the statistical questions, and then focus the conversation on the last questions (how students knew which measure of center to use in each situation). Select a couple of students to share their responses. Discuss:
- “For data sets with non-symmetrical distributions, why does the median turn out to be a better measure of center?” (Non-symmetrical data sets often have unusual values that pull the mean away from the center of data. The median is less influenced by these values.)
- “Does it matter which measure we choose to describe a typical value? For example, in Card F, would it matter if we said that a typical age for the people who went on the field trip to D.C. was about 21 years old?” (Yes, it does matter in some cases. In that example, it wouldn’t really make sense to say that 21 years is a typical age because the vast majority of the people on the trip were teenagers.)
In this lesson, we learn about another measure of center called the median. The discussion should focus on what the median is, how to find it, and why it is more useful in some situations.
- “What is the median?” (The number in the middle of an ordered list of data.)
- “How can we find it?” (We order the data values from least to greatest and find the value in the middle.)
- “Is the median always one of the values in the data set? If not, when is it not?” (No. When the number of values in a data set is even, there will be two middle values. The median is the number exactly between them, which may not be a value in the data set.)
- “What does the median tell you about a data set? Why is it used as a measure of the center of a distribution?” (It tells us where to divide a data set so that half of the data points have that value or smaller values and the other half have that value or larger.)
- “Why do we need another measure of center other than the mean?” (Sometimes the mean is not a good indication of what is typical for the data set.)
- “When are the mean and median likely to be close together?” (When the distribution is approximately symmetrical.)
- “When are they likely to be different?” (When the distribution is not roughly symmetrical or has unusually high or low values that are far from others.)
- “Why might the median be a more useful measure of center when the distribution is not symmetrical?” (Values far from the middle have a greater influence on the mean than on the median, so a few unusually high or low values can pull the mean away from what is typical while the median stays near the middle of the data.)
6.4: Cool-down - Which Measure of Center to Use? (5 minutes)
Student Lesson Summary
The median is another measure of center of a distribution. It is the middle value in a data set when values are listed in order. Half of the values in a data set are less than or equal to the median, and half of the values are greater than or equal to the median.
To find the median, we order the data values from least to greatest and find the number in the middle.
Suppose we have 5 dogs whose weights, in pounds, are shown in the table. The median weight for this group of dogs is 32 pounds because three dogs weigh less than or equal to 32 pounds and three dogs weigh greater than or equal to 32 pounds.
Now suppose we have 6 cats whose weights, in pounds, are as shown in the table. Notice that there are two values in the middle: 7 and 8.
The median weight must be between 7 and 8 pounds, because half of the cats weigh less than or equal to 7 pounds and half of the cats weigh greater than or equal to 8 pounds.
In general, when we have an even number of values, we take the number exactly in between the two middle values. In this case, the median cat weight is 7.5 pounds because \((7+8)\div 2=7.5\).
Here is a set of 30 cookies. It has a mean weight of 21 grams, but the median weight is 23 grams.
In this case, the median is closer to where most of the data points are clustered and is therefore a better measure of center for this distribution. That is, it is a better description of a typical cookie weight. The mean weight is influenced (in this case, pulled down) by a handful of much smaller cookies, so it is farther away from most data points.
In general, when a distribution is symmetrical or approximately symmetrical, the mean and median values are close. But when a distribution is not roughly symmetrical, the two values tend to be farther apart. Because the mean takes every value in the data set into account, it is generally preferred for distributions where it makes sense to use it. When the distribution is less symmetric, the median is often reported as the typical value.