# Lesson 3: A Gallery of Data

- Let’s make, compare, and interpret data displays.

### 3.1: Notice and Wonder: Dot Plots

The dot plots represent the distribution of the amount of tips, in dollars, left at 2 different restaurants on the same night.

What do you notice? What do you wonder?

### 3.2: Data Displays

Your teacher will assign your group a statistical question. As a group:

- Create a dot plot, histogram, and box plot to display the distribution of the data.
- Write 3 comments that interpret the data.

As you visit each display, write a sentence or two summarizing the information in the display.

Choose one of the more interesting questions you or a classmate asked and collect data from a larger group, such as more students from the school. Create a data display and compare results from the data collected in class.

### Summary

We can represent a distribution of data in several different forms, including lists, dot plots, histograms, and box plots. A list displays all of the values in a data set and can be organized in different ways. This list shows the pH for 30 different water samples.

- 5.9
- 7.6
- 7.5
- 8.2
- 7.6
- 8.6
- 8.1
- 7.9
- 6.1
- 6.3
- 6.9
- 7.1
- 8.4
- 6.5
- 7.2
- 6.8
- 7.3
- 8.1
- 5.8
- 7.5
- 7.1
- 8.4
- 8.0
- 7.2
- 7.4
- 6.5
- 6.8
- 7.0
- 7.4
- 7.6

Here is the same list organized in order from least to greatest.

- 5.8
- 5.9
- 6.1
- 6.3
- 6.5
- 6.5
- 6.8
- 6.8
- 6.9
- 7.0
- 7.1
- 7.1
- 7.2
- 7.2
- 7.3
- 7.4
- 7.4
- 7.5
- 7.5
- 7.6
- 7.6
- 7.6
- 7.9
- 8.0
- 8.1
- 8.1
- 8.2
- 8.4
- 8.4
- 8.6

With the list organized, you can more easily:

- interpret the data
- calculate the values of the five-number summary
- estimate or calculate the mean
- create a dot plot, box plot, or histogram
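As a minimal sketch of the first steps above, the following Python snippet orders the 30 pH values and calculates their mean. The variable names (`ph_values`, `ordered`, `mean_ph`) are illustrative choices, not part of the lesson.

```python
# pH readings copied from the unordered list of 30 water samples.
ph_values = [
    5.9, 7.6, 7.5, 8.2, 7.6, 8.6, 8.1, 7.9, 6.1, 6.3,
    6.9, 7.1, 8.4, 6.5, 7.2, 6.8, 7.3, 8.1, 5.8, 7.5,
    7.1, 8.4, 8.0, 7.2, 7.4, 6.5, 6.8, 7.0, 7.4, 7.6,
]

# Organize the list from least to greatest.
ordered = sorted(ph_values)

# With the list ordered, the minimum and maximum are the first and last entries.
minimum = ordered[0]
maximum = ordered[-1]

# The mean is the sum of the values divided by how many values there are.
mean_ph = sum(ordered) / len(ordered)

print("minimum:", minimum)
print("maximum:", maximum)
print("mean (rounded):", round(mean_ph, 2))
```

For this data set the mean comes out to roughly 7.29, which sits close to the middle of the ordered list.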

Here are a dot plot and a histogram representing the distribution of the data in the list.

A dot plot is created by putting a dot for each value above its position on a number line. For the pH dot plot, there are 2 water samples with a pH of 6.5 and 1 water sample with a pH of 7.

A histogram is made by counting the number of values from the data set in a certain interval and drawing a bar over that interval at a height that matches the count. In the pH histogram, there are 5 water samples that have a pH between 6.5 and 7 (including 6.5, but not 7).

Here is a box plot representing the distribution of the same data as the dot plot and histogram.
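The counting behind a dot plot and a histogram can be sketched in Python. This example assumes histogram intervals of width 0.5 that include the left endpoint but not the right, inferred from the "between 6.5 and 7" interval in the text. The names `dot_counts` and `bin_counts` are hypothetical.

```python
from collections import Counter

# pH readings copied from the list of 30 water samples.
ph_values = [
    5.9, 7.6, 7.5, 8.2, 7.6, 8.6, 8.1, 7.9, 6.1, 6.3,
    6.9, 7.1, 8.4, 6.5, 7.2, 6.8, 7.3, 8.1, 5.8, 7.5,
    7.1, 8.4, 8.0, 7.2, 7.4, 6.5, 6.8, 7.0, 7.4, 7.6,
]

# Dot plot: one dot per value, so each stack's height is the value's frequency.
dot_counts = Counter(ph_values)

# Histogram: count values per interval (assumed width 0.5, left edge included).
# Working in whole tenths avoids floating-point trouble at the interval edges.
BIN_WIDTH_TENTHS = 5
bin_counts = Counter()
for value in ph_values:
    tenths = round(value * 10)
    left_edge = (tenths // BIN_WIDTH_TENTHS) * BIN_WIDTH_TENTHS / 10
    bin_counts[left_edge] += 1

# Print a rough text version of the histogram, one row per interval.
for left_edge in sorted(bin_counts):
    bar = "#" * bin_counts[left_edge]
    print(f"{left_edge:.1f} to {left_edge + 0.5:.1f}: {bar}")
```

Here `dot_counts[6.5]` is 2 and `dot_counts[7.0]` is 1, matching the dot plot description, and `bin_counts[6.5]` is 5, matching the bar over the interval from 6.5 to 7.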

To create a box plot, you need to find the minimum, first quartile, median, third quartile, and maximum values for the data set. These 5 values are sometimes called the *five-number summary*. Drawing a vertical mark at each of these 5 values and then connecting the pieces as in the example creates the box plot. For the pH box plot, we can see that the minimum is about 5.8, the median is about 7.4, and the third quartile is around 7.9.
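The five values can also be calculated directly. The sketch below assumes one common convention for quartiles: the first and third quartiles are the medians of the lower and upper halves of the ordered data. Other tools may use slightly different conventions and give slightly different quartiles. The function name `five_number_summary` is a hypothetical helper.

```python
from statistics import median

# pH readings copied from the list of 30 water samples.
ph_values = [
    5.9, 7.6, 7.5, 8.2, 7.6, 8.6, 8.1, 7.9, 6.1, 6.3,
    6.9, 7.1, 8.4, 6.5, 7.2, 6.8, 7.3, 8.1, 5.8, 7.5,
    7.1, 8.4, 8.0, 7.2, 7.4, 6.5, 6.8, 7.0, 7.4, 7.6,
]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for two or more values.

    Assumed convention: Q1 and Q3 are the medians of the lower and upper
    halves; the middle value is left out of both halves when the count is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]
    return (
        ordered[0],
        median(lower_half),
        median(ordered),
        median(upper_half),
        ordered[-1],
    )


minimum, q1, med, q3, maximum = five_number_summary(ph_values)
print("minimum:", minimum)
print("first quartile:", q1)
print("median:", round(med, 2))
print("third quartile:", q3)
print("maximum:", maximum)
```

For the pH data, this gives a minimum of 5.8, a first quartile of 6.8, a median of 7.35, a third quartile of 7.9, and a maximum of 8.6, which agrees with the approximate values read from the box plot.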

### Glossary Entries

**categorical data** Categorical data are data where the values are categories. For example, the breeds of 10 different dogs are categorical data. Another example is the colors of 100 different flowers.

**distribution** For a numerical or categorical data set, the distribution tells you how many of each value or each category there are in the data set.

**five-number summary** The five-number summary of a data set consists of the minimum, the three quartiles, and the maximum. It is often indicated by a box plot like the one shown, where the minimum is 2, the three quartiles are 4, 4.5, and 6.5, and the maximum is 9.

**non-statistical question** A non-statistical question is a question that can be answered by a specific measurement or procedure where no variability is anticipated, for example:

- How high is that building?
- If I run at 2 meters per second, how long will it take me to run 100 meters?

**numerical data** Numerical data, also called measurement or quantitative data, are data where the values are numbers, measurements, or quantities. For example, the weights of 10 different dogs are numerical data.

**statistical question** A statistical question is a question that can only be answered by using data and where we expect the data to have variability, for example:

- Who is the most popular musical artist at your school?
- When do students in your class typically eat dinner?
- Which classroom in your school has the most books?