6. 5-number summary and box plots

=5-Number Summary=

What is a 5-number summary?

 * A 5-number summary consists of five values: the minimum and the maximum (the two extreme values), the lower and upper quartiles, and the median. They are ordered from lowest to highest: minimum value, lower quartile, median, upper quartile, maximum value (see Figure 1).

Key terms used with 5-number summaries:

 * Range - the difference between the upper and lower extremes (maximum minus minimum)
 * Median - the middle number of a data set arranged in increasing order; if there are 2 middle numbers, take the average of the 2
 * Quartile - one of the points that divide the ordered data into 4 equal parts; found by splitting the data into 2 halves at the median and then taking the median of each half (the lower quartile Q1 and the upper quartile Q3)
 * Outliers - extreme values that lie more than 1.5 times the interquartile range below the lower quartile or above the upper quartile
 * Interquartile range - the difference between the third quartile and the first quartile (Q3 minus Q1)
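
The definitions above can be sketched in Python. This is only a minimal sketch under stated assumptions: the function names `five_number_summary` and `outliers` are made up for illustration, the sample data are hypothetical, and for an odd number of values the median is left out of both halves (textbooks differ on that point). It expects at least 2 values.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the median-of-halves rule."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]       # lower half (median excluded when the count is odd)
    upper = data[-half:]      # upper half (median excluded when the count is odd)
    return data[0], median(lower), median(data), median(upper), data[-1]

def outliers(values):
    """Return the values lying more than 1.5 * IQR beyond the quartiles."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

# Hypothetical data, made up for illustration only.
sample = [4, 7, 8, 9, 10, 12, 13, 30]
summary = five_number_summary(sample)
print("five-number summary:", summary)
print("range:", summary[4] - summary[0])
print("interquartile range:", summary[3] - summary[1])
print("outliers:", outliers(sample))
```

In this made-up sample, 30 lies more than 1.5 times the interquartile range above the upper quartile, so it is reported as an outlier.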

Example of creating 5 number summaries:
Table 1. Number of registered marriages
 * **Year** || **Number of marriages** ||
 * 1 || 40,650 ||
 * 2 || 40,812 ||
 * 3 || 41,300 ||
 * 4 || 41,450 ||
 * 5 || 39,594 ||
 * 6 || 40,734 ||
 * 7 || 39,993 ||
 * 8 || 38,814 ||
 * 9 || 37,828 ||
 * 10 || 35,716 ||

Using the information from the table above, find:
a) the range
b) the median
c) the upper and lower quartiles
d) the inter-quartile range
e) the five-number summary
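
If you want to check your work on Table 1, a short script along these lines can help. It is a sketch rather than a required method: the variable names are arbitrary, and it assumes the quartiles are the medians of the lower and upper halves of the sorted data.

```python
from statistics import median

# Number of marriages for years 1 to 10, copied from Table 1.
marriages = [40650, 40812, 41300, 41450, 39594,
             40734, 39993, 38814, 37828, 35716]

data = sorted(marriages)
half = len(data) // 2          # 10 values, so each half holds 5 values
q1 = median(data[:half])       # median of the lower half
q3 = median(data[half:])       # median of the upper half
med = median(data)             # average of the 5th and 6th values

print("a) range:", data[-1] - data[0])
print("b) median:", med)
print("c) Q1:", q1, "Q3:", q3)
print("d) interquartile range:", q3 - q1)
print("e) five-number summary:", (data[0], q1, med, q3, data[-1]))
```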

ANSWERS:
a) 5,734 (the range is the difference between the upper and lower extremes: 41,450 – 35,716)
b) 40,321.5 (the median is the middle number of the ordered data. Here there is an even number of data points, so you take the average of the 5th and 6th numbers: (39,993 + 40,650) / 2.)
c) Q1 = 38,814, Q3 = 40,812 (Q1 is the median of the lower half; Q3 is the median of the upper half)
d) Q3 – Q1 = 40,812 – 38,814 = 1,998
e) 35,716, 38,814, 40,321.5, 40,812, 41,450 (lower extreme, Q1, median, Q3, upper extreme)
(provided by website 4)

=Box Plots=

What is a Box Plot?

 * Also called "box-and-whisker" plots
 * A box plot graphs 5-number summaries.
 * It shows the distribution of the data set, but not in as much detail as a stem-and-leaf plot or histogram

When would you use a box plot?

 * When there's a large number of data points, but you only need to see a trend.
 * When you want to compare 2 or more data sets.

Why is it useful?

 * It is useful for spotting outliers and for seeing the general trend of a data set.
 * The median and the range are visually apparent.

How do you construct a box plot?
For example (website 4):
 * Plot the median, quartiles, and extremes above or below a number line (not on a number line)
 * Then, draw a box extending from the lower quartile to the upper quartile. Draw the whiskers (just lines) attaching the quartiles to the extremes. Also draw a vertical line through the box at the median, and plot any major outliers (points not immediately connected to the plot).

=Examples=
We'll give you a situation and walk you through how to make your 5-number summary and how to apply that to a box plot.


**Situation:** Mr. Killian's HMA students received the following scores on their most recent math test: 80, 75, 90, 95, 65, 65, 80, 85, 70, 100. Construct a box plot for the data.

Examples of 5-number summary
1. Write the data in ascending order and find the first quartile, the median, the third quartile, the smallest value and the largest value.

65, 65, 70, 75, 80, 80, 85, 90, 95, 100
 * median (the middle quartile) = 80
 * point where the first quartile ends = 70
 * point where the third quartile ends = 90
 * smallest value = 65
 * largest value = 100

2. Place a circle beneath each of these values on a number line. 

Examples of box plots
3. Draw a box with its ends at the points for the first and third quartiles. Then draw a vertical line through the box at the median point. Finally, draw the whiskers (or lines) from each end of the box to the smallest and largest values. (website 2)
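
The three steps above can be imitated with plain text characters. The sketch below is an illustration only: `text_box_plot` is a made-up helper, the `width` of 36 characters is an arbitrary choice, quartiles are taken as the medians of the lower and upper halves, and the function assumes the largest value is greater than the smallest. It draws whiskers to the extremes and does not mark outliers separately.

```python
from statistics import median

def text_box_plot(values, width=36):
    """Return a one-line box plot drawn with text characters."""
    data = sorted(values)
    half = len(data) // 2
    low, high = data[0], data[-1]
    q1, med, q3 = median(data[:half]), median(data), median(data[-half:])

    def column(x):
        # Map a data value to a character position on the line.
        return round((x - low) / (high - low) * (width - 1))

    line = [" "] * width
    for i in range(column(low), column(high) + 1):
        line[i] = "-"                  # whiskers span the full range
    for i in range(column(q1), column(q3) + 1):
        line[i] = "="                  # box spans Q1 to Q3
    line[column(low)] = line[column(high)] = "|"   # smallest and largest values
    line[column(q1)] = "["             # left end of the box (first quartile)
    line[column(q3)] = "]"             # right end of the box (third quartile)
    line[column(med)] = "M"            # median marker inside the box
    return "".join(line)

# Test scores from the situation above.
scores = [80, 75, 90, 95, 65, 65, 80, 85, 70, 100]
print(text_box_plot(scores))
```

With a width of 36 and scores running from 65 to 100, each character stands for one point, so the box opens at 70, the median marker sits at 80, and the box closes at 90.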

=**SAT/AP Style Questions**=

Question 1: The table shows the percentage scores obtained by Ethan in each year of his four-year degree course. Which of the following is the equivalent box-and-whisker plot of the data? Also find the median of the scores obtained.
 * Year || Percentage of scores ||
 * 1st Year || 70 ||
 * 2nd Year || 82 ||
 * 3rd Year || 76 ||
 * 4th Year || 80 ||

[Figure: answer-choice box-and-whisker plots, not reproduced here]

**Steps to derive:**
**1.** The ascending order of the percentages obtained by Ethan is 70, 76, 80, 82.
**2.** Among the percentages, the least value is 70 and the greatest value is 82.
**3.** Middle quartile = Median of all the data
**4.** (76 + 80) / 2 = 156 / 2 = 78
**5.** Lower quartile = Median of lower half of the data
**6.** (70 + 76) / 2 = 146 / 2 = 73
**7.** Upper quartile = Median of upper half of the data
**8.** (80 + 82) / 2 = 162 / 2 = 81
**9.** So, plot (1) is the equivalent box-and-whisker plot of the data.
**10.** Median of the scores obtained by Ethan = Middle quartile = 78


**Hence the right answer is Plot 1.** ¹

Question 2: Consider the boxplot below.



Which of the following statements are true?
I. The distribution is skewed right.
II. The interquartile range is about 8.
III. The median is about 10.

(A) I only
(B) II only
(C) III only
(D) I and III
(E) II and III

Solution: The correct answer is (B). Most of the observations are on the high end of the scale, so the distribution is skewed left. The interquartile range is indicated by the length of the box, which is 18 minus 10 or 8. And the median is indicated by the vertical line running through the middle of the box, which is roughly centered over 15. So the median is about 15. Therefore only II is correct.

**Hence the right answer is (B).** (website 3)
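
The reasoning in this solution can be sketched as a small check. This is a rough heuristic, not a full test for skewness: `describe_box` is a made-up helper, it looks only at the box (whisker lengths are ignored), and the numbers are the approximate readings quoted in the solution, since the plot itself is not reproduced here.

```python
def describe_box(q1, med, q3):
    """Summarise a box plot from values read off its box (whiskers ignored)."""
    iqr = q3 - q1              # length of the box
    lower_part = med - q1      # part of the box below the median line
    upper_part = q3 - med      # part of the box above the median line
    if lower_part > upper_part:
        skew = "left"          # median sits nearer the upper quartile
    elif upper_part > lower_part:
        skew = "right"         # median sits nearer the lower quartile
    else:
        skew = "roughly symmetric"
    return iqr, skew

# Approximate readings from the solution: box from 10 to 18, median near 15.
iqr, skew = describe_box(q1=10, med=15, q3=18)
print("interquartile range:", iqr)
print("suggested skew:", skew)
```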
=References=

¹ Ethan Problem: Box and Whisker Plots
2 Example Website
3 Example Website
4 Example and information website