Sunday, May 5, 2013

Computing Quartiles and Box-and-Whisker Plots


In statistics, data are often summarized by their mean and standard deviation. But in some instances, such as when the data are heavily skewed or bimodal, we may be more interested in the relative position of the data values than in the precise values themselves. In such cases percentiles are used. Special percentiles that are frequently used, particularly in box-and-whisker plots, are known as quartiles.

Recall that the median of a set of data is the middle value when the data are ordered in ascending or descending order (with an even number of values, it is the mean of the two middle values). Since half the data fall at or below the median and half fall at or above it, the median is the 50th percentile. In general, the Pth percentile of a distribution is the value such that P% of the data fall at or below it and the rest of the data fall above it.
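To make the "at or below" idea concrete, here is a minimal Python sketch that reports what percentage of a data set falls at or below a given value. The function name percent_at_or_below and the ten-value data set are made up for illustration only.

```python
def percent_at_or_below(data, value):
    """Return the percentage of values in data that are <= value."""
    count = sum(1 for x in data if x <= value)
    return 100 * count / len(data)

# Illustrative data only: the whole numbers 1 through 10.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percent_at_or_below(sample, 5))  # 50.0
```

Here 5 of the 10 values are at or below 5, so 5 sits at the 50th percentile of this small set.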

We can expand on the topic of percentiles by introducing quartiles, which are the percentiles that divide the data into fourths. The first quartile, known as Q1, is the 25th percentile; the second quartile, known as Q2, is the median; and the third quartile, known as Q3, is the 75th percentile. It is important to know how to find Q1, Q2, and Q3 in order to draw a box-and-whisker plot.

To compute quartiles, first order the data from smallest to largest. Next, find the median. The first quartile is the median of the lower half of the data. The third quartile is the median of the upper half of the data. When the number of values is odd, the median itself is left out of both halves, as in the example below. The interquartile range, Q3 - Q1, gives the spread of the middle half of the data.

Example: Find Q1, Q2, Q3, and the interquartile range of the following set of data.

5, 10, 8, 9, 10, 2, 5, 6, 11, 15, 8

First, we order the data from smallest to largest: 2, 5, 5, 6, 8, 8, 9, 10, 10, 11, 15. The median is 8, the sixth of the eleven values: the five values 2, 5, 5, 6, 8 fall at or below it and the five values 9, 10, 10, 11, 15 fall above it. The median of the lower half of the data (2, 5, 5, 6, 8) is 5, which is Q1. The median of the upper half of the data (9, 10, 10, 11, 15) is 10, which is Q3. The interquartile range is Q3 - Q1 = 10 - 5 = 5.
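The same calculation can be sketched in Python. This is only one way to do it: the function name quartiles is made up for illustration, and the code assumes the median-of-halves method used in this guide, leaving the median out of both halves when the number of values is odd. It also assumes at least two data values. Statistical software often uses other conventions and may give slightly different quartiles.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]           # values before the middle position
    upper = ordered[half + n % 2:]   # values after the middle position
    return median(lower), median(ordered), median(upper)

data = [5, 10, 8, 9, 10, 2, 5, 6, 11, 15, 8]
q1, q2, q3 = quartiles(data)
print(q1, q2, q3, q3 - q1)  # 5 8 10 5
```

The printed values match the hand calculation: Q1 = 5, Q2 = 8, Q3 = 10, and an interquartile range of 5.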

The quartiles are used together with the maximum and minimum values of a data set to create a box-and-whisker plot, which gives a quick picture of the center and spread of the data. To make a box-and-whisker plot, draw a vertical scale that includes the highest and lowest data values. Then mark Q1, the median, and Q3. Draw a box extending from Q1 to Q3. Draw a line across the box at the median. Finally, draw the whiskers, which are vertical lines from Q3 up to the maximum value and from Q1 down to the minimum value.

In the data set above, the box would extend from 5 to 10. The median line crosses the box at 8. The whiskers are drawn from Q1 down to the lowest data value of 2, and from Q3 up to the largest data value of 15.
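The five values needed to draw the plot (minimum, Q1, median, Q3, maximum) can be collected in one step. The following Python sketch is illustrative only: the function name five_number_summary is made up, it assumes at least two data values, and it uses the same median-of-halves method as above.

```python
from statistics import median

def five_number_summary(data):
    """Return the five values that define a box-and-whisker plot."""
    ordered = sorted(data)
    n = len(ordered)
    lower = ordered[: n // 2]        # lower half, median excluded if n is odd
    upper = ordered[(n + 1) // 2 :]  # upper half, median excluded if n is odd
    return {
        "min": ordered[0],           # end of the lower whisker
        "q1": median(lower),         # bottom of the box
        "median": median(ordered),   # line inside the box
        "q3": median(upper),         # top of the box
        "max": ordered[-1],          # end of the upper whisker
    }

summary = five_number_summary([5, 10, 8, 9, 10, 2, 5, 6, 11, 15, 8])
print(summary)
# {'min': 2, 'q1': 5, 'median': 8, 'q3': 10, 'max': 15}
```

These are the same numbers used in the plot described above: whiskers ending at 2 and 15, a box from 5 to 10, and a median line at 8.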

This guide should help students learn the basics about percentiles, quartiles and constructing box-and-whisker plots.
