## Tuesday, April 23, 2013

One way to measure the spread of data about the mean is the standard deviation. If the spread is small, the standard deviation is small. If the spread is large, the standard deviation is large. For a bell-shaped curve, the proportion of the data that falls within a given number of standard deviations of the mean is known precisely. But what can be said about this proportion for other distributions, whether skewed, symmetric, or some other shape? Chebyshev's Theorem answers this question.

The basis of Chebyshev's Theorem is that for any data set, whether a population or a sample and whatever the shape of its distribution, the proportion of data that lies within k standard deviations of the mean is at least 1 - 1/k², for any k greater than 1. For k = 2, the proportion is 1 - 1/4 = .75. For k = 3, the proportion is 1 - 1/9 = .889. For k = 4, the proportion is 1 - 1/16 = .938. These results mean that at least 75% of the data must fall within 2 standard deviations of the mean, at least 88.9% must fall within 3 standard deviations of the mean, and at least 93.8% must fall within 4 standard deviations of the mean.
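The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch; the function name `chebyshev_lower_bound` is chosen here for illustration and does not come from any library.

```python
def chebyshev_lower_bound(k: float) -> float:
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 1:
        # For k <= 1 the formula gives 0 or a negative number, which says nothing useful.
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2


if __name__ == "__main__":
    for k in (2, 3, 4):
        # Print the guaranteed minimum proportion as a percentage.
        print(f"k = {k}: at least {chebyshev_lower_bound(k):.1%} of the data")
```

Note that k does not have to be a whole number. For example, k = 1.5 gives a bound of 1 - 1/2.25, or about 55.6%.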

Many distributions will have much greater percentages of the data falling within the specified intervals. For example, in the well-known normal distribution, which is bell-shaped, about 95% of the data falls within 2 standard deviations of the mean, about 99.7% falls within 3 standard deviations of the mean, and virtually 100% falls within 4 standard deviations of the mean.
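To see how conservative Chebyshev's bound is next to the normal distribution, the two can be put side by side. The sketch below assumes the standard result that the proportion of a normal distribution within k standard deviations of the mean is erf(k/√2). The helper names `normal_within` and `chebyshev_lower_bound` are made up for this illustration.

```python
import math


def chebyshev_lower_bound(k: float) -> float:
    """Guaranteed minimum proportion within k standard deviations (any distribution)."""
    return 1 - 1 / k**2


def normal_within(k: float) -> float:
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


if __name__ == "__main__":
    print("k   Chebyshev (at least)   Normal")
    for k in (2, 3, 4):
        print(f"{k}   {chebyshev_lower_bound(k):.3f}                  {normal_within(k):.5f}")
```

The normal column is always at least as large as the Chebyshev column, because the theorem gives a floor that must hold for every distribution, not only the bell-shaped one.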

Here's an example using Chebyshev's Theorem.

Suppose students at a local college volunteer to work on community projects, such as cleaning parks, renovating playgrounds, and planting trees. A professor in charge of the program kept track of the time in hours that each student worked. Suppose a random sample of x students in the program was picked, and the mean time the students worked was 24.5 hours, with a standard deviation of 1.4 hours. From this information we can find intervals that contain the hours worked by at least 75%, 88.9%, and 93.8% of the students.

The interval in which at least 75% worked is 24.5 +/- 2(1.4) = 21.7 to 27.3 hours.
The interval in which at least 88.9% worked is 24.5 +/- 3(1.4) = 20.3 to 28.7 hours.
The interval in which at least 93.8% worked is 24.5 +/- 4(1.4) = 18.9 to 30.1 hours.
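The same intervals can be reproduced in Python from the mean and standard deviation given in the example. This is a small sketch, and the function name `chebyshev_interval` is chosen here purely for illustration.

```python
def chebyshev_interval(mean: float, sd: float, k: float) -> tuple[float, float, float]:
    """Return (low, high, minimum proportion) for mean +/- k standard deviations."""
    low = mean - k * sd
    high = mean + k * sd
    proportion = 1 - 1 / k**2  # Chebyshev's guaranteed minimum
    return low, high, proportion


if __name__ == "__main__":
    mean_hours = 24.5  # sample mean from the example
    sd_hours = 1.4     # sample standard deviation from the example
    for k in (2, 3, 4):
        low, high, proportion = chebyshev_interval(mean_hours, sd_hours, k)
        print(f"At least {proportion:.1%} worked between {low:.1f} and {high:.1f} hours")
```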

This guide should help students better understand Chebyshev's Theorem and how it can be applied.