There are two methods to test for outliers:

1. Using Standard Deviation

A piece of data is considered an outlier if it is more than two standard deviations away from the mean of the data set.

e.g. The mean of a set of heights is 1.54m and the standard deviation is 0.11m.

Is the height 1.25m in the data set an outlier?

2 x standard deviation = 2 x 0.11 = 0.22

1.54 - 0.22 = 1.32

1.25 is smaller than 1.32 and so is more than 2 standard deviations from the mean, making it an outlier.
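
The standard deviation test can also be checked with a short piece of code. The following is only a minimal sketch in Python: the function name is_outlier_sd and its parameter k are made up for illustration, and it assumes the mean and standard deviation have already been calculated.

```python
def is_outlier_sd(value, mean, sd, k=2):
    """Return True if value is more than k standard deviations from the mean.

    is_outlier_sd and k are illustrative names, not part of any library.
    k=2 follows the "two standard deviations" rule described above.
    """
    return abs(value - mean) > k * sd


# Worked example from the text: mean 1.54m, standard deviation 0.11m.
print(is_outlier_sd(1.25, mean=1.54, sd=0.11))  # True: 1.25 is an outlier
print(is_outlier_sd(1.40, mean=1.54, sd=0.11))  # False: within 2 standard deviations
```

Using the absolute difference means the same check covers values that are too far above the mean as well as values that are too far below it.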

2. Using Interquartile Range (IQR)

A piece of data is considered an outlier if it is more than 1.5 times the Interquartile range above the upper quartile (UQ) or below the lower quartile (LQ).

e.g. The lower quartile for a set of data is 6 and the upper quartile is 9.5. Is the value 15 in the data set an outlier?

IQR = UQ – LQ = 9.5 – 6 = 3.5

Multiply the IQR by 1.5: 3.5 x 1.5 = 5.25

Any values more than 5.25 below the lower quartile or more than 5.25 above the upper quartile are outliers.

UQ + 5.25 = 9.5 + 5.25 = 14.75

15 is larger than 14.75 and so is an outlier.
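
The IQR test can be sketched in code in the same way. This is only an illustration in Python: the names iqr_limits and is_outlier_iqr are assumptions made for this example, and it takes the lower and upper quartiles as already known.

```python
def iqr_limits(lq, uq, k=1.5):
    """Return the (lower, upper) limits beyond which a value is an outlier.

    iqr_limits and k are illustrative names; k=1.5 follows the rule above.
    """
    iqr = uq - lq
    return lq - k * iqr, uq + k * iqr


def is_outlier_iqr(value, lq, uq, k=1.5):
    """Return True if value lies below the lower limit or above the upper limit."""
    lower, upper = iqr_limits(lq, uq, k)
    return value < lower or value > upper


# Worked example from the text: LQ = 6, UQ = 9.5.
print(iqr_limits(6, 9.5))          # (0.75, 14.75)
print(is_outlier_iqr(15, 6, 9.5))  # True: 15 is above 14.75
```

The lower limit here is LQ - 5.25 = 6 - 5.25 = 0.75, so any value below 0.75 would also count as an outlier.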

Replacing Anomalies

If you find an outlier in your data you must deal with it and include evidence.  If it is only just an outlier in your test you may choose to leave it in, but you must explain why you have chosen to do this.  If it is a clear outlier you must remove it and replace it with a new person from your original randomised data.  Your scatter graph and calculations will automatically update, so remember to print out any graphs and calculations before deleting to use as evidence in your report.  Do not forget that if you replace a piece of data, this will also need testing (N.B. The standard deviation or
