Outliers are data values that are much larger or smaller than other data values in your data set. Outliers can skew basic statistical calculations such as average and standard deviation, so it is common in statistical analysis to identify outliers in the data set before performing statistical analysis.
Video of the Day
Microsoft Excel does not have a built-in Outliers function because there are many different methods used to find outliers. But you can combine two Excel functions to find outliers in your data using one of the most common and simple methods.
We'll use a sample data set of 15 points to demonstrate this. Our data has two values that are much larger than the other values and one that is much smaller than the other values.
Calculate the first and third quartiles of the data using the Excel Quartile function (click here for ). In our example, these are labeled Quartile 1 (Q1) and Quartile 3 (Q3).
Subtract Quartile 1 (Q1) from Quartile 3 (Q3). The result is called the Interquartile Range; in our example this is labeled Interquartile Range (IQR).
Add 1.5 times the Interquartile Range (IQR) to Quartile 3 (Q3); this is labeled Upper Bound. Subtract 1.5 times the Interquartile Range (IQR) from Quartile 1 (Q1); this is labeled Lower Bound.
If the first value in the data set is larger than the Upper Bound or smaller than the Lower Bound then it's an outlier. Use the Excel OR function to compare the first value to the Upper Bound and Lower Bound at the same time.
After the formula is entered next to your first data value, you can simply copy the formula to determine if your other data values are outliers. To copy the formula for all your data values:
- Enter the First Value Outlier formula into the cell to the right of your first data value.
- Highlight the cell where you copied the formula.
- Click the square in the bottom-right corner of the highlighted cell and drag it until all the cells next to your data values are selected, then release the mouse button.
The cell next to each data value displays TRUE or FALSE. The data values to the left of cells displaying TRUE are outliers.