How to Use the Outliers Function in Excel
Excel does not have a built-in Outliers function, but you can calculate outliers in your data by combining two built-in Excel functions.
Outliers are data values that are much larger or smaller than other data values in your data set. Outliers can skew basic statistical calculations such as average and standard deviation, so it is common in statistical analysis to identify outliers in the data set before performing statistical analysis.
Microsoft Excel does not have a built-in Outliers function because there are many different methods used to find outliers. But you can combine two Excel functions to find outliers in your data using one of the most common and simple methods.
We'll use a sample data set of 15 points to demonstrate this. Our data has two values that are much larger than the other values and one that is much smaller than the other values.
The calculation to determine outliers only gives accurate results for relatively large data sets with a normal distribution. A small data set is used in this example only to illustrate the calculations. But your results may not be accurate if you have a small data set or if your data does not follow a normal distribution.
Calculate the first and third quartiles of the data using the Excel Quartile function (click here for How to Use the Quartile Function in Excel). In our example, these are labeled Quartile 1 (Q1) and Quartile 3 (Q3).
To calculate the quartiles the data must be sorted from lowest to highest. But the Quartile function automatically sorts the data for you, so you don't need to sort your data before using this function.
Subtract Quartile 1 (Q1) from Quartile 3 (Q3). The result is called the Interquartile Range; in our example this is labeled Interquartile Range (IQR).
Add 1.5 times the Interquartile Range (IQR) to Quartile 3 (Q3); this is labeled Upper Bound. Subtract 1.5 times the Interquartile Range (IQR) from Quartile 1 (Q1); this is labeled Lower Bound.
Outliers are sometimes classified as weak outliers or strong outliers. The multiple of 1.5 x Interquartile Range is typically used to identify what are called weak outliers. To identify strong outliers, change the multiple from 1.5 x Interquartile Range to 3.0 x Interquartile Range.
If the first value in the data set is larger than the Upper Bound or smaller than the Lower Bound then it's an outlier. Use the Excel OR function to compare the first value to the Upper Bound and Lower Bound at the same time.
Here is the complete formula to test if value 1 (A2) is an outlier:
To use this formula in your own Excel spreadsheet, you must:
- Replace the cell value (A2) with the first data value in your range.
- Replace the data range in the formula (A$2:A$16) with the full range of your own data.
For example, if you have 200 rows of data beginning at the first row of column C (C1) and extending to the 200th row of column C (C200), the first data value in your range is C1 and your data range is C$1:C$200. You must modify the formula to the following:
After the formula is entered next to your first data value, you can simply copy the formula to determine if your other data values are outliers. To copy the formula for all your data values:
- Enter the First Value Outlier formula into the cell to the right of your first data value.
- Highlight the cell where you copied the formula.
- Click the square in the bottom-right corner of the highlighted cell and drag it until all the cells next to your data values are selected, then release the mouse button.
The cell next to each data value displays TRUE or FALSE. The data values to the left of cells displaying TRUE are outliers.
Instead of creating a new column to display which cells are outliers, you can also use the formula above with the Excel Conditional Formatting feature to highlight the cells with outliers in a different color or font. You can visit Microsoft Office support to learn how to use a formula to apply conditional formatting.