# Should I check for outliers?

### From PsychWiki - A Collaborative Psychology Wiki


## Revision as of 04:40, 16 February 2008

**Should I check for outliers?**

- YES -- Outliers can make your data non-normal. Because normality is an assumption of many of the statistical tests you will conduct, finding outliers and eliminating their influence may make your data normal, and thus appropriate for analysis with those tests.
- NO -- A value that is extreme compared to the rest of the data is not necessarily an anomaly, or invalid, or something that should be removed. The subject chose to respond with that value, so removing it arbitrarily throws away data simply because it does not fit the “assumption” that data should be “normal”. Conducting research is about discovering empirical reality. If the subject chose to respond with that value, then that value is a reflection of reality, so removing the “outlier” is the antithesis of why you conduct research.
- MAYBE -- One solution is to analyze your data both with and without the outliers, because each analysis gives you different information. Imagine a study that asks the American public how many sexual partners they have had over their lifetime. See the frequency distribution (Image: Sexpartners_histogram0.png) for the findings from this hypothetical study. The average number of sexual partners is 45 when you include the "outliers" who said they have had 100+ sexual partners in their lifetime, but without the "outliers" the average is 7.
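
The "with and without" comparison from the MAYBE point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `responses` list is invented so that its means come out to the 45 and 7 quoted above (it is not the data behind the histogram), and the cutoff of 100 simply mirrors the "100+" wording. The names `summarize` and `OUTLIER_CUTOFF` are hypothetical.

```python
from statistics import mean

# Hypothetical responses, invented for illustration only; they are NOT the
# data from the study described above. Two respondents report 100+ partners.
responses = [2, 3, 5, 6, 7, 8, 9, 10, 13, 150, 282]

# Assumed cutoff, taken from the "100+" wording in the example.
OUTLIER_CUTOFF = 100


def summarize(values, cutoff):
    """Return the mean with and without values at or above the cutoff."""
    kept = [v for v in values if v < cutoff]
    flagged = [v for v in values if v >= cutoff]
    return {
        "mean_with_outliers": mean(values),
        "mean_without_outliers": mean(kept),
        "flagged": flagged,
    }


summary = summarize(responses, OUTLIER_CUTOFF)
print(summary)
```

Reporting both means, along with the values that were flagged, lets a reader see how much the extreme responses drive the result without any data being silently discarded.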
