What are outliers?

From PsychWiki - A Collaborative Psychology Wiki

Stenstro (Talk | contribs)

Revision as of 04:14, 16 February 2008

Outliers are values that are extreme compared with the rest of the data.

What does "extreme" mean?

  1. Deciding which values count as “outliers” is subjective. There are a few benchmarks for determining whether a value is an “outlier”, but their cutoffs are arbitrary conventions, much as the “p<.05” threshold is.
  2. One benchmark is to use a BOXPLOT to identify "mild" and "extreme" outliers. IQR stands for “interquartile range”: the width of the middle 50% of the scores, i.e. the distance between the first and third quartiles (the edges of the box). Mild outliers are scores more than 1.5*IQR, but no more than 3*IQR, beyond the edges of the box, and are indicated by open dots. Extreme outliers are scores more than 3*IQR beyond the edges of the box, and are indicated by stars. The output below is from SPSS for a variable called "system1". A boxplot is a graphical display of the data that shows: (1) the median, which is the middle black line, (2) the middle 50% of scores, which is the shaded region, (3) the top and bottom 25% of scores (excluding outliers), which are the lines extending out of the shaded region, (4) the smallest and largest non-outlier scores, which are the horizontal lines at the top and bottom of the boxplot, and (5) outliers. For this variable, there is 1 mild outlier (subject #52) and 1 extreme outlier (subject #18).
[Image: System1 boxplot0.png, SPSS boxplot of the variable "system1"]
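
The boxplot benchmark can also be checked by hand. The sketch below is a minimal illustration, not the procedure SPSS itself runs: it reads "1.5*IQR" and "3*IQR" as distances beyond the first and third quartiles (the edges of the box), and it assumes the quartile definition of Python's statistics.quantiles with method="inclusive", which can differ slightly from the quartiles other software uses. The function name classify_outliers and the sample scores are made up for the example.

```python
from statistics import quantiles


def classify_outliers(scores):
    """Return (mild, extreme) outliers using the 1.5*IQR and 3*IQR benchmarks."""
    # Quartiles: q1 and q3 are the edges of the box in a boxplot.
    # Assumption: "inclusive" quartiles; other packages may differ slightly.
    q1, _median, q3 = quantiles(scores, n=4, method="inclusive")
    iqr = q3 - q1  # width of the middle 50% of the scores

    # Cutoffs beyond which a score counts as a mild outlier (1.5*IQR).
    mild_low, mild_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Cutoffs beyond which a score counts as an extreme outlier (3*IQR).
    extreme_low, extreme_high = q1 - 3 * iqr, q3 + 3 * iqr

    mild, extreme = [], []
    for x in scores:
        if x < extreme_low or x > extreme_high:
            extreme.append(x)
        elif x < mild_low or x > mild_high:
            mild.append(x)
    return mild, extreme


# Hypothetical scores, invented for illustration only.
scores = [10, 12, 13, 14, 15, 16, 17, 18, 20, 30, 60]
mild, extreme = classify_outliers(scores)
print("mild:", mild, "extreme:", extreme)  # mild: [30] extreme: [60]
```

With these made-up scores the quartiles are 13.5 and 19, so IQR = 5.5; the score 30 lies more than 1.5*IQR above the box (cutoff 27.25) and 60 lies more than 3*IQR above it (cutoff 35.5).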






