# What is a scatterplot?

### From PsychWiki - A Collaborative Psychology Wiki

## Latest revision as of 20:56, 7 September 2009

**What is a scatterplot?**

- The first step of any statistical analysis is to plot the data. For a correlation, that plot is a scatterplot. A scatterplot shows visually the strength of the relationship between the variables, the direction of that relationship, and whether outliers exist.
- Below is an example of a scatterplot of two variables, v1 and v2:

  [Image: Scatterplot1.png]
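
For readers who want to draw a comparable plot themselves, here is a minimal Python sketch. It does not use the wiki's data: `make_example_data` invents hypothetical responses on an assumed 1-7 scale, and the function names, file name, and scale limits are illustrative choices only. The plotting helper assumes matplotlib is installed; the numeric helpers use only the standard library.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def make_example_data(n=100, seed=0):
    """Hypothetical v1/v2 responses on an assumed 1-7 scale (not the wiki's data)."""
    rng = random.Random(seed)
    # v1 is centred high on the scale and clipped to the 1-7 range
    v1 = [min(7.0, max(1.0, rng.gauss(5.5, 0.8))) for _ in range(n)]
    # v2 depends weakly on v1 plus noise, then is clipped to the same range
    v2 = [min(7.0, max(1.0, 3.5 + 0.4 * a + rng.gauss(0, 0.8))) for a in v1]
    return v1, v2


def plot_scatter(v1, v2, path="scatterplot.png"):
    """Save a scatterplot of v2 against v1 to `path` (requires matplotlib)."""
    # Imported here so the numeric helpers above work without matplotlib
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(v1, v2)
    ax.set_xlabel("v1")
    ax.set_ylabel("v2")
    # Show the full assumed scale so clustering in one quadrant is visible
    ax.set_xlim(1, 7)
    ax.set_ylim(1, 7)
    fig.savefig(path)
    plt.close(fig)
```

Calling `plot_scatter(*make_example_data())` writes the image to `scatterplot.png`, and `pearson_r(*make_example_data())` gives the correlation for the same points. Fixing the axes to the full scale, rather than letting them zoom to the data, is what lets the plot show where on the scale the responses fall.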

**What is the purpose of graphing the scatterplot?**

- The purpose of graphing the scatterplot is to look at the relationship between the variables and to determine whether there are any problems with the data, or whether the plot shows anything unusual or interesting about the data, such as:
- How is the data dispersed? For example, in the scatterplot above, all the scores appear to be grouped in the top right quadrant: subjects answered both v1 and v2 on the higher part of the scale. What does this imply about the questions and/or data in your study? When discussing these variables in your paper, the size and direction of the correlation alone do not tell the whole story. Presenting only the correlational analysis (e.g., r = .35, p < .001) may hide an interesting distribution of the data from the reader.
- Are there outliers? A scatterplot is useful for "eyeballing" the presence of outliers. Just as a histogram is useful for "eyeballing" univariate outliers, the scatterplot is useful for "eyeballing" bivariate outliers. See "What are outliers?", "How can I tell if outliers are influencing my data?", and "Detecting Outliers - Univariate".
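
The two checks above can be backed up with simple numbers once the plot has raised a question. The sketch below is one possible approach, not a method taken from this page: `share_in_upper_right` reports what fraction of points lie above both scale midpoints, and `flag_bivariate_outliers` uses the squared Mahalanobis distance, a commonly used screen for bivariate outliers. The function names are illustrative, and the default cutoff of 13.82 (the chi-square critical value with 2 degrees of freedom at p = .001) is an assumed convention that you may want to change.

```python
from statistics import mean


def share_in_upper_right(x, y, x_mid, y_mid):
    """Fraction of points above both midpoints (the top right quadrant)."""
    hits = sum(1 for a, b in zip(x, y) if a > x_mid and b > y_mid)
    return hits / len(x)


def mahalanobis_sq(x, y):
    """Squared Mahalanobis distance of each (x, y) point from the bivariate mean.

    Uses the sample covariance matrix (n - 1 denominator). Raises
    ZeroDivisionError if the points are perfectly collinear.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    syy = sum((b - my) ** 2 for b in y) / (n - 1)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    distances = []
    for a, b in zip(x, y):
        dx, dy = a - mx, b - my
        # d' S^-1 d written out for the 2x2 case
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances


def flag_bivariate_outliers(x, y, cutoff=13.82):
    """Indices of points whose squared distance exceeds `cutoff`.

    The default is the chi-square(2 df) critical value at p = .001,
    an assumed convention rather than a rule from this page.
    """
    return [i for i, d in enumerate(mahalanobis_sq(x, y)) if d > cutoff]
```

In small samples the largest possible squared distance is (n - 1)² / n, so with very few cases no point can exceed the default cutoff. A flagged point is a prompt to look at the scatterplot and the raw data again, not a reason to delete the case automatically.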
