How do I know when it is appropriate to create a composite?
Beyond How do I create a composite of items? and Why create a composite of items?, a related question is how do I know when it is *appropriate* to create a composite given my data? You can answer that question from a theoretical point of view and from a statistical point of view.
- From a theoretical point of view
- From a theoretical point of view, it is possible that your questions do not all measure the same construct, and thus it is inappropriate to average them together. For example, the face content of the items may indicate that they measure different concepts.
- Imagine questions designed to measure your political group orientation. A question about whether you "think" of yourself as a Republican or Democrat may tap a different construct than a question about whether you "feel" like a Republican or Democrat. You need to examine your questions and decide whether it is appropriate to average the items together, based on the construct measured by each item.
- Another option is to create separate composites, one for each concept that is measured.
- For example, you might composite together all the questions about how you "feel" about your political group membership, and create another composite of the questions about how you "think" of your political group membership. After creating the separate composites, you can then also merge all the questions together (that is, merge the separate composites together) into one big composite. In this case, the separate composites you merged together are called the "sub-parts" or "sub-factors" of the big composite. Also, from a theoretical point of view, you need to decide how to label or characterize this big composite. (See the sketch after this list for one way to compute these composites.)
- It is acceptable to create composites from a theoretical point of view even when it is not appropriate from a statistical point of view. Below is a discussion of the benchmarks for deciding whether it is statistically appropriate to merge items together into a composite; even if those benchmarks are not met in your data, it can still be appropriate to merge items together on purely theoretical grounds. However, you must state in your manuscript that the statistical benchmarks were not met, and then explain the theoretical basis for merging the items anyway. (FYI – if the statistical benchmarks are met, researchers rarely explain the theoretical basis for why the items were merged together.)
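If you prefer to work in code rather than in a package like SPSS or SAS, here is a minimal sketch of the averaging described above, written in Python with pandas. The DataFrame, column names, and responses are hypothetical placeholders standing in for the "think" and "feel" questions discussed on this page, not data from the page itself.

```python
import pandas as pd

# Hypothetical item responses (rows = participants, columns = items).
df = pd.DataFrame({
    "think_rep_dem_1": [1, 2, 4, 5],
    "think_rep_dem_2": [2, 2, 5, 4],
    "feel_rep_dem_1":  [1, 3, 4, 5],
    "feel_rep_dem_2":  [2, 3, 5, 5],
})

# Separate composites: one for the "think" items, one for the "feel" items.
df["think_composite"] = df[["think_rep_dem_1", "think_rep_dem_2"]].mean(axis=1)
df["feel_composite"]  = df[["feel_rep_dem_1", "feel_rep_dem_2"]].mean(axis=1)

# One big composite that averages all four items; the two composites above
# would then be described as its "sub-parts" or "sub-factors".
item_cols = ["think_rep_dem_1", "think_rep_dem_2",
             "feel_rep_dem_1", "feel_rep_dem_2"]
df["political_orientation"] = df[item_cols].mean(axis=1)

print(df[["think_composite", "feel_composite", "political_orientation"]])
```

The same averaging can just as easily be done in SPSS or SAS; the point is simply that each composite is the mean of its items, and the big composite is the mean of all the items.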
- From a statistical point of view
- From a statistical point of view, it is possible your questions do not measure the same construct, and thus it is inappropriate to average them together. For example, you can use "Factor Analysis" to determine whether the items fall into one big composite (called a "factor") or into separate sub-factors. Alternatively, researchers decide how the items group together from a theoretical point of view, and then test that judgment by conducting "Reliability Analysis", which provides a benchmark for determining whether the items group together statistically. In other words, Reliability Analysis is called a "confirmatory" test because it confirms your decisions, whereas Factor Analysis is typically considered an "exploratory" test because it is used to explore which, if any, of the items group together into which factors or sub-factors.
- Standard statistical packages like SPSS and SAS can perform both Reliability Analysis and Factor Analysis. In Reliability Analysis, the output is usually Cronbach's Alpha, which is a measure of the internal consistency among the items.
- Cronbach's Alpha ranges from 0 to 1, and the higher the number, the more strongly the items group together statistically. Any alpha greater than .70 is generally suitable, from a statistical point of view, for creating a composite of the items. More specifically, alphas above .9 are great, above .8 are good, above .7 are okay, and above .6 are borderline acceptable. Also, the smaller the sample, the more likely you are to find lower alpha levels, because there is less data with which to identify intercorrelations. In smaller samples, lower alpha levels are acceptable for creating composites. (See the sketch below for one way to compute Cronbach's Alpha.)
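As a rough illustration of what a Reliability Analysis computes, here is a minimal sketch of Cronbach's Alpha in Python with pandas, using the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of the total score). The `items` DataFrame and its responses are hypothetical placeholders, not output from SPSS or SAS.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Internal consistency of a set of items (one column per item)."""
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical responses to four items (rows = participants, columns = items).
items = pd.DataFrame({
    "item1": [1, 2, 4, 5, 3],
    "item2": [2, 2, 5, 4, 3],
    "item3": [1, 3, 4, 5, 2],
    "item4": [2, 3, 5, 5, 3],
})

alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
# Compare against the benchmarks above (.9 great, .8 good, .7 okay, .6 borderline).
```

If you also want an exploratory look at how the items load onto factors, one possible tool (an assumption on our part, since this page only names SPSS and SAS) is scikit-learn's FactorAnalysis, reusing the `items` DataFrame from the sketch above:

```python
from sklearn.decomposition import FactorAnalysis

# Explore whether the items load onto one factor or separate sub-factors.
fa = FactorAnalysis(n_components=2, random_state=0)
fa.fit(items)
print(fa.components_)   # loadings of each item on the two factors
```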