Stouffer, S. A. (1936). Evaluating the effect of inadequately measured variables in partial correlation analysis. Journal of the American Statistical Association, 31(194), 348-360.

My sense is that, despite this long history, few people outside of epidemiology, biostatistics, and psychometrics are aware of these issues. So one of the main functions of our paper is to try, yet again, to raise awareness. But for those who are already at least vaguely aware of the issues, I see the novel contributions of this paper as basically three things.

First, while methodologists might be familiar with the idea that this is a theoretical problem that can exist in many contexts, few seem to appreciate the scope and magnitude of the problem in practice. What we clearly show is that, in many entirely realistic research situations, the problem can be pretty damn bad, and it tends to get worse (not better) as sample sizes grow.
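To make the sample-size point concrete, here is a minimal simulation sketch in Python. It is not the simulation from the paper, and every name and parameter value in it (`false_positive_rate`, a reliability of 0.7, a latent effect of 0.5) is an assumption chosen purely for illustration. Two predictors are noisy measures of one and the same latent construct, so at the construct level neither has any incremental validity over the other. We then regress the outcome on both and count how often the coefficient on the first predictor comes out "significant", using a normal approximation to the t-test.

```python
import numpy as np

def false_positive_rate(n, reliability=0.7, effect=0.5, n_sims=2000, seed=0):
    """Share of simulated studies in which x1 'significantly' predicts y
    after controlling for x2, even though x1 and x2 measure the same construct.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_sims):
        # One latent construct drives both predictors and the outcome.
        latent = rng.standard_normal(n)
        noise_sd = np.sqrt(1 - reliability)
        x1 = np.sqrt(reliability) * latent + noise_sd * rng.standard_normal(n)
        x2 = np.sqrt(reliability) * latent + noise_sd * rng.standard_normal(n)
        y = effect * latent + rng.standard_normal(n)

        # Ordinary least squares of y on an intercept, x1, and x2.
        X = np.column_stack([np.ones(n), x1, x2])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        cov_beta = sigma2 * np.linalg.inv(X.T @ X)

        # Two-sided test on the x1 coefficient at alpha = .05,
        # using the normal critical value as an approximation.
        t_stat = beta[1] / np.sqrt(cov_beta[1, 1])
        rejections += abs(t_stat) > 1.96
    return rejections / n_sims

if __name__ == "__main__":
    for n in (50, 200, 500):
        print(n, false_positive_rate(n))
```

Under these assumptions, the rejection rate should sit well above the nominal 5% even at small n and climb toward 1 as n grows, because measurement error leaves residual construct variance in x1 that x2 cannot fully absorb, and a larger sample just gives more power to detect it.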

Second, we point out that there are slightly more subtle forms of the basic incremental validity argument that people haven’t recognized (at least in publication) as suffering from the same general problem. To take the implicit/explicit attitude example from social psychology, we might seek to show that implicit and explicit political attitudes both significantly predict voting intentions, even after controlling for each other. While it is perhaps of some interest that we can predict voting intentions, the more theoretically interesting point (to a social psychologist) is that this would seem to indicate that implicit and explicit attitudes must, in fact, be separable psychological constructs, and not simply two ways of measuring the same thing. But this statistical argument is subject to all the same problems as the more classic incremental validity argument.

Third, we show that while one can perform a more correct statistical analysis that does a good job of controlling the Type 1 error rate, the trade-off is that the Type 2 error rate for this corrected analysis can be extremely high. Basically, if your study sits in a bad part of the parameter space (where the reliability of the predictors is not great and the measured confounds have strong effects), then it is just inherently difficult to make a good statistical case for the incremental validity of predictor constructs.

]]>