April 20, 2004

Specification Testing

We capped our discussion of regressions with an overview of specification testing. Including variables in a regression "controls for" various effects on the dependent (left-hand side) variable. Leaving out a variable that should be in a regression makes it likely that the estimated coefficients for the variables that are in the equation will "pick up" the left-out influence to some extent.

Let us frame the discussion in terms of whether or not we should include a second variable z in a regression that already includes x. Under the alternative hypothesis that z belongs in the equation, the formula for the coefficient for x is pretty complicated, but we were satisfied just to note that it involves the variances of x and z and the covariances among x, z, and the dependent variable. Regression analysis looks at lots of relationships among variables before producing coefficient estimates.

P1010001d.jpg
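To see the "all kinds of covariances" point in numbers, here is a minimal sketch, not the formula from the slide itself. It assumes a regression of a dependent variable y on x, z, and a constant. The name y, the helper coef_x_with_z, and the simulated data are all made up for illustration. It computes the coefficient for x from sample variances and covariances alone, then checks it against an ordinary least-squares fit.

```python
import numpy as np

def coef_x_with_z(y, x, z):
    """Coefficient for x when y is regressed on x, z, and a constant,
    written purely in terms of sample variances and covariances."""
    c = np.cov(np.vstack([y, x, z]))   # 3x3 covariance matrix, order: y, x, z
    s_xy, s_zy = c[1, 0], c[2, 0]      # covariances with the dependent variable
    s_xx, s_zz = c[1, 1], c[2, 2]      # variances of x and z
    s_xz = c[1, 2]                     # covariance between x and z
    return (s_zz * s_xy - s_xz * s_zy) / (s_xx * s_zz - s_xz ** 2)

# Hypothetical data: z influences y directly and is correlated with x.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)
x = 0.6 * z + rng.normal(size=n)
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)

# Reference answer from a least-squares fit on [constant, x, z].
X = np.column_stack([np.ones(n), x, z])
b_lstsq = np.linalg.lstsq(X, y, rcond=None)[0]

b_x_formula = coef_x_with_z(y, x, z)
print(b_x_formula, b_lstsq[1])  # the two numbers should agree
```

Note that the covariance between x and z shows up in both the numerator and the denominator. If that covariance were zero, the expression would collapse to the simple one-variable formula, the covariance of x and y divided by the variance of x.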

If we leave out z by mistake, we will get "left-out variable bias" (more commonly called omitted variable bias) in the coefficient for x. We can analyze this by substituting the true equation into the formula for that coefficient and applying algebra. The end result is that the bias equals the true coefficient for z multiplied by the covariance between x and z divided by the variance of x. The bias is therefore zero if x and z are uncorrelated or if z does not really belong in the equation.

P1010004d.jpg
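The bias result can also be checked with a small simulation. The sketch below uses made-up data with arbitrarily chosen true coefficients; the variable names and the helper ols are illustrative assumptions, not anything from class. It fits the regression with and without z, then compares the change in the coefficient for x to the product of the coefficient for z and cov(x, z) / var(x).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000
beta_x, beta_z = 2.0, 3.0              # assumed "true" coefficients

z = rng.normal(size=n)
x = 0.6 * z + rng.normal(size=n)       # x and z are correlated by construction
y = 1.0 + beta_x * x + beta_z * z + rng.normal(size=n)

def ols(y, *regressors):
    """Least-squares coefficients, with the constant term first."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_long = ols(y, x, z)                  # correct specification: constant, x, z
b_short = ols(y, x)                    # z left out by mistake

c = np.cov(x, z)                       # 2x2 sample covariance matrix of x and z
slope_z_on_x = c[0, 1] / c[0, 0]       # cov(x, z) / var(x)

actual_bias = b_short[1] - b_long[1]   # how far the coefficient for x moved
implied_bias = b_long[2] * slope_z_on_x

print("coefficient for x with z:   ", b_long[1])
print("coefficient for x without z:", b_short[1])
print("actual vs implied bias:     ", actual_bias, implied_bias)
```

In the sample, the two bias numbers agree to rounding error when the estimated coefficient for z is used. With the true coefficient the agreement is only approximate. Here the coefficient for x is overstated because x and z move together and z has a positive effect; flipping the sign of either one would flip the sign of the bias.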

Examples where specification testing could be important:

P1010005d.jpg

P1010008a.jpg

The people who were not in class on Tuesday are required to memorize these equations for the final exam.

Posted by bparke at 08:40 PM