We capped our discussion of regressions with an overview of specification testing. Including variables in a regression "controls for" various effects on the dependent (left-hand side) variable. Leaving out a variable that should be in a regression makes it likely that the estimated coefficients for the variables that are in the equation will "pick up" the left-out influence to some extent.
Let us frame the discussion in terms of whether or not we should include a variable z alongside x. Under the alternative hypothesis that z belongs in the equation, the formula for the coefficient for x is pretty complicated, but we were satisfied just to note that it involves the covariances among x, z, and the dependent variable, not just the covariance between x and the dependent variable. In that sense, regression analysis looks at the relationships among all the variables before producing coefficient estimates.
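
For reference, here is the standard two-regressor least-squares formula written out, so the covariances involved are visible. The notation is an assumption made for this sketch and is not necessarily what was on the board in class: y is the dependent variable, and b_x and b_z are the coefficients for x and z.

```latex
% Assumed model (notation chosen for this sketch):
%   y = a + b_x x + b_z z + e
% Least-squares estimate of the coefficient for x when z is included:
\hat{b}_x =
  \frac{\operatorname{Var}(z)\,\operatorname{Cov}(x,y)
        - \operatorname{Cov}(x,z)\,\operatorname{Cov}(z,y)}
       {\operatorname{Var}(x)\,\operatorname{Var}(z)
        - \operatorname{Cov}(x,z)^{2}}
% Estimate of the coefficient for x when z is left out:
\tilde{b}_x = \frac{\operatorname{Cov}(x,y)}{\operatorname{Var}(x)}
```

The two expressions coincide when Cov(x,z) = 0, which is one way to see why the covariance between x and z matters.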

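If we leave out z by mistake, we will get "left-out variable bias" (often called omitted variable bias) in the coefficient for x. We can analyze this by substituting the true equation for the dependent variable into the formula for that coefficient and applying algebra. The end result is that the left-out variable bias depends on the covariance between x and z (relative to the variance of x) and on the true coefficient for z. If either one is zero, the bias disappears; otherwise its sign is the sign of their product.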
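
A small simulation can make the bias concrete. The sketch below is only an illustration under made-up assumptions: the true coefficients (2 for x, 3 for z), the correlation of 0.5 between x and z, and the function name `simulate_left_out_variable_bias` are all hypothetical choices, not anything from class. It fits the regression with and without z and compares the shortfall to the true coefficient for z times Cov(x,z)/Var(x).

```python
import numpy as np


def simulate_left_out_variable_bias(n=100_000, beta_x=2.0, beta_z=3.0,
                                    rho=0.5, seed=0):
    """Compare regressions of y on (x, z) and on x alone.

    All parameter values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # x and z are standard normal with correlation rho.
    x = rng.normal(size=n)
    z = rho * x + np.sqrt(1.0 - rho ** 2) * rng.normal(size=n)

    # True equation: z belongs in the regression.
    y = beta_x * x + beta_z * z + rng.normal(size=n)

    # "Long" regression: constant, x, and z.
    X_long = np.column_stack([np.ones(n), x, z])
    coef_long, *_ = np.linalg.lstsq(X_long, y, rcond=None)

    # "Short" regression: z left out by mistake.
    X_short = np.column_stack([np.ones(n), x])
    coef_short, *_ = np.linalg.lstsq(X_short, y, rcond=None)

    # Bias predicted from the true coefficient for z and the sample moments.
    cov_xz = np.cov(x, z)[0, 1]
    var_x = np.var(x, ddof=1)
    predicted_bias = beta_z * cov_xz / var_x

    return {
        "long_coef_x": coef_long[1],
        "short_coef_x": coef_short[1],
        "predicted_bias": predicted_bias,
    }


if __name__ == "__main__":
    result = simulate_left_out_variable_bias()
    for name, value in result.items():
        print(f"{name}: {value:.3f}")
```

With these settings the coefficient for x in the short regression should land well above the true value of 2, by roughly the predicted amount, while the long regression should recover something close to 2. Setting `rho=0` or `beta_z=0` should make the gap shrink toward zero.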

Examples where specification testing could be important:


The people who were not in class on Tuesday are required to memorize these equations for the final exam.
Posted by bparke at April 20, 2004 08:40 PM