Abstract

In regression analysis with a continuous and positive dependent variable, a multiplicative relationship between the unlogged dependent variable and the independent variables is often specified. The model can then be estimated on its unlogged or its logged form. The two procedures may yield major differences in estimates, even opposite signs. The reason is that estimation on the unlogged form yields coefficients for the relative arithmetic mean of the unlogged dependent variable, whereas estimation on the logged form gives coefficients for the relative geometric mean of the unlogged dependent variable (or for absolute differences in the arithmetic mean of the logged dependent variable). Because the two forms target different quantities, relative arithmetic versus relative geometric means, their estimated coefficients may vary widely. The first goal of this article is to explain why major divergences in coefficients can occur. Although this is well understood in the statistical literature, it is not widely understood in sociological research, and it is hence of significant practical interest. The second goal is to derive conditions under which divergences will not occur, so that estimation on the logged form gives unbiased estimators for relative arithmetic means. The article establishes five results. First, it derives the necessary and sufficient conditions under which estimation on the logged form gives unbiased estimators for the parameters for the relative arithmetic mean. This requires not only arithmetic mean independence of the unlogged error term but also geometric mean independence. Second, it shows that statistical independence of the error terms from the regressors implies both arithmetic and geometric mean independence of the error terms, and it is hence a sufficient condition for absence of bias. Third, it shows that although statistical independence is a sufficient condition for lack of bias, it is not a necessary one. Fourth, it demonstrates that homoskedasticity of the error terms is neither a necessary nor a sufficient condition for absence of bias. Fifth, it shows that in the semi-logarithmic specification, for a logged error term that has the same qualitative distributional shape at each value of the independent variables (e.g., normal) and arithmetic mean independence but is heteroskedastic, estimation on the logged form gives biased estimators for the parameters for the arithmetic mean (whereas with homoskedasticity, which in this case implies statistical independence, the estimators are unbiased by the second result).
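The divergence described above, including the opposite-sign case, can be illustrated with a small simulation. The following is a minimal sketch and not taken from the article: the function name `log_ratios`, the single binary regressor, and all parameter values are assumptions chosen for illustration. It assumes a logged error term that is normal with mean zero in both groups (mean independence on the logged scale) but with a variance that differs by group (heteroskedasticity). Under these assumptions the log ratio of geometric means equals `beta1`, while the log ratio of arithmetic means equals `beta1 + (var1 - var0) / 2`, by the lognormal mean formula.

```python
import numpy as np

def log_ratios(n=200_000, beta1=-0.2, var0=0.25, var1=1.25, seed=0):
    """Compare relative arithmetic and relative geometric means of y.

    Hypothetical setup: ln y = 1 + beta1 * x + ln(eps), with x in {0, 1} and
    ln(eps) | x ~ Normal(0, var_x), so only the error variance depends on x.
    """
    rng = np.random.default_rng(seed)
    stats = {}
    for x, var in ((0, var0), (1, var1)):
        log_eps = rng.normal(0.0, np.sqrt(var), size=n)
        log_y = 1.0 + beta1 * x + log_eps
        # Log of the arithmetic mean of y, and log of the geometric mean of y
        # (the latter is simply the arithmetic mean of ln y).
        stats[x] = (np.log(np.exp(log_y).mean()), log_y.mean())
    arith = stats[1][0] - stats[0][0]  # what estimation on the unlogged form targets
    geom = stats[1][1] - stats[0][1]   # what estimation on the logged form targets
    return arith, geom

arith, geom = log_ratios()
print(f"log ratio of arithmetic means: {arith:.3f}")
print(f"log ratio of geometric means:  {geom:.3f}")
```

With these illustrative values the theoretical log ratio of arithmetic means is 0.3 and that of geometric means is -0.2, so the two targets differ in sign. Setting `var1` equal to `var0` makes the logged error statistically independent of `x`, and the two quantities then coincide up to sampling noise. A binary regressor is used because both quantities then reduce to group means, which avoids the need for a nonlinear estimator on the unlogged form.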
