Publication

On Omitted Variables, Proxies, and Unobserved Effects in Empirical Regression Analysis

Abstract

"Any result from regression analysis may be subject to omitted variable bias of unknown magnitude and direction as, in practice, no dataset contains all the variables of the population model. At the same time, many variables are irrelevant and don’t contribute to the analysis. This paper explores which combination of data sources or structures will produce the best results and should be made available to the research community. We present a unified statistical framework that nests and comparable sets of constraints that characterize the effectiveness of these approaches in reducing omitted variable bias. We demonstrate our framework by estimating a wage and labor market transition model using German administrative data with a large set of linked survey variables. Overall, we find that unobserved effects panel data models with a restricted set of regressors are preferable to cross-sectional analysis with an extended set of variables. Consequently, we recommend that data providers supply administrative panel data for key variables instead of conducting extensive cross-sectional surveys." (Author's abstract, IAB-Doku)
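
Illustration

The contrast the abstract draws between cross-sectional regression and unobserved effects panel models can be sketched with a small simulation. This is not the authors' framework and does not use their German administrative data. The data-generating process, the function names (simulate_panel, pooled_slope, within_slope) and all parameter values are assumptions chosen purely for illustration. An unobserved unit effect a is correlated with the regressor x, so a pooled regression of y on x suffers omitted variable bias, while demeaning within units (the fixed effects or "within" transformation) removes a and recovers the true slope.

```python
import random


def simulate_panel(n_units=2000, n_periods=5, beta=1.0, seed=0):
    """Simulate y = beta * x + a + e, where the unit effect a is unobserved
    and correlated with x (hypothetical data-generating process)."""
    rng = random.Random(seed)
    rows = []
    for unit in range(n_units):
        a = rng.gauss(0, 1)  # unobserved, time-constant unit effect
        for _ in range(n_periods):
            x = 0.8 * a + rng.gauss(0, 1)  # regressor correlated with a
            y = beta * x + a + rng.gauss(0, 1)
            rows.append((unit, x, y))
    return rows


def slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def pooled_slope(rows):
    """Regression that ignores the panel structure; a is an omitted variable."""
    return slope([x for _, x, _ in rows], [y for _, _, y in rows])


def within_slope(rows):
    """Within estimator: subtract unit means, which removes the unit effect a."""
    totals = {}
    for unit, x, y in rows:
        sum_x, sum_y, count = totals.get(unit, (0.0, 0.0, 0))
        totals[unit] = (sum_x + x, sum_y + y, count + 1)
    x_dem = [x - totals[u][0] / totals[u][2] for u, x, _ in rows]
    y_dem = [y - totals[u][1] / totals[u][2] for u, _, y in rows]
    return slope(x_dem, y_dem)


rows = simulate_panel()
pooled = pooled_slope(rows)   # biased upward: a is omitted and Cov(x, a) > 0
within = within_slope(rows)   # close to the true beta of 1.0
print(f"pooled slope: {pooled:.3f}")
print(f"within slope: {within:.3f}")
```

Under these assumed parameters, the pooled slope converges to beta + Cov(x, a) / Var(x) = 1 + 0.8 / 1.64, roughly 1.49, whereas the within slope converges to the true value of 1. The sketch only covers a time-constant unobserved effect; it says nothing about proxies or time-varying omitted variables, which the paper also considers.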

Cite article

Du, S., Wilke, R. & Homrighausen, P. (2025): On Omitted Variables, Proxies, and Unobserved Effects in Empirical Regression Analysis. In: Journal of Official Statistics, p. 1-20. DOI:10.1177/0282423x241312644