Abstract
This paper studies nonparametric series estimation and inference for the effect of a single variable of interest x on an outcome y in the presence of potentially high-dimensional conditioning variables z. The context is an additively separable model E[y|x, z] = g0(x) + h0(z). The model is high-dimensional in the sense that the series of approximating functions for h0(z) can have more terms than the sample size, thereby allowing z to contain a very large number of measured characteristics. The model is required to be approximately sparse: h0(z) can be approximated using only a small subset of series terms whose identities are unknown. This paper proposes an estimation and inference method for g0(x), called Post-Nonparametric Double Selection, which generalizes Post-Double Selection. Standard rates of convergence and asymptotic normality for the estimator are shown to hold uniformly over a large class of sparse data generating processes. A simulation study illustrates the finite-sample estimation properties of the proposed estimator and the coverage properties of the corresponding confidence intervals. Finally, an empirical application estimating convergence in GDP in a country-level cross-section demonstrates the practical implementation of the proposed method.