We compare spline and kernel methods for clustered/longitudinal data. For independent data, it is well known that kernel methods and spline methods are essentially asymptotically equivalent (Silverman, 1984). However, the recent work of Welsh et al. (2002) shows that the same is not true for clustered/longitudinal data. First, conventional kernel methods fail to account for the within-cluster correlation, while spline methods are able to account for this correlation. Second, kernel methods and spline methods were found to have different local behavior, with conventional kernels being local and splines being non-local. To resolve these differences, we show that a smoothing spline estimator is asymptotically equivalent to the recently proposed seemingly unrelated kernel estimator of Wang (2003) for any working covariance matrix. To gain insight into this asymptotic equivalence, we show that both the seemingly unrelated kernel estimator and the smoothing spline estimator, using any working covariance matrix, can be obtained iteratively by applying conventional kernel or spline smoothing to pseudo-observations. This result allows us to study the asymptotic properties of the smoothing spline estimator by deriving its asymptotic bias and variance. We show that smoothing splines are consistent under an arbitrary working covariance and have the smallest variance when the working covariance equals the true covariance. We further show that both the seemingly unrelated kernel estimator and the smoothing spline estimator are non-local (unless working independence is assumed) but have asymptotically negligible bias. Their finite sample performance is compared through simulations. Our results justify the use of efficient, non-local estimators such as smoothing splines for clustered/longitudinal data.
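The iterative construction mentioned above can be illustrated with a minimal sketch. It is one possible reading of the pseudo-observation idea, not the construction used in the paper: it assumes balanced clusters, a common working covariance V, a local-constant (Nadaraya-Watson) smoother with a Gaussian kernel, and pseudo-observations formed by adding to each response the current residuals of the other cluster members, scaled by the ratio of off-diagonal to diagonal entries of the inverse of V. The function names `nw_smooth` and `sur_kernel_fit` are hypothetical.

```python
import numpy as np


def nw_smooth(x_eval, x, y, w, h):
    """Weighted local-constant kernel smoother (Gaussian kernel, bandwidth h)."""
    k = np.exp(-0.5 * ((x_eval[:, None] - x[None, :]) / h) ** 2) * w[None, :]
    return (k @ y) / k.sum(axis=1)


def sur_kernel_fit(X, Y, V, h, x_grid, n_iter=50):
    """Iterate kernel smoothing on pseudo-observations (illustrative sketch).

    X, Y   : (n_clusters, m) arrays of covariates and responses
    V      : (m, m) working covariance matrix (assumed common to all clusters)
    x_grid : points at which the final estimate is returned
    """
    n, m = X.shape
    Vinv = np.linalg.inv(V)
    d = np.diag(Vinv)            # diagonal entries of the inverse of V
    off = Vinv - np.diag(d)      # off-diagonal entries of the inverse of V
    x_flat = X.ravel()
    w_flat = np.tile(d, n)       # observation (i, j) gets weight d[j]

    # Start from the working-independence fit at the observed design points.
    theta = nw_smooth(x_flat, x_flat, Y.ravel(), np.ones(n * m), h).reshape(n, m)
    Y_star = Y.copy()
    for _ in range(n_iter):
        resid = Y - theta
        # Pseudo-observation: own response plus scaled residuals of cluster mates.
        Y_star = Y + (resid @ off.T) / d
        theta = nw_smooth(x_flat, x_flat, Y_star.ravel(), w_flat, h).reshape(n, m)

    # Evaluate the smoother of the final pseudo-observations on the grid.
    return nw_smooth(x_grid, x_flat, Y_star.ravel(), w_flat, h)
```

With V equal to the identity the off-diagonal terms vanish, the pseudo-observations coincide with the original responses, and the sketch reduces to the conventional working-independence kernel fit, which matches the remark that the estimators are local only under working independence.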


Longitudinal Data Analysis and Time Series | Statistical Methodology | Statistical Models | Statistical Theory