"Semi-Parametric Estimation and Inference for the Mean Outcome of the S" by Oleg Sofrygin and Mark J. van der Laan

U.C. Berkeley Division of Biostatistics Working Paper Series

Title

Semi-Parametric Estimation and Inference for the Mean Outcome of the Single Time-Point Intervention in a Causally Connected Population

Authors

Oleg Sofrygin, Division of Biostatistics, University of California - BerkeleyFollow
Mark J. van der Laan, Division of Biostatistics, University of California - BerkeleyFollow

Abstract

We study the framework for semi-parametric estimation and statistical inference for the sample average treatment-specific mean effects in observational settings where data are collected on a single network of connected units (e.g., in the presence of interference or spillover). Despite recent advances, many of the current statistical methods rely on estimation techniques that assume a particular parametric model for the outcome, even though some of the most important statistical assumptions required by these models are most likely violated in the observational network settings, often resulting in invalid and anti-conservative statistical inference. In this manuscript, we rely on the recent methodological advances for the targeted maximum likelihood estimation (TMLE) of causal effects in a network of causally connected units, to describe an estimation approach that permits for more realistic classes of data-generative models and provides valid statistical inference in the context of network-dependent data. The approach is applied to an observational setting with a single time point stochastic intervention. We start by assuming that the true observed data-generating distribution belongs to a large class of semi-parametric statistical models. We then impose some restrictions on the possible set of the data-generative distributions that may belong to our statistical model. For example, we assume that the dependence among units can be fully described by the known network, and that the dependence on other units can be summarized via some known (but otherwise arbitrary) summary measures. We show that under our modeling assumptions, our estimand is equivalent to an estimand in a hypothetical iid data distribution, where the latter distribution is a function of the observed network data-generating distribution. With this key insight in mind, we show that the TMLE for our estimand in dependent network data can be described as a certain iid data TMLE algorithm, also resulting in a new simplified approach to conducting statistical inference. We demonstrate the validity of our approach in a network simulation study. We also extend prior work on dependent-data TMLE towards estimation of novel causal parameters, e.g., the unit-specific direct treatment effects under interference and the effects of interventions that modify the initial network structure.

Disciplines

Biostatistics

Suggested Citation

Sofrygin, Oleg and van der Laan, Mark J., "Semi-Parametric Estimation and Inference for the Mean Outcome of the Single Time-Point Intervention in a Causally Connected Population" (December 2015). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 344.
https://biostats.bepress.com/ucbbiostat/paper344

Download

Included in

Biostatistics Commons

COinS

Collection of Biostatistics Research Archive

U.C. Berkeley Division of Biostatistics Working Paper Series

Title

Authors

Abstract

Disciplines

Suggested Citation

Included in

Browse

Search

Author Corner

UCB Biostatistics

Collection of Biostatistics Research Archive

U.C. Berkeley Division of Biostatistics Working Paper Series

Title

Authors

Abstract

Disciplines

Suggested Citation

Included in

Share

Browse

Search

Author Corner

UCB Biostatistics