For likelihood-based inferences from data with missing values, Rubin (1976) showed that the missing-data mechanism can be ignored when (a) the missing data are missing at random (MAR), in the sense that missingness does not depend on the missing values after conditioning on the observed data, and (b) the parameters of the data model and the missing-data mechanism are distinct; that is, there are no a priori ties, via parameter space restrictions or prior distributions, between the parameters of the data model and the parameters of the model for the mechanism. Rubin described (a) and (b) as the "weakest simple and general conditions under which it is always appropriate to ignore the process that causes missing data". However, these conditions are sufficient but not always necessary. Moreover, they relate to the complete set of parameters in the model, whereas we argue that it would be useful to have definitions of MAR and ignorability for a subset of parameters of substantive interest. We propose such definitions and apply them to a variety of examples where the missing-data mechanism is missing not at random, but is MAR or ignorable for the parameter subset.
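
As a sketch of why conditions (a) and (b) suffice, the argument can be written out in notation that is assumed here for illustration and does not appear in the abstract: Y = (Y_obs, Y_mis) for the complete data split into observed and missing parts, M for the missingness indicators, theta for the data-model parameters, and psi for the mechanism parameters.

```latex
% Assumed notation (not from the source): Y = (Y_obs, Y_mis),
% M = missingness indicators, theta = data-model parameters,
% psi = missing-data mechanism parameters.
\begin{align*}
\text{(a) MAR:}\quad
  & f(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi)
    = f(M \mid Y_{\mathrm{obs}}, \psi)
    \quad \text{for all } Y_{\mathrm{mis}} \text{ and } \psi, \\[4pt]
\text{full likelihood:}\quad
  & L(\theta, \psi \mid Y_{\mathrm{obs}}, M)
    \propto \int f(Y_{\mathrm{obs}}, Y_{\mathrm{mis}} \mid \theta)\,
      f(M \mid Y_{\mathrm{obs}}, Y_{\mathrm{mis}}, \psi)\, dY_{\mathrm{mis}} \\
  & \qquad = f(M \mid Y_{\mathrm{obs}}, \psi)\, f(Y_{\mathrm{obs}} \mid \theta)
    \quad \text{under (a)}, \\[4pt]
\text{ignorable likelihood:}\quad
  & L_{\mathrm{ign}}(\theta \mid Y_{\mathrm{obs}})
    \propto f(Y_{\mathrm{obs}} \mid \theta).
\end{align*}
```

Here the MAR condition is evaluated at the realized values of M and Y_obs. Under (a) the mechanism factor comes outside the integral, and under (b) it carries no information about theta, so likelihood-based inference for theta can proceed from the ignorable likelihood alone.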


Applied Statistics | Biometry | Biostatistics | Statistics and Probability