We describe the R package multiPIM, including statistical background, functionality and user options. The package is for variable importance analysis, and is meant primarily for analyzing data from exploratory epidemiological studies, though it can be applied in other areas as well. The approach taken to variable importance comes from the causal inference field, and is different from approaches taken in other R packages. By default, multiPIM uses a double robust targeted maximum likelihood estimator (TMLE) of a parameter akin to the attributable risk. Several regression methods/machine learning algorithms are available for estimating the nuisance parameters of the models, including super learner, a meta-learner that combines several different algorithms into one. We describe a simulation in which the double robust TMLE is compared to the graphical computation estimator. We also provide example analyses using two data sets that are included with the package.
Epidemiology | Numerical Analysis and Computation | Statistical Methodology | Statistical Theory
Ritter, Stephan J.; Jewell, Nicholas P.; and Hubbard, Alan E., "Variable Importance Analysis with the multiPIM R Package" (July 2011). U.C. Berkeley Division of Biostatistics Working Paper Series. Working Paper 286.