The Division of Biostatistics is one of five divisions of the School of Public Health at the University of California, Berkeley. Its mission is to promote teaching and research in biostatistical methods by faculty and graduate students. Graduate students are admitted to M.A. and Ph.D. programs through the Group in Biostatistics, which is a joint program of the School of Public Health and the Department of Statistics.
The Biostatistics Working Paper series includes articles on statistical methods and applications developed by faculty and visitors of the Division of Biostatistics. In general, articles dated 2001 and later are downloadable from this site. For earlier articles that have appeared in print, we have included an abstract with a citation. Articles that are not downloadable or are unavailable in print may be requested from:
Nicholas P. Jewell
Chair, Group in Biostatistics
University of California, Berkeley
140 Warren Hall
Berkeley, CA 94720-7360
Papers from 2013
Estimating Effects on Rare Outcomes: Knowledge is Power, Laura B. Balzer and Mark J. van der Laan
Targeted Data Adaptive Estimation of the Causal Dose Response Curve, Iván Díaz and Mark J. van der Laan
An Application Of Machine Learning Methods To The Derivation Of Exposure-Response Curves For Respiratory Outcomes, Ekaterina Eliseeva, Alan E. Hubbard, and Ira B. Tager
Vertically Shifted Mixture Models for Clustering Longitudinal Data by Shape, Brianna C. Heggeseth and Nicholas P. Jewell
Balancing Score Adjusted Targeted Minimum Loss-based Estimation, Samuel D. Lendle, Bruce Fireman, and Mark J. van der Laan
Targeted Maximum Likelihood Estimation for Dynamic and Static Longitudinal Marginal Structural Working Models, Maya L. Petersen, Joshua Schwab, Susan Gruber, Nello Blaser, Michael Schomaker, and Mark J. van der Laan
Subsemble: An Ensemble Method for Combining Subset-Specific Algorithm Fits, Stephanie Sapp, Mark J. van der Laan, and John Canny
Targeted Estimation of Variable Importance Measures with Interval-Censored Outcomes, Stephanie Sapp, Mark J. van der Laan, and Kimberly Page
Papers from 2012
Why Match in Individually and Cluster Randomized Trials?, Laura B. Balzer, Maya L. Petersen, and Mark J. van der Laan
Targeted Learning of The Probability of Success of An In Vitro Fertilization Program Controlling for Time-dependent Confounders, Antoine Chambaz, Sherri Rose, Jean Bouyer, and Mark J. van der Laan
Avoiding Boundary Estimates in Linear Mixed Models Through Weakly Informative Priors, Yeojin Chung, Sophia Rabe-Hesketh, Andrew Gelman, Jingchen Liu, and Vincent Dorie
Optimal Spatial Prediction Using Ensemble Machine Learning, Molly M. Davies and Mark J. van der Laan
Assessing the Causal Effect of Policies: An Approach Based on Stochastic Interventions, Iván Díaz and Mark J. van der Laan
Sensitivity Analysis for Causal Inference Under Unmeasured Confounding and Measurement Error Problems, Iván Díaz and Mark J. van der Laan
The Impact of Covariance Misspecification in Multivariate Gaussian Mixtures on Estimation and Inference: An Application to Longitudinal Modeling, Brianna C. Heggeseth and Nicholas P. Jewell
Computationally Efficient Confidence Intervals for Cross-validated Area Under the ROC Curve Estimates, Erin LeDell, Maya L. Petersen, and Mark J. van der Laan
Targeted Learning for Causality and Statistical Analysis in Medical Research, Sherri Rose, Richard J.C.M. Starmans, and Mark J. van der Laan
Causal Inference for Networks, Mark J. van der Laan
Statistical Inference when using Data Adaptive Estimators of Nuisance Parameters, Mark J. van der Laan
Adaptive Matching in Randomized Trials and Observational Studies, Mark J. van der Laan, Laura Balzer, and Maya L. Petersen
Causal Mediation in a Survival Setting with Time-Dependent Mediators, Wenjing Zheng and Mark J. van der Laan
Papers from 2011
Estimation of a Non-Parametric Variable Importance Measure of a Continuous Exposure, Antoine Chambaz, Pierre Neuvial, and Mark J. van der Laan
Targeted Maximum Likelihood Estimation for Dynamic Treatment Regimes in Sequential Randomized Controlled Trials, Paul Chaffee and Mark J. van der Laan
Targeted Minimum Loss Based Estimation Based on Directly Solving the Efficient Influence Curve Equation, Paul Chaffee and Mark J. van der Laan
Threshold Regression Models Adapted to Case-Control Studies, and the Risk of Lung Cancer Due to Occupational Exposure to Asbestos in France, Antoine Chambaz, Dominique Choudat, Catherine Huber, Jean-Claude Pairon, and Mark J. van der Laan
Estimation and Testing in Targeted Group Sequential Covariate-adjusted Randomized Clinical Trials, Antoine Chambaz and Mark J. van der Laan
Population Intervention Causal Effects Based on Stochastic Interventions, Ivan Diaz Munoz and Mark J. van der Laan
Super Learner Based Conditional Density Estimation with Application to Marginal Structural Models, Ivan Diaz Munoz and Mark J. van der Laan
A Generalized Approach for Testing the Association of a Set of Predictors with an Outcome: A Gene Based Test, Benjamin A. Goldstein, Alan E. Hubbard, and Lisa F. Barcellos
Targeted Minimum Loss Based Estimator that Outperforms a given Estimator, Susan Gruber and Mark J. van der Laan
tmle: An R Package for Targeted Maximum Likelihood Estimation, Susan Gruber and Mark J. van der Laan
Identification and Efficient Estimation of the Natural Direct Effect Among the Untreated, Samuel D. Lendle and Mark J. van der Laan
The Relative Performance of Targeted Maximum Likelihood Estimators, Kristin E. Porter, Susan Gruber, Mark J. van der Laan, and Jasjeet S. Sekhon
GC-Content Normalization for RNA-Seq Data, Davide Risso, Katja Schwartz, Gavin Sherlock, and Sandrine Dudoit
Variable Importance Analysis with the multiPIM R Package, Stephan J. Ritter, Nicholas P. Jewell, and Alan E. Hubbard
A General Implementation of TMLE for Longitudinal Data Applied to Causal Inference in Survival Analysis, Ori M. Stitelman, Victor De Gruttola, and Mark J. van der Laan
Targeted Maximum Likelihood Estimation of Conditional Relative Risk in a Semi-parametric Regression Model, Cathy Tuglus, Kristin E. Porter, and Mark J. van der Laan
Targeted Minimum Loss Based Estimation of an Intervention Specific Mean Outcome, Mark J. van der Laan and Susan Gruber
Targeted Methods for Finding Quantitative Trait Loci, Hui Wang, Sherri Rose, and Mark J. van der Laan
Targeted Maximum Likelihood Estimation of Natural Direct Effect, Wenjing Zheng and Mark J. van der Laan
Papers from 2010
Permutation-based Pathway Testing using the Super Learner Algorithm, Paul Chaffee, Alan E. Hubbard, and Mark J. van der Laan
Targeting The Optimal Design In Randomized Clinical Trials With Binary Outcomes And No Covariate, Antoine Chambaz and Mark J. van der Laan
Targeted Bayesian Learning, Ivan Diaz Munoz, Alan E. Hubbard, and Mark J. van der Laan
A Targeted Maximum Likelihood Estimator of a Causal Effect on a Bounded Continuous Outcome, Susan Gruber and Mark J. van der Laan
Gains in Power from Structured Two-Sample Tests of Means on Graphs, Laurent Jacob, Pierre Neuvial, and Sandrine Dudoit
Observational Study and Individualized Antiretroviral Therapy Initiation Rules for Reducing Cancer Incidence in HIV-Infected Patients, Romain Neugebauer, Michael J. Silverberg, and Mark J. van der Laan
Diagnosing and Responding to Violations in the Positivity Assumption, Maya L. Petersen, Kristin Porter, Susan Gruber, Yue Wang, and Mark J. van der Laan
Super Learner In Prediction, Eric C. Polley and Mark J. van der Laan
Optimizing Randomized Trial Designs to Distinguish which Subpopulations Benefit from Treatment, Michael Rosenblum and Mark J. van der Laan
Simple, Efficient Estimators of Treatment Effects in Randomized Trials Using Generalized Linear Models to Leverage Baseline Variables, Michael Rosenblum and Mark J. van der Laan
Simple Examples of Estimating Causal Effects Using Targeted Maximum Likelihood Estimation, Michael Rosenblum and Mark J. van der Laan
Targeted Maximum Likelihood Estimation of the Parameter of a Marginal Structural Model, Michael Rosenblum and Mark J. van der Laan
The Impact Of Coarsening The Explanatory Variable Of Interest In Making Causal Inferences: Implicit Assumptions Behind Dichotomizing Variables, Ori M. Stitelman, Alan E. Hubbard, and Nicholas P. Jewell
Collaborative Targeted Maximum Likelihood For Time To Event Data, Ori M. Stitelman and Mark J. van der Laan
Targeted Maximum Likelihood Method for Repeated Measures Semiparametric Regression: Discovery for Transcription Factor Activity, Catherine Tuglus and Mark J. van der Laan
Estimation of Causal Effects of Community Based Interventions, Mark J. van der Laan
Targeted Maximum Likelihood Based Causal Inference, Mark J. van der Laan
Asymptotic Theory for Cross-validated Targeted Maximum Likelihood Estimation, Wenjing Zheng and Mark J. van der Laan
Papers from 2009
Evaluation of Statistical Methods for Normalization and Differential Expression in mRNA-Seq Experiments, James H. Bullard, Elizabeth A. Purdom, Kasper D. Hansen, and Sandrine Dudoit
Resampling-Based Multiple Hypothesis Testing with Applications to Genomics: New Developments in the R/Bioconductor Package multtest, Houston N. Gilbert, Katherine S. Pollard, Mark J. van der Laan, and Sandrine Dudoit
Joint Multiple Testing Procedures for Graphical Model Selection with Applications to Biological Networks, Houston N. Gilbert, Mark J. van der Laan, and Sandrine Dudoit
Targeted Maximum Likelihood Estimation: A Gentle Introduction, Susan Gruber and Mark J. van der Laan
Nonparametric population average models: deriving the form of approximate population average models estimated using generalized estimating equations, Alan E. Hubbard and Mark J. van der Laan
Causal Inference in Epidemiological Studies with Strong Confounding, Kelly L. Moore, Romain S. Neugebauer, Mark J. van der Laan, and Ira B. Tager
Application of Time-to-Event Methods in the Assessment of Safety in Clinical Trials, Kelly L. Moore and Mark J. van der Laan
Selecting Optimal Treatments Based on Predictive Factors, Eric C. Polley and Mark J. van der Laan
Causal Inference for Nested Case-Control Studies using Targeted Maximum Likelihood Estimation, Sherri Rose and Mark J. van der Laan
Collaborative Targeted Maximum Likelihood Estimation, Mark J. van der Laan and Susan Gruber
Readings in Targeted Maximum Likelihood Estimation, Mark J. van der Laan, Sherri Rose, and Susan Gruber
A Machine-Learning Algorithm for Estimating and Ranking the Impact of Environmental Risk Factors in Exploratory Epidemiological Studies, Jessica G. Young, Alan E. Hubbard, B. Eskenazi, and Nicholas P. Jewell
Papers from 2008
Data-adaptive Selection Of The Adjustment Set In Variable Importance Estimation, Oliver Bembom, W. Jeffrey Fessel, Robert W. Shafer, and Mark J. van der Laan
Data-adaptive selection of the truncation level for Inverse-Probability-of-Treatment-Weighted estimators, Oliver Bembom and Mark J. van der Laan
Supervised Distance Matrices: Theory and Applications to Genomics, Katherine S. Pollard and Mark J. van der Laan
Confidence Intervals for the Population Mean Tailored to Small Sample Sizes, with Applications to Survey Sampling, Michael Rosenblum and Mark J. van der Laan
Using Regression Models to Analyze Randomized Trials: Asymptotically Valid Hypothesis Tests Despite Incorrectly Specified Models, Michael Rosenblum and Mark J. van der Laan
A Guide to Causal Parameters in Case-Control Designs: Targeted Maximum Likelihood Estimation, Sherri Rose and Mark J. van der Laan
A Note on Risk Prediction for Case-Control Studies, Sherri Rose and Mark J. van der Laan
Why Match? Investigating Matched Case-Control Study Designs with Causal Effect Estimation, Sherri Rose and Mark J. van der Laan
A Small Sample Correction for Estimating Attributable Risk in Case-Control Studies, Daniel B. Rubin
Covariate Adjustment for the Intention-to-Treat Parameter with Empirical Efficiency Maximization, Daniel B. Rubin and Mark J. van der Laan
Doubly Robust Ecological Inference, Daniel B. Rubin and Mark J. van der Laan
Confidence Intervals for Negative Binomial Random Variables of High Dispersion, David Shilane, Alan E. Hubbard, and S. N. Evans
FDR Controlling Procedure for Multi-stage Analyses, Catherine Tuglus and Mark J. van der Laan
Targeted Methods for Biomarker Discovery, the Search for a Standard, Catherine Tuglus and Mark J. van der Laan
Estimation Based on Case-Control Designs with Known Incidence Probability, Mark J. van der Laan
The Construction and Analysis of Adaptive Group Sequential Designs, Mark J. van der Laan
Papers from 2007
Biomarker Discovery Using Targeted Maximum Likelihood Estimation: Application to the Treatment of Antiretroviral Resistant HIV Infection, Oliver Bembom, Maya L. Petersen, Soo-Yon Rhee, W. Jeffrey Fessel, Sandra E. Sinisi, Robert W. Shafer, and Mark J. van der Laan
Analyzing Sequentially Randomized Trials Based on Causal Effect Models for Realistic Individualized Treatment Rules, Oliver Bembom and Mark J. van der Laan
Estimating the Effect of Vigorous Physical Activity on Mortality in the Elderly Based on Realistic Individualized Treatment and Intention-to-Treat Rules, Oliver Bembom and Mark J. van der Laan
The Causal Effect of Recent Leisure-Time Physical Activity on All-Cause Mortality Among the Elderly, Oliver Bembom, Mark J. van der Laan, and Ira B. Tager
Resampling-Based Empirical Bayes Multiple Testing Procedures for Controlling Generalized Tail Probability and Expected Value Error Rates, Sandrine Dudoit, Houston N. Gilbert, and Mark J. van der Laan
Covariate Adjustment in Randomized Trials with Binary Outcomes: Targeted Maximum Likelihood Estimation, Kelly L. Moore and Mark J. van der Laan
Detailed Version: Analyzing Direct Effects in Randomized Trials with Secondary Interventions: An Application to HIV Prevention Trials, Michael A. Rosenblum, Nicholas P. Jewell, Mark J. van der Laan, Stephen Shiboski, Ariane van der Straten, and Nancy Padian
Analyzing Direct Effects in Randomized Trials with Secondary Interventions, Michael Rosenblum, Nicholas P. Jewell, Mark J. van der Laan, Stephen Shiboski, Ariane van der Straten, and Nancy Padian
Empirical Efficiency Maximization, Daniel B. Rubin and Mark J. van der Laan
Loss-Based Estimation with Evolutionary Algorithms and Cross-Validation, David Shilane, Richard H. Liang, and Sandrine Dudoit
Time-Dependent Performance Comparison of Stochastic Optimization Algorithms, David Shilane, Jarno Martikainen, and Seppo Ovaska
Super Learner, Mark J. van der Laan, Eric C. Polley, and Alan E. Hubbard
A Note on Targeted Maximum Likelihood and Right Censored Data, Mark J. van der Laan and Daniel Rubin
Regression Analysis of a Disease Onset Distribution Using Diagnosis Data, Jessica G. Young, Nicholas P. Jewell, and Steven J. Samuels
