Located on the Harvard Medical Campus, the Department of Biostatistics was one of the first departments in the newly formed Harvard School of Public Health in 1922. Now in its 80th year, the Department comprises 85 students, 57 faculty members, and 22 research associates and fellows. Our size contributes to our ability to address a broad spectrum of biostatistical and public health issues.
Current departmental research on statistical and computing methods for observational studies and clinical trials includes survival analysis, missing-data problems, and causal inference. Other areas of investigation are environmental research (methods for longitudinal studies, analyses with incomplete data, and meta-analysis); statistical aspects of the study of AIDS and cancer; quantitative problems in health-risk analysis, technology assessment, and clinical decision making; statistical methodology in psychiatric research and in genetic studies; Bayesian statistics; statistical computing; statistical genetics and computational biology; and collaborative research activities with biomedical scientists in other Harvard-affiliated institutions.
The Harvard University Biostatistics Working Paper Series presents contributions by our faculty and researchers that rely on the theory and application of statistical science to analyze public health problems.
Papers from 2013
Model Averaged Double Robust Estimation, Matthew Cefalu, Francesca Dominici, and Giovanni Parmigiani
Efficient Estimation of Risk Ratios From Clustered Binary Data, Matthew Cefalu and Eric Tchetgen Tchetgen
A General Regression Framework for a Secondary Outcome in Case-control Studies, Eric J. Tchetgen Tchetgen
On the Restricted Mean Event Time in Survival Analysis, Lu Tian, Lihui Zhao, and L. J. Wei
A versatile test for equality of two survival functions based on weighted differences of Kaplan-Meier curves, Hajime Uno, Lu Tian, Brian Claggett, and L. J. Wei
Más-o-menos: A Simple Sign Averaging Method for Discrimination in Genomic Data Analysis, Sihai Dave Zhao, Giovanni Parmigiani, Curtis Huttenhower, and Levi Waldron
Papers from 2012
Treatment Selections using Risk-benefit Profiles Based on Data from Comparative Randomized Clinical Trials with Multiple Endpoints, Brian Claggett, Lu Tian, Davide Castagno, and L. J. Wei
Nonparametric Inference for Meta Analysis with Fixed Unknown, Study-specific Parameters, Brian Claggett, Minge Xie, and Lu Tian
C2BAT: A Novel Method for Association Between Genetic Markers and Multiple Phenotypes, Melissa Naylor and Christoph Lange
Flexible Covariate-adjusted Exact Tests for Randomized Studies, Alisa J. Stephens, Eric J. Tchetgen Tchetgen, and Victor De Gruttola
Locally Efficient Estimation of Marginal Treatment Effects when Outcomes are Correlated: Is the Prize Worth the Chase?, Alisa J. Stephens, Eric J. Tchetgen Tchetgen, and Victor De Gruttola
Formulae for Causal Mediation Analysis in an Odds Ratio Context Without a Normality Assumption for the Continuous Mediator, Eric J. Tchetgen Tchetgen
Inverse Odds Ratio-Weighted Estimation for Causal Mediation Analysis, Eric J. Tchetgen Tchetgen
Multiple-Robust Estimation of an Odds Ratio Interaction, Eric J. Tchetgen Tchetgen
On a Closed-form Doubly Robust Estimator of the Adjusted Odds Ratio for a Binary Exposure, Eric J. Tchetgen Tchetgen
On a Logistic Mixed Model Formulation of a Quadratic Exponential Model for Correlated Binary Outcomes, Eric J. Tchetgen Tchetgen
A Cautionary Note on Specification of the Correlation Structure in Inverse-Probability-Weighted Estimation for Repeated Measures, Eric J. Tchetgen Tchetgen, M. Maria Glymour, Jennifer Weuve, and James Robins
Robust Estimation of Pure/Natural Direct Effects with Mediator Measurement Error, Eric J. Tchetgen Tchetgen and Sheng Hsuan Lin
On Parametrization, Robustness and Sensitivity Analysis in a Marginal Structural Cox Proportional Hazards Model for Point Exposure, Eric J. Tchetgen Tchetgen and James M. Robins
On Identification of Natural Direct Effects when a Confounder of the Mediator is Directly Affected by Exposure, Eric J. Tchetgen Tchetgen and Tyler J. VanderWeele
Robustness of Measures of Interaction to Unmeasured Confounding, Eric J. Tchetgen Tchetgen and Tyler J. VanderWeele
Papers from 2011
Estimating Subject-Specific Treatment Differences for Risk-Benefit Assessment with Competing Risk Event-Time Data, Brian Claggett, Lihui Zhao, Lu Tian, Davide Castagno, and L. J. Wei
Statistical Properties of the Integrative Correlation Coefficient: a Measure of Cross-study Gene Reproducibility, Leslie Cope and Giovanni Parmigiani
Estimation of Degree Mixing Matrices, With Applications to Network Analysis and HIV Prevention Programs, Ravi Goyal, Joseph Blitzstein, and Victor De Gruttola
Multiple Testing of Local Maxima for Detection of Unimodal Peaks in 1D, Armin Schwartzman, Yulia Gavrilov, and Robert J. Adler
Multiple Testing of Local Maxima for Detection of Peaks in ChIP-Seq Data, Armin Schwartzman, Andrew Jaffe, Yulia Gavrilov, and Clifford A. Meyer
Estimation of Risk Ratios in Cohort Studies With Common Outcomes: A Simple and Efficient Two-stage Approach, Eric J. Tchetgen Tchetgen
On Causal Mediation Analysis with a Survival Outcome, Eric J. Tchetgen Tchetgen
Semiparametric Estimation of Models for Natural Direct and Indirect Effects, Eric J. Tchetgen Tchetgen and Ilya Shpitser
Semiparametric Theory for Causal Mediation Analysis: efficiency bounds, multiple robustness, and sensitivity analysis, Eric J. Tchetgen Tchetgen and Ilya Shpitser
On the Covariate-adjusted Estimation for an Overall Treatment Difference with Data from a Randomized Comparative Clinical Trial, Lu Tian, Tianxi Cai, Lihui Zhao, and L. J. Wei
Bayesian Effect Estimation Accounting for Adjustment Uncertainty, Chi Wang, Giovanni Parmigiani, and Francesca Dominici
Effectively Selecting a Target Population for a Future Comparative Study, Lihui Zhao, Lu Tian, Tianxi Cai, Brian Claggett, and L. J. Wei
A Regularization Corrected Score Method for Nonlinear Regression Models with Covariate Error, David M. Zucker, Malka Gorfine, Yi Li, and Donna Spiegelman
Papers from 2010
A New Class of Dantzig Selectors for Censored Linear Regression Models, Yi Li, Lee Dicker, and Sihai Dave Zhao
Estimating Causal Effects in Trials Involving Multi-treatment Arms Subject to Non-compliance: A Bayesian Framework, Qi Long, Roderick J. Little, and Xihong Lin
Improving the Power of Chronic Disease Surveillance by Incorporating Residential History, Justin Manjourides and Marcello Pagano
A Perturbation Method for Inference on Regularized Regression Estimates, Jessica Minnier, Lu Tian, and Tianxi Cai
Landmark Prediction of Survival, Layla Parast and Tianxi Cai
Modeling Dependent Gene Expression, Donatello Telesca, Peter Muller, Giovanni Parmigiani, and Ralph S. Freedman
Graphical Procedures for Evaluating Overall and Subject-Specific Incremental Values from New Predictors with Censored Event Time Data, Hajime Uno, Tianxi Cai, Lu Tian, and L. J. Wei
Nonparametric Regression with Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, and Xihong Lin
Powerful SNP Set Analysis for Case-Control Genome Wide Association Studies, Michael C. Wu, Peter Kraft, Michael P. Epstein, Deanne M. Taylor, Stephen J. Chanock, David J. Hunter, and Xihong Lin
Stratifying Subjects for Treatment Selection with Censored Event Time Data from a Comparative Study, Lihui Zhao, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, and L. J. Wei
Utilizing the Integrated Difference of Two Survival Functions to Quantify the Treatment Contrast for Designing, Monitoring and Analyzing a Comparative Clinical Study, Lihui Zhao, Lu Tian, Hajime Uno, Scott D. Solomon, Marc A. Pfeffer, J. S. Schindler, and L. J. Wei
Principled Sure Independence Screening for Cox Models with Ultra-high-dimensional Covariates, Sihai Dave Zhao and Yi Li
Papers from 2009
Lot Quality Assurance Sampling (LQAS) and the Mozambique Malaria Indicator Surveys, Caitlin Biedron, Marcello Pagano, Bethany L. Hedt, Albert Kilian, Amy Ratcliffe, Samuel Mabunda, and Joseph J. Valadez
Analysis of Randomized Comparative Clinical Trial Data for Personalized Treatment Selections, Tianxi Cai, Lu Tian, Peggy H. Wong, and L. J. Wei
Spatial Cluster Detection for Repeatedly Measured Outcomes while Accounting for Residential History, Andrea J. Cook, Diane Gold, and Yi Li
Spatial Cluster Detection for Weighted Outcomes Using Cumulative Geographic Residuals, Andrea J. Cook, Yi Li, David Arterburn, and Ram C. Tiwari
Survival Analysis with Error-prone Time-varying Covariates: A Risk Set Calibration Approach, Xiaomei Liao, David M. Zucker, Yi Li, and Donna Spiegelman
Estimating Subject-Specific Dependent Competing Risk Profile with Censored Event Time Observations, Yi Li, Lu Tian, and L. J. Wei
A New Class of Minimum Power Divergence Estimators with Applications to Cancer Surveillance, Nirian Martin and Yi Li
Marginalized Frailty Models for Multivariate Survival Data, Megan Othus and Yi Li
A Class of Semiparametric Mixture Cure Survival Models with Dependent Censoring, Megan Othus, Yi Li, and Ram C. Tiwari
The Importance of Scale for Spatial-confounding Bias and Precision of Spatial Regression Estimators, Christopher J. Paciorek
Group Comparison of Eigenvalues and Eigenvectors of Diffusion Tensors, Armin Schwartzman, Robert F. Dougherty, and Jonathan E. Taylor
The Effect of Correlation in False Discovery Rate Estimation, Armin Schwartzman and Xihong Lin
On The C-Statistics For Evaluating Overall Adequacy Of Risk Prediction Procedures With Censored Survival Data, Hajime Uno, Tianxi Cai, Michael J. Pencina, Ralph B. D'Agostino, and L. J. Wei
Comparing Risk Scoring Systems Beyond the ROC Paradigm in Survival Analysis, Hajime Uno, Lu Tian, Tianxi Cai, Isaac S. Kohane, and L. J. Wei
Sparse Linear Discriminant Analysis for Simultaneous Testing for the Significance of a Gene Set/Pathway and Gene Selection, Michael C. Wu, Lingson Zhang, Zhaoxi Wang, David C. Christiani, and Xihong Lin
Papers from 2008
Evaluating Subject-level Incremental Values of New Markers for Risk Classification Rule, Tianxi Cai, Lu Tian, Donald M. Lloyd-Jones, and L. J. Wei
Calibrating Parametric Subject-specific Risk Estimation, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, and L. J. Wei
A Functional Random Effects Model for Flexible Assessment of Susceptibility in Longitudinal Designs, Brent A. Coull
Estimation of Controlled Direct Effects, Sylvie Goetgeluk, Stijn Vansteelandt, and Els Goetghebeur
A New Class of Rank Tests for Interval-censored Data, Guadalupe Gomez and Ramon Oller Pique
Measurement Error Caused by Spatial Misalignment in Environmental Epidemiology, Alexandros Gryparis, Christopher J. Paciorek, Ariana Zeka, Joel Schwartz, and Brent A. Coull
A Matrix Pooling Algorithm for Disease Detection, Bethany L. Hedt and Marcello Pagano
Matrix Pooling: An Accurate and Cost Effective Testing Algorithm for Detection of Acute HIV Infection, Bethany L. Hedt and Marcello Pagano
Model-based Clustering of Methylation Array Data: A Recursive-partitioning Algorithm for High-dimensional Data Arising as a Mixture of Beta Distributions, E. Andres Houseman, Brock C. Christensen, Ru-Fang Yeh, Carmen J. Marsit, Margaret R. Karagas, Margaret Wrensch, Heather H. Nelson, Joseph Wiemels, Shichun Zheng, John K. Wiencke, and Karl T. Kelsey
A Powerful and Flexible Multilocus Association Test for Quantitative Traits, Lydia Coulter Kwee, Dawei Liu, Xihong Lin, Debashis Ghosh, and Michael P. Epstein
A Comparison of Methods for Estimating the Causal Effect of a Treatment in Randomized Clinical Trials Subject to Noncompliance, Rod Little, Qi Long, and Xihong Lin
Estimation and Testing for the Effect of a Genetic Pathway on a Disease Outcome Using Logistic Kernel Machine Regression via Logistic Mixed Models, Dawei Liu, Debashis Ghosh, and Xihong Lin
Semiparametric Maximum Likelihood Estimation in Normal Transformation Models for Bivariate Survival Data, Yi Li, Ross L. Prentice, and Xihong Lin
Limitations of Remotely-sensed Aerosol as a Spatial Proxy for Fine Particulate Matter, Christopher J. Paciorek and Yang Liu
Expanded Technical Report: Mapping Ancient Forests: Bayesian Inference for Spatio-temporal Trends in Forest Composition Using the Fossil Pollen Proxy Record, Christopher J. Paciorek and Jason S. McLachlan
Practical Large-Scale Spatio-Temporal Modeling of Particulate Matter Concentrations, Christopher J. Paciorek, Jeff D. Yanosky, Robin C. Puett, Francine Laden, and Helen H. Suh
Estimation in Semiparametric Transition Measurement Error Models for Longitudinal Data, Wenqin Pan, Donglin Zeng, and Xihong Lin
Empirical Null and False Discovery Rate Inference for Exponential Families, Armin Schwartzman
The Highest Confidence Density Region and Its Usage for Inferences about the Survival Function with Censored Data, Lu Tian, Rui Wang, Tianxi Cai, and L. J. Wei
Marginal Structural Models for Partial Exposure Regimes, Stijn Vansteelandt, Karl Mertens, Carl Suetens, and Els Goetghebeur
Nonparametric Inference Procedure For Percentiles of the Random Effect Distribution in Meta Analysis, Rui Wang, Lu Tian, Tianxi Cai, and L. J. Wei
Nonparametric Regression Using Local Kernel Estimating Equations for Correlated Failure Time Data, Zhangsheng Yu and Xihong Lin
Papers from 2007
Survival Analysis with Large Dimensional Covariates: An Application in Microarray Studies, David A. Engler and Yi Li
Assessment of a CGH-based Genetic Instability, David A. Engler, Yiping Shen, J. F. Gusella, and Rebecca A. Betensky
Comparing Trends in Cancer Rates Across Overlapping Regions, Yi Li and Ram C. Tiwari
Estimating Time-to-Event From Longitudinal Categorical Data Using Random Effects Markov Models: Application to Multiple Sclerosis Progression, Micha Mandel and Rebecca A. Betensky
Simultaneous Confidence Intervals Based on the Percentile Bootstrap Approach, Micha Mandel and Rebecca A. Betensky
Assessing Population Level Genetic Instability via Moving Average, Samuel McDaniel, Rebecca Betensky, and Tianxi Cai
Spatio-temporal Associations Between GOES Aerosol Optical Depth Retrievals and Ground-Level PM2.5, Christopher J. Paciorek, Yang Liu, Hortensia Moreno-Macias, and Shobha Kondragunta
Conservative Estimation of Optimal Multiple Testing Procedures, James E. Signorovitch
Effectively Combining Independent 2 x 2 Tables for Valid Inferences in Meta Analysis with all Available Data but no Artificial Continuity Corrections for Studies with Zero Events and its Application to the Analysis of Rosiglitazone's Cardiovascular Disease Related Event Data, Lu Tian, Tianxi Cai, Nikita Piankov, Pierre-Yves Cremieux, and L. J. Wei
Identifying patients who need additional biomarkers for better prediction of health outcome or diagnosis of clinical phenotype, Lu Tian, Tianxi Cai, and L. J. Wei
Correcting Instrumental Variables Estimators for Systematic Measurement Error, Stijn Vansteelandt, Manoochehr Babanezhad, and Els Goetghebeur
Papers from 2006
Regression Analysis for the Partial Area Under the ROC Curve, Tianxi Cai and Lori E. Dodd
Predicting Future Responses Based on Possibly Misspecified Working Models, Tianxi Cai, Lu Tian, Scott D. Solomon, and L. J. Wei
Spatial Cluster Detection for Censored Outcome Data, Andrea J. Cook, Diane Gold, and Yi Li
A Computationally Tractable Multivariate Random Effects Model for Clustered Binary Data, Brent A. Coull, E. Andres Houseman, and Rebecca A. Betensky
A Likelihood Based Method for Real Time Estimation of the Serial Interval and Reproductive Number of an Epidemic, Laura Forsberg White and Marcello Pagano
Survival Analysis with Change Point Hazard Functions, Melody S. Goodman, Yi Li, and Ram C. Tiwari
