Located on the Harvard Medical Campus, the Department of Biostatistics was one of the first departments in the newly formed Harvard School of Public Health in 1922. Now in its 80th year, the Department comprises 85 students, 57 faculty members, and 22 research associates and fellows. Our size contributes to our ability to address a broad spectrum of biostatistical and public health issues.

Current departmental research on statistical and computing methods for observational studies and clinical trials includes survival analysis, missing-data problems, and causal inference. Other areas of investigation are environmental research (methods for longitudinal studies, analyses with incomplete data, and meta-analysis); statistical aspects of the study of AIDS and cancer; quantitative problems in health-risk analysis, technology assessment, and clinical decision making; statistical methodology in psychiatric research and in genetic studies; Bayesian statistics; statistical computing; statistical genetics and computational biology; and collaborative research activities with biomedical scientists in other Harvard-affiliated institutions.

The Harvard University Biostatistics Working Paper Series presents contributions by our faculty and researchers that rely on the theory and application of statistical science to analyze public health problems.

Papers from 2011

On Causal Mediation Analysis with a Survival Outcome, Eric J. Tchetgen Tchetgen

Semiparametric Estimation of Models for Natural Direct and Indirect Effects, Eric J. Tchetgen Tchetgen and Ilya Shpitser

Semiparametric Theory for Causal Mediation Analysis: efficiency bounds, multiple robustness, and sensitivity analysis, Eric J. Tchetgen Tchetgen and Ilya Shpitser

On the Covariate-adjusted Estimation for an Overall Treatment Difference with Data from a Randomized Comparative Clinical Trial, Lu Tian, Tianxi Cai, Lihui Zhao, and L. J. Wei

Bayesian Effect Estimation Accounting for Adjustment Uncertainty, Chi Wang, Giovanni Parmigiani, and Francesca Dominici

Effectively Selecting a Target Population for a Future Comparative Study, Lihui Zhao, Lu Tian, Tianxi Cai, Brian Claggett, and L. J. Wei

A Regularization Corrected Score Method for Nonlinear Regression Models with Covariate Error, David M. Zucker, Malka Gorfine, Yi Li, and Donna Spiegelman

Papers from 2010

A New Class of Dantzig Selectors for Censored Linear Regression Models, Yi Li, Lee Dicker, and Sihai Dave Zhao

Estimating Causal Effects in Trials Involving Multi-treatment Arms Subject to Non-compliance: A Bayesian Framework, Qi Long, Roderick J. Little, and Xihong Lin

Improving the Power of Chronic Disease Surveillance by Incorporating Residential History, Justin Manjourides and Marcello Pagano

A Perturbation Method for Inference on Regularized Regression Estimates, Jessica Minnier, Lu Tian, and Tianxi Cai

Landmark Prediction of Survival, Layla Parast and Tianxi Cai

Modeling Dependent Gene Expression, Donatello Telesca, Peter Muller, Giovanni Parmigiani, and Ralph S. Freedman

Graphical Procedures for Evaluating Overall and Subject-Specific Incremental Values from New Predictors with Censored Event Time Data, Hajime Uno, Tianxi Cai, Lu Tian, and L. J. Wei

Nonparametric Regression with Missing Outcomes Using Weighted Kernel Estimating Equations, Lu Wang, Andrea Rotnitzky, and Xihong Lin

Powerful SNP Set Analysis for Case-Control Genome Wide Association Studies, Michael C. Wu, Peter Kraft, Michael P. Epstein, Deanne M. Taylor, Stephen J. Chanock, David J. Hunter, and Xihong Lin

Stratifying Subjects for Treatment Selection with Censored Event Time Data from a Comparative Study, Lihui Zhao, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, and L. J. Wei

Utilizing the Integrated Difference of Two Survival Functions to Quantify the Treatment Contrast for Designing, Monitoring and Analyzing a Comparative Clinical Study, Lihui Zhao, Lu Tian, Hajime Uno, Scott D. Solomon, Marc A. Pfeffer, J. S. Schindler, and L. J. Wei

Principled Sure Independence Screening for Cox Models with Ultra-high-dimensional Covariates, Sihai Dave Zhao and Yi Li

Papers from 2009

Lot Quality Assurance Sampling (LQAS) and the Mozambique Malaria Indicator Surveys, Caitlin Biedron, Marcello Pagano, Bethany L. Hedt, Albert Kilian, Amy Ratcliffe, Samuel Mabunda, and Joseph J. Valadez

Analysis of Randomized Comparative Clinical Trial Data for Personalized Treatment Selections, Tianxi Cai, Lu Tian, Peggy H. Wong, and L. J. Wei

Spatial Cluster Detection for Repeatedly Measured Outcomes while Accounting for Residential History, Andrea J. Cook, Diane Gold, and Yi Li

Spatial Cluster Detection for Weighted Outcomes Using Cumulative Geographic Residuals, Andrea J. Cook, Yi Li, David Arterburn, and Ram C. Tiwari

Survival Analysis with Error-prone Time-varying Covariates: A Risk Set Calibration Approach, Xiaomei Liao, David M. Zucker, Yi Li, and Donna Spiegelman

Estimating Subject-Specific Dependent Competing Risk Profile with Censored Event Time Observations, Yi Li, Lu Tian, and L. J. Wei

A New Class of Minimum Power Divergence Estimators with Applications to Cancer Surveillance, Nirian Martin and Yi Li

Marginalized Frailty Models for Multivariate Survival Data, Megan Othus and Yi Li

A Class of Semiparametric Mixture Cure Survival Models with Dependent Censoring, Megan Othus, Yi Li, and Ram C. Tiwari

The Importance of Scale for Spatial-confounding Bias and Precision of Spatial Regression Estimators, Christopher J. Paciorek

Group Comparison of Eigenvalues and Eigenvectors of Diffusion Tensors, Armin Schwartzman, Robert F. Dougherty, and Jonathan E. Taylor

The Effect of Correlation in False Discovery Rate Estimation, Armin Schwartzman and Xihong Lin

On The C-Statistics For Evaluating Overall Adequacy Of Risk Prediction Procedures With Censored Survival Data, Hajime Uno, Tianxi Cai, Michael J. Pencina, Ralph B. D'Agostino, and L. J. Wei

Comparing Risk Scoring Systems Beyond the ROC Paradigm in Survival Analysis, Hajime Uno, Lu Tian, Tianxi Cai, Isaac S. Kohane, and L. J. Wei

Sparse Linear Discriminant Analysis for Simultaneous Testing for the Significance of a Gene Set/Pathway and Gene Selection, Michael C. Wu, Lingson Zhang, Zhaoxi Wang, David C. Christiani, and Xihong Lin

Papers from 2008

Evaluating Subject-level Incremental Values of New Markers for Risk Classification Rule, Tianxi Cai, Lu Tian, Donald M. Lloyd-Jones, and L. J. Wei

Calibrating Parametric Subject-specific Risk Estimation, Tianxi Cai, Lu Tian, Hajime Uno, Scott D. Solomon, and L. J. Wei

A Functional Random Effects Model for Flexible Assessment of Susceptibility in Longitudinal Designs, Brent A. Coull

Estimation of Controlled Direct Effects, Sylvie Goetgeluk, Stijn Vansteelandt, and Els Goetghebeur

A New Class of Rank Tests for Interval-censored Data, Guadalupe Gomez and Ramon Oller Pique

Measurement Error Caused by Spatial Misalignment in Environmental Epidemiology, Alexandros Gryparis, Christopher J. Paciorek, Ariana Zeka, Joel Schwartz, and Brent A. Coull

A Matrix Pooling Algorithm for Disease Detection, Bethany L. Hedt and Marcello Pagano

Matrix Pooling: An Accurate and Cost Effective Testing Algorithm for Detection of Acute HIV Infection, Bethany L. Hedt and Marcello Pagano

Model-based Clustering of Methylation Array Data: A Recursive-partitioning Algorithm for High-dimensional Data Arising as a Mixture of Beta Distributions, E. Andres Houseman, Brock C. Christensen, Ru-Fang Yeh, Carmen J. Marsit, Margaret R. Karagas, Margaret Wrensch, Heather H. Nelson, Joseph Wiemels, Shichun Zheng, John K. Wiencke, and Karl T. Kelsey

A Powerful and Flexible Multilocus Association Test for Quantitative Traits, Lydia Coulter Kwee, Dawei Liu, Xihong Lin, Debashis Ghosh, and Michael P. Epstein

A Comparison of Methods for Estimating the Causal Effect of a Treatment in Randomized Clinical Trials Subject to Noncompliance, Rod Little, Qi Long, and Xihong Lin

Estimation and Testing for the Effect of a Genetic Pathway on a Disease Outcome Using Logistic Kernel Machine Regression via Logistic Mixed Models, Dawei Liu, Debashis Ghosh, and Xihong Lin

Semiparametric Maximum Likelihood Estimation in Normal Transformation Models for Bivariate Survival Data, Yi Li, Ross L. Prentice, and Xihong Lin

Limitations of Remotely-sensed Aerosol as a Spatial Proxy for Fine Particulate Matter, Christopher J. Paciorek and Yang Liu

Expanded Technical Report: Mapping Ancient Forests: Bayesian Inference for Spatio-temporal Trends in Forest Composition Using the Fossil Pollen Proxy Record, Christopher J. Paciorek and Jason S. McLachlan

Practical Large-Scale Spatio-Temporal Modeling of Particulate Matter Concentrations, Christopher J. Paciorek, Jeff D. Yanosky, Robin C. Puett, Francine Laden, and Helen H. Suh

Estimation in Semiparametric Transition Measurement Error Models for Longitudinal Data, Wenqin Pan, Donglin Zeng, and Xihong Lin

Empirical Null and False Discovery Rate Inference for Exponential Families, Armin Schwartzman

The Highest Confidence Density Region and Its Usage for Inferences about the Survival Function with Censored Data, Lu Tian, Rui Wang, Tianxi Cai, and L. J. Wei

Marginal Structural Models for Partial Exposure Regimes, Stijn Vansteelandt, Karl Mertens, Carl Suetens, and Els Goetghebeur

Nonparametric Inference Procedure For Percentiles of the Random Effect Distribution in Meta Analysis, Rui Wang, Lu Tian, Tianxi Cai, and L. J. Wei

Nonparametric Regression Using Local Kernel Estimating Equations for Correlated Failure Time Data, Zhangsheng Yu and Xihong Lin

Papers from 2007

Survival Analysis with Large Dimensional Covariates: An Application in Microarray Studies, David A. Engler and Yi Li

Assessment of a CGH-based Genetic Instability, David A. Engler, Yiping Shen, J. F. Gusella, and Rebecca A. Betensky

Comparing Trends in Cancer Rates Across Overlapping Regions, Yi Li and Ram C. Tiwari

Estimating Time-to-Event From Longitudinal Categorical Data Using Random Effects Markov Models: Application to Multiple Sclerosis Progression, Micha Mandel and Rebecca A. Betensky

Simultaneous Confidence Intervals Based on the Percentile Bootstrap Approach, Micha Mandel and Rebecca A. Betensky

Assessing Population Level Genetic Instability via Moving Average, Samuel McDaniel, Rebecca Betensky, and Tianxi Cai

Spatio-temporal Associations Between GOES Aerosol Optical Depth Retrievals and Ground-Level PM2.5, Christopher J. Paciorek, Yang Liu, Hortensia Moreno-Macias, and Shobha Kondragunta

Conservative Estimation of Optimal Multiple Testing Procedures, James E. Signorovitch

Effectively Combining Independent 2 x 2 Tables for Valid Inferences in Meta Analysis with all Available Data but no Artificial Continuity Corrections for Studies with Zero Events and its Application to the Analysis of Rosiglitazone's Cardiovascular Disease Related Event Data, Lu Tian, Tianxi Cai, Nikita Piankov, Pierre-Yves Cremieux, and L. J. Wei

Identifying patients who need additional biomarkers for better prediction of health outcome or diagnosis of clinical phenotype, Lu Tian, Tianxi Cai, and L. J. Wei

Correcting Instrumental Variables Estimators for Systematic Measurement Error, Stijn Vansteelandt, Manoochehr Babanezhad, and Els Goetghebeur

Papers from 2006

Regression Analysis for the Partial Area Under the ROC Curve, Tianxi Cai and Lori E. Dodd

Predicting Future Responses Based on Possibly Misspecified Working Models, Tianxi Cai, Lu Tian, Scott D. Solomon, and L. J. Wei

Spatial Cluster Detection for Censored Outcome Data, Andrea J. Cook, Diane Gold, and Yi Li

A Computationally Tractable Multivariate Random Effects Model for Clustered Binary Data, Brent A. Coull, E. Andres Houseman, and Rebecca A. Betensky

A Likelihood Based Method for Real Time Estimation of the Serial Interval and Reproductive Number of an Epidemic, Laura Forsberg White and Marcello Pagano

Survival Analysis with Change Point Hazard Functions, Melody S. Goodman, Yi Li, and Ram C. Tiwari

Semiparametric Latent Variable Regression Models for Spatio-temporal Modeling of Mobile Source Particles in the Greater Boston Area, Alexandros Gryparis, Brent A. Coull, Joel Schwartz, and Helen H. Suh

Posterior Simulation in the Generalized Linear Model with Semiparametric Random Effects, Subharup Guha

Bayesian Hidden Markov Modeling of Array CGH Data, Subharup Guha, Yi Li, and Donna Neuberg

Spatio-Temporal Analysis of Areal Data and Discovery of Neighborhood Relationships in Conditionally Autoregressive Models, Subharup Guha and Louise Ryan

PLASQ: A Generalized Linear Model-Based Procedure to Determine Allelic Dosage in Cancer Cells from SNP Array Data, Thomas LaFramboise, David P. Harrington, and Barbara A. Weir

A Comparison of Methods for Estimating the Causal Effect of a Treatment in Randomized Clinical Trials Subject to Noncompliance, Rod Little, Qi Long, and Xihong Lin

Semiparametric Regression of Multi-Dimensional Genetic Pathway Data: Least Squares Kernel Machines and Linear Mixed Models, Dawei Liu, Xihong Lin, and Debashis Ghosh

Causal Inference in Hybrid Intervention Trials Involving Treatment Choice, Qi Long, Rod Little, and Xihong Lin

Selecting 'Significant' Differentially Expressed Genes from the Combined Perspective of the Null and the Alternative, Beatrijs Moerkerke and Els Goetghebeur

An Informative Bayesian Structural Equation Model to Assess Source-Specific Health Effects of Air Pollution, Margaret C. Nikolov, Brent A. Coull, Paul J. Catalano, and John J. Godleski

Mixed Multiplicative Factor Analysis Model for Air Pollution Exposure Assessment, Margaret C. Nikolov, Brent A. Coull, Paul J. Catalano, and John J. Godleski

Bayesian Smoothing of Irregularly-spaced Data Using Fourier Basis Functions, Christopher J. Paciorek

Structural Inference in Transition Measurement Error Models for Longitudinal Data, Wenqin Pan, Xihong Lin, and Donglin Zeng

Estimation in Semiparametric Transition Measurement Error Models for Longitudinal Data, Wenqin Pan, Donglin Zeng, and Xihong Lin

Multiple Testing With an Empirical Alternative Hypothesis, James E. Signorovitch

A Diagnostic Test for the Mixing Distribution in a Generalised Linear Mixed Model, Eric J. Tchetgen and Brent A. Coull

Evaluating Prediction Rules for t-Year Survivors With Censored Regression Models, Hajime Uno, Tianxi Cai, Lu Tian, and L. J. Wei

Using Profile Likelihood for Semiparametric Model Selection with Application to Proportional Hazards Mixed Models, Ronghui Xu, Anthony Gamst, Michael Donohue, Florin Vaida, and David P. Harrington

Nonparametric Regression Using Local Kernel Estimating Equations for Correlated Failure Time Data, Zhangsheng Yu and Xihong Lin

Papers from 2005

The Sensitivity and Specificity of Markers for Event Times, Tianxi Cai, Margaret S. Pepe, Thomas Lumley, Yingye Zheng, and Nancy Swords Jenny

Model Checking for ROC Regression Analysis, Tianxi Cai and Yingye Zheng

A Pseudolikelihood Approach for Simultaneous Analysis of Array Comparative Genomic Hybridizations (aCGH), David A. Engler, Gayatry Mohapatra, David N. Louis, and Rebecca Betensky

Gauss-Seidel Estimation of Generalized Linear Mixed Models with Application to Poisson Modeling of Spatially Varying Disease Rates, Subharup Guha and Louise Ryan

Feature-Specific Penalized Latent Class Analysis for Genomic Data, E. Andres Houseman, Brent A. Coull, and Rebecca A. Betensky

A Nonstationary Negative Binomial Time Series with Time-Dependent Covariates: Enterococcus Counts in Boston Harbor, E. Andres Houseman, Brent Coull, and James P. Shine

Robust Inferences For Covariate Effects On Survival Time With Censored Linear Regression Models, Larry Leon, Tianxi Cai, and L. J. Wei

Semiparametric Estimation in General Repeated Measures Problems, Xihong Lin and Raymond J. Carroll