The Department of Biostatistics is one of five departments in the School of Public Health and Community Medicine at the University of Washington. Its mission is to serve as a source of expertise and a focus for training and research in the quantitative aspects of public health and medicine, and to promote the use of rigorous quantitative methods in the biomedical and public health sciences.
Our graduate program is regarded as one of the best Biostatistics programs in the world, with over 30 years of teaching and research experience on the UW campus. Faculty interests range over a wide variety of statistical topics, including survival analysis, clinical trials, statistical genetics and correlated data.
The UW Biostatistics Working Paper series includes articles on statistical methods and applications developed by members of the department. In general, articles dated 2000 and later are downloadable from this site.
Submission Guidelines
Please follow the Policies and Procedures page for submission guidelines.
Papers from 2019
Statistical Inference for Networks of High-Dimensional Point Processes, Xu Wang, Mladen Kolar, and Ali Shojaie
Generalized Matrix Decomposition Regression: Estimation and Inference for Two-way Structured Data, Yue Wang, Ali Shojaie, Tim Randolph, and Jing Ma
Papers from 2018
Robust Inference for the Stepped Wedge Design, James P. Hughes, Patrick J. Heagerty, Fan Xia, and Yuqi Ren
Concentrations of criteria pollutants in the contiguous U.S., 1979–2015: Role of model parsimony in integrated empirical geographic regression, Sun-Young Kim, Matthew Bechle, Steve Hankey, Elizabeth (Lianne) A. Sheppard, Adam A. Szpiro, and Julian D. Marshall
Papers from 2017
Predicting Future Years of Life, Health, and Functional Ability: A Healthy Life Calculator for Older Adults, Paula Diehr, Michael Diehr, Alice M. Arnold, Laura Yee, Michelle C. Odden, Calvin H. Hirsch, Stephen Thielke, Bruce Psaty, W Craig Johnson, Jorge Kizer, and Anne B. Newman
Adaptive Non-Inferiority Margins under Observable Non-Constancy, Brett S. Hanscom, Deborah J. Donnell, Brian D. Williamson, and Jim Hughes
Evaluation of multiple interventions using a stepped wedge design, Vivian H. Lyons, Lingyu Li, James Hughes, and Ali Rowhani-Rahbar
Combining Biomarkers by Maximizing the True Positive Rate for a Fixed False Positive Rate, Allison Meisner, Marco Carone, Margaret Pepe, and Kathleen F. Kerr
Biomarker Combinations for Diagnosis and Prognosis in Multicenter Studies: Principles and Methods, Allison Meisner, Chirag R. Parikh, and Kathleen F. Kerr
Developing Biomarker Combinations in Multicenter Studies via Direct Maximization and Penalization, Allison Meisner, Chirag R. Parikh, and Kathleen F. Kerr
Using Multilevel Outcomes to Construct and Select Biomarker Combinations for Single-level Prediction, Allison Meisner, Chirag R. Parikh, and Kathleen F. Kerr
Nonparametric variable importance assessment using machine learning techniques, Brian D. Williamson, Peter B. Gilbert, Noah Simon, and Marco Carone
Papers from 2016
Estimation of long-term area-average PM2.5 concentrations for area-level health analyses, Sun-Young Kim, Casey Olives, Neal Fann, Joel Kaufman, Sverre Vedal, and Lianne Sheppard
Models for HSV shedding must account for two levels of overdispersion, Amalia Magaret
Recommendation to Use Exact P-values in Biomarker Discovery Research, Margaret Sullivan Pepe, Matthew F. Buas, Christopher I. Li, and Garnet L. Anderson
Confidence Intervals for Heritability via Haseman-Elston Regression, Tamar Sofer
A Powerful Statistical Framework for Generalization Testing in GWAS, with Application to the HCHS/SOL, Tamar Sofer, Ruth Heller, Marina Bogomolov, Christy L. Avery, Mariaelisa Graff, Kari E. North, Alex Reiner, Timothy A. Thornton, Kenneth Rice, Yoav Benjamini, Cathy C. Laurie, and Kathleen F. Kerr
Papers from 2015
Historical Prediction Modeling Approach for Estimating Long-Term Concentrations of PM in Cohort Studies Before the 1999 Implementation of Widespread Monitoring, Sun-Young Kim, Casey Olives, Lianne Sheppard, Paul D. Sampson, Timothy V. Larson, and Joel Kaufman
Stochastic Optimization via Forward Slice, Bob A. Salim and Lurdes Y. T. Inoue
Meta-analysis of genome-wide association studies with correlated individuals: application to the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), Tamar Sofer, John R. Shaffer, Misa Graff, Qibin Qi, Adrienne M. Stilp, Stephanie M. Gogarten, Kari E. North, Carmen R. Isasi, Cathy C. Laurie, and Adam A. Szpiro
Papers from 2014
Testing Gene-Environment Interactions in the Presence of Measurement Error, Chongzhi Di, Li Hsu, Charles Kooperberg, Alex Reiner, and Ross Prentice
Change Point Testing in Logistic Regression Models with Interaction Term, Youyi Fong, Chongzhi Di, and Sallie Permar
Efficiently Identifying Failures using Quantitative Tests, Matrix-Pooling and the EM-Algorithm, Brett Hanscom, Susanne May, and Jim Hughes
Personalized Evaluation of Biomarker Value: A Cost-benefit Perspective, Ying Huang and Eric Laber
A Joint Model for Multistate Disease Processes and Random Informative Observation Times, with Applications to Electronic Medical Records Data, Jane M. Lange, Rebecca A. Hubbard, Lurdes Y. T. Inoue, and Vladimir Minin
Nonparametric Identifiability of Finite Mixture Models with Covariates for Estimating Error Rate without a Gold Standard, Zheyu Wang and Xiao-Hua Zhou
Papers from 2013
Characterizing Expected Benefits of Biomarkers in Treatment Selection, Ying Huang, Eric Laber, and Holly Janes
Statistical Methods for Evaluating and Comparing Biomarkers for Patient Treatment Selection, Holly Janes, Marshall D. Brown, Margaret Pepe, and Ying Huang
Net Reclassification Indices for Evaluating Risk Prediction Instruments: A Critical Review, Kathleen F. Kerr, Zheyu Wang, Holly Janes, Robyn McClelland, Bruce M. Psaty, and Margaret S. Pepe
Prediction of fine particulate matter chemical components for the Multi-Ethnic Study of Atherosclerosis cohort: A comparison of two modeling approaches, Sun-Young Kim, Lianne Sheppard, Silas Bergen, Adam A. Szpiro, Paul D. Sampson, Joel Kaufman, and Sverre Vedal
Issues Related to Combining Multiple Speciated PM2.5 Data Sources in Spatio-Temporal Exposure Models for Epidemiology: The NPACT Case Study, Sun-Young Kim, Lianne Sheppard, Timothy V. Larson, Joel Kaufman, and Sverre Vedal
Multi-state Models for Natural History of Disease, Amy Laird, Rebecca A. Hubbard, and Lurdes Y. T. Inoue
An Evaluation of Inferential Procedures for Adaptive Clinical Trial Designs with Pre-specified Rules for Modifying the Sample Size, Greg P. Levin, Sarah C. Emerson, and Scott S. Emerson
The Net Reclassification Index (NRI): a Misleading Measure of Prediction Improvement with Miscalibrated or Overfit Models, Margaret Pepe, Jin Fang, Ziding Feng, Thomas Gerds, and Jorgen Hilden
Net Reclassification Index: a Misleading Measure of Prediction Improvement, Margaret Sullivan Pepe, Holly Janes, Kathleen F. Kerr, and Bruce M. Psaty
Hypothesis Testing for an Extended Cox Model with Time-Varying Coefficients, Takumi Saegusa, Chongzhi Di, and Ying Qing Chen
Asymptotic and Finite Sample Behavior of Net Reclassification Indices, Zheyu Wang
Papers from 2012
A National Model Built with Partial Least Squares and Universal Kriging and Bootstrap-based Measurement Error Correction Techniques: An Application to the Multi-Ethnic Study of Atherosclerosis, Silas Bergen, Lianne Sheppard, Paul D. Sampson, Sun-Young Kim, Mark Richards, Sverre Vedal, Joel Kaufman, and Adam A. Szpiro
Decline in Health for Older Adults: 5-Year Change in 13 Key Measures of Standardized Health, Paula H. Diehr, Stephen M. Thielke, Anne B. Newman, Calvin H. Hirsch, and Russell Tracy
Borrowing Information Across Populations in Estimating Positive and Negative Predictive Values, Ying Huang, Youyi Fong, John Wei, and Ziding Feng
Fitting and Interpreting Continuous-Time Latent Markov Models for Panel Data, Jane M. Lange and Vladimir N. Minin
Methods for Evaluating Prediction Performance of Biomarkers and Tests, Margaret Pepe and Holly Janes
Testing for improvement in prediction model performance, Margaret S. Pepe, Kathleen F. Kerr, Gary M. Longton, and Zheyu Wang
A Regionalized National Universal Kriging Model Using Partial Least Squares Regression for Estimating Annual PM2.5 Concentrations in Epidemiology, Paul D. Sampson, Mark Richards, Adam A. Szpiro, Silas Bergen, Lianne Sheppard, Timothy V. Larson, and Joel Kaufman
Transitions Among Health States Using 12 Measures of Successful Aging: Results from the Cardiovascular Health Study, Stephen Thielke and Paula Diehr
Papers from 2011
When Does Combining Markers Improve Classification Performance and What Are Implications for Practice?, Aasthaa Bansal and Margaret Sullivan Pepe
Doubly Robust Estimates for Binary Longitudinal Data Analysis with Missing Response and Missing Covariates, Baojiang Chen and Xiao-Hua Zhou
The Importance of Statistical Theory in Outlier Detection, Sarah C. Emerson and Scott S. Emerson
Some Observations on the Wilcoxon Rank Sum Test, Scott S. Emerson
Adaptive Clinical Trial Designs with Pre-specified Rules for Modifying the Sample Size: Understanding Efficient Types of Adaptation, Gregory P. Levin, Sarah C. Emerson, and Scott S. Emerson
A Flexible Spatio-Temporal Model for Air Pollution: Allowing for Spatio-Temporal Covariates, Johan Lindstrom, Adam A. Szpiro, Paul D. Sampson, Lianne Sheppard, Assaf Oron, Mark Richards, and Tim Larson
Semiparametric Estimation of the Covariate-Specific ROC Curve in Presence of Ignorable Verification Bias, Danping Liu and Xiao-Hua Zhou
Evaluating Markers for Treatment Selection Based on Survival Time, Xiao Song and Xiao-Hua Zhou
Non-Homogeneous Markov Process Models with Incomplete Observations: Application to a Dementia Disease Study, Xiao-Hua Zhou and Baojiang Chen
BATE Curve in Assessment of Clinical Utility of Predictive Biomarkers, Xiao-Hua Zhou and Yunbei Ma
Papers from 2010
Panel Count Data Regression with Informative Observation Times, Petra Buzkova
Modification and Improvement of Empirical Likelihood for Missing Response Problem, Gary Chan
Modification and Improvement of Empirical Likelihood for Missing Response Problem, Kwun Chuen Gary Chan
Oracle and Multiple Robustness Properties of Survey Calibration Estimator in Missing Response Problem, Kwun Chuen Gary Chan
On Two-Stage Hypothesis Testing Procedures Via Asymptotically Independent Statistics, James Dai, Charles Kooperberg, Michael L. LeBlanc, and Ross Prentice
On two-stage hypothesis testing procedures via asymptotically independent statistics, James Y. Dai, Charles Kooperberg, Michael LeBlanc, and Ross L. Prentice
Robustness of approaches to ROC curve modeling under misspecification of the underlying probability model, Sean Devlin, Elizabeth Thomas, and Scott S. Emerson
Using the Stages of Change Model to Choose an Optimal Health Marketing Target, Paula Diehr, Peggy A. Hannon, Barbara Pizacani, Mark Forehand, Jeffrey Harris, Hendrika Meischke, Susan J. Curry, Diane P. Martin, and Marcia R. Weaver
Multi-state Life Tables, Equilibrium Prevalence, and Baseline Selection Bias, Paula Diehr and David Yanez
Exploring the Benefits of Adaptive Sequential Designs in Time-to-Event Endpoint Settings, Sarah C. Emerson, Kyle Rudser, and Scott S. Emerson
Bio-Creep in Non-Inferiority Clinical Trials, Siobhan P. Everson-Stewart and Scott S. Emerson
Asymptotic Properties of the Sequential Empirical ROC and PPV Curves, Joseph S. Koopmeiners and Ziding Feng
Optimizing Vaccine Allocation at Different Points in Time During an Epidemic, Laura Matrajt and Ira M. Longini Jr.
Nonparametric and Semiparametric Analysis of Current Status Data Subject to Outcome Misclassification, Victor G. Sal y Rosas and James P. Hughes
Estimates of Information Growth in Longitudinal Clinical Trials, Abigail Shoben, Kyle Rudser, and Scott S. Emerson
Model-Robust Regression and a Bayesian 'Sandwich' Estimator, Adam A. Szpiro, Kenneth M. Rice, and Thomas Lumley
Efficient Measurement Error Correction with Spatially Misaligned Data, Adam A. Szpiro, Lianne Sheppard, and Thomas Lumley
Papers from 2009
Measures to Summarize and Compare the Predictive Capacity of Markers, Wen Gu and Margaret Pepe
Interval Estimation for the Difference in Paired Areas under the ROC Curves in the Absence of a Gold Standard Test, Hsin-Neng Hsieh, Hsiu-Yuan Su, and Xiao-Hua Zhou
Nonparametric and Semiparametric Estimation of the Three Way Receiver Operating Characteristic Surface, Jialiang Li and Xiao-Hua Zhou
A Semi-Parametric Two-Part Mixed-Effects Heteroscedastic Transformation Model for Correlated Right-Skewed Semi-Continuous Data, Huazhen Lin and Xiao-Hua Zhou
Semiparametric Two-Part Models with Proportionality Constraints: Analysis of the Multi-Ethnic Study of Atherosclerosis (MESA), Anna Liu, Richard Kronmal, Xiao-Hua Zhou, and Shuangge Ma
Robustness of Semiparametric Efficiency in Nearly-Correct Models for Two-Phase Samples, Thomas Lumley
Pooled Nucleic Acid Testing to Identify Antiretroviral Treatment Failure during HIV Infection, Susanne May, Anthony Gamst, Richard Haubrich, Constance Benson, and Davey Smith
Pragmatic Estimation of a Spatio-Temporal Air Quality Model With Irregular Monitoring Data, Paul D. Sampson, Adam A. Szpiro, Lianne Sheppard, Johan Lindström, and Joel D. Kaufman
Evaluating Markers for Treatment Selection Based on Survival Time, Xiao Song and Xiao-Hua Zhou
Multiple Imputation Methods for Treatment Noncompliance and Nonresponse in Randomized Clinical Trials, Leslie Taylor and Xiao-Hua (Andrew) Zhou
Relaxing Latent Ignorability in the ITT Analysis of Randomized Studies with Missing Data and Noncompliance, L Taylor and Xiao-Hua Zhou
Papers from 2008
Multiple imputation of timing of mother-to-child transmission of HIV, Elizabeth Brown and Ying Qing Chen
Using Longitudinal Data to Estimate the Effect of Starting to Exercise on the Health of Sedentary Older Adults, Paula Diehr and Calvin Hirsch
Semiparametric and nonparametric methods for evaluating risk prediction markers in case-control studies, Ying Huang and Margaret Pepe
Semiparametric methods for evaluating the covariate-specific predictiveness of continuous markers in matched case-control studies, Ying Huang and Margaret S. Pepe
Accommodating Covariates in ROC Analysis, Holly Janes, Gary M. Longton, and Margaret Pepe
Influence of prediction approaches for spatially-dependent air pollution exposure on health effect estimation, Sun-Young Kim, Lianne Sheppard, and Ho Kim
Estimation and Comparison of Receiver Operating Characteristic Curves, Margaret Pepe, Gary M. Longton, and Holly Janes
Trading Bias for Precision: Decision Theory for Intervals and Sets, Kenneth M. Rice, Thomas Lumley, and Adam A. Szpiro
Estimation for Arbitrary Functionals of Survival, Kyle Rudser, Michael L. LeBlanc, and Scott S. Emerson
Predicting Intra-Urban Variation in Air Pollution Concentrations with Complex Spatio-Temporal Interactions, Adam A. Szpiro, Paul D. Sampson, Lianne Sheppard, Thomas Lumley, Sara D. Adar, and Joel Kaufman
Accounting for Errors from Predicting Exposures in Environmental Epidemiology and Environmental Statistics, Adam A. Szpiro, Lianne Sheppard, and Thomas Lumley
Semiparametric Inferential Procedures for Comparing Multivariate ROC Curves with Interaction Terms, Liansheng Tang and Xiao-Hua Zhou
Synthesis Analysis of Regression Models with a Continuous Outcome, Andrew Zhou, Nan Hu, Guizhou Hu, and Martin Root
Semi-Parametric Maximum Likelihood Estimates for ROC Curves of Continuous-Scale Tests, Xiao-Hua Zhou and Huazhen Lin
Nonparametric Heteroscedastic Transformation Regression Models for Skewed Data with an Application to Health Care Costs, Xiao-Hua Zhou, Huazhen Lin, and Eric Johnson