Biostatistics creates and applies methods for quantitative research in the health sciences. Our faculty conduct research across the spectrum of statistical science from foundations of inference to the discovery of new methodology to health applications. Our designs and analytic methods enable health scientists and professionals in academia, government, pharmaceutical companies, medical research organizations and elsewhere to efficiently acquire knowledge and draw valid conclusions from their ever-expanding sources of information.
A collection of working papers and related research documents from the department faculty may be found here.
Further information about the department may be found at www.biostat.jhsph.edu.
Papers from 2005
Model Choice in Time Series Studies of Air Pollution and Mortality, Roger D. Peng, Francesca Dominici, and Thomas A. Louis
When Should One Subtract Background Fluorescence in Two Color Microarrays?, Robert B. Scharpf, Christine A. Iacobuzio-Donahue, Julie B. Sneddon, and Giovanni Parmigiani
Estimation and Projection of Incidence and Prevalence Based on Doubly Truncated Data with Application to Pharmacoepidemiological Databases, Henrik Stovring and Mei-Cheng Wang
A Statistical Framework for the Analysis of Microarray Probe-Level Data, Zhijin Wu and Rafael A. Irizarry
Papers from 2004
Quantitative Methods for Tracking Cognitive Change 3 Years After Coronary Artery Bypass Surgery, Sarah Barry; Scott L. Zeger; Ola A. Selnes; Maura A. Grega; Louis M. Borowicz, Jr.; and Guy M. McKhann
Ozone and Mortality: A Meta-Analysis of Time-Series Studies and Comparison to a Multi-City Study (The National Morbidity, Mortality, and Air Pollution Study), Michelle L. Bell, Jonathan M. Samet, and Francesca Dominici
The Genomes of Recombinant Inbred Lines: The Gory Details, Karl W. Broman
A Hypothesis Test for the End of a Common Source Outbreak, Ron Brookmeyer and Xiaojun You
BayesMendel: An R Environment for Mendelian Risk Prediction, Sining Chen, Wenyi Wang, Karl Broman, Hormuzd A. Katki, and Giovanni Parmigiani
Accuracy of MSI Testing in Predicting Germline Mutations of MSH2 and MLH1: A Case Study in Bayesian Meta-Analysis of Diagnostic Tests Without a Gold Standard, Sining Chen, Patrice Watson, and Giovanni Parmigiani
Power and Robustness of Linkage Tests for Quantitative Traits in General Pedigrees, Weimin Chen, Karl Broman, and Kung-Yee Liang
Optimal Sampling Times in Bioequivalence Studies Using a Simulated Annealing Algorithm, Leena Choi, Brian Caffo, and Charles Rohde
MergeMaid: R Tools for Merging and Cross-Study Validation of Gene Expression Data, Leslie Cope, Xiaogang Zhong, Elizabeth S. Garrett-Mayer, and Giovanni Parmigiani
Spatially Adaptive Bayesian P-Splines with Heteroscedastic Errors, Ciprian M. Crainiceanu, David Ruppert, and Raymond J. Carroll
Bayesian Geostatistical Design, Peter J. Diggle and Soren Lophaven
Point Process Methodology for On-line Spatio-temporal Disease Surveillance, Peter J. Diggle, Barry Rowlingson, and Ting-li Su
Estimating Percentile-Specific Causal Effects: A Case Study of Micronutrient Supplementation, Birth Weight, and Infant Mortality, Francesca Dominici, Scott L. Zeger, Giovanni Parmigiani, Joanne Katz, and Parul Christian
The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities, Elizabeth Garrett-Mayer, Steven N. Goodman, and Ralph H. Hruban
Clustering and Classification Methods for Gene Expression Data Analysis, Elizabeth Garrett-Mayer and Giovanni Parmigiani
Cross-study Validation and Combined Analysis of Gene Expression Microarray Data, Elizabeth Garrett-Mayer, Giovanni Parmigiani, Xiaogang Zhong, Leslie Cope, and Edward Gabrielson
Semiparametric Regression in Capture-Recapture Modelling, O. Gimenez, C. Barbraud, Ciprian M. Crainiceanu, S. Jenouvrier, and B.T. Morgan
On Marginalized Multilevel Models and Their Computation, Michael E. Griswold and Scott L. Zeger
Bayesian Hierarchical Distributed Lag Models for Summer Ozone Exposure and Cardio-Respiratory Mortality, Yi Huang, Francesca Dominici, and Michelle L. Bell
Multiple Lab Comparison of Microarray Platforms, Rafael A. Irizarry et al.
Choosing Smoothness Parameters for Smoothing Splines by Minimizing an Estimate of Risk, Rafael A. Irizarry
Inequity Measures for Evaluations of Environmental Justice: A Case Study of Close Proximity to Highways in NYC, Jerry O. Jacobson, Nicolas W. Hengartner, and Thomas A. Louis
Effect of Misreported Family History on Mendelian Mutation Prediction Models, Hormuzd A. Katki
Ranking USRDS Provider-Specific SMRs from 1998-2001, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, and Greg Ridgeway
Screening for Differentially Expressed Genes: Are Multilevel Models Helpful?, Dongmei Liu, Giovanni Parmigiani, and Brian Caffo
Optimal Sample Size for Multiple Testing: the Case of Gene Expression Microarrays, Peter Muller, Giovanni Parmigiani, Christian Robert, and Judith Rousseau
Seasonal Analyses of Air Pollution and Mortality in 100 U.S. Cities, Roger D. Peng, Francesca Dominici, Roberto Pastor-Barriuso, Scott L. Zeger, and Jonathan M. Samet
The National Morbidity, Mortality, and Air Pollution Study Database in R, Roger D. Peng, Leah J. Welty, and Aidan McDermott
A Hierarchical Multivariate Two-Part Model for Profiling Providers' Effects on Healthcare Charges, John W. Robinson, Scott L. Zeger, and Christopher B. Forrest
Studying Effects of Primary Care Physicians and Patients on the Trade-Off Between Charges for Primary Care and Specialty Care Using a Hierarchical Multivariate Two-Part Model, John W. Robinson, Scott L. Zeger, and Christopher B. Forrest
Self-Reported Memory Symptoms with Coronary Artery Disease: A Prospective Study of CABG Patients and Nonsurgical Controls, Ola A. Selnes; Maura A. Grega; Louis M. Borowicz, Jr.; Sarah Barry; Scott L. Zeger; and Guy M. McKhann
Squared Extrapolation Methods (SQUAREM): A New Class of Simple and Efficient Numerical Schemes for Accelerating the Convergence of the EM Algorithm, Ravi Varadhan and Ch. Roland
A Model Based Background Adjustment for Oligonucleotide Expression Arrays, Zhijin Wu, Rafael A. Irizarry, Robert Gentleman, Francisco Martinez Murillo, and Forrest Spencer
A Cox Model for Biostatistics of the Future, Scott L. Zeger, Peter J. Diggle, and Kung-Yee Liang
On Time Series Analysis of Public Health and Biomedical Data, Scott L. Zeger, Rafael A. Irizarry, and Roger D. Peng
Papers from 2003
Time-Series Studies of Particulate Matter, Michelle L. Bell, Jonathan M. Samet, and Francesca Dominici
Modeling the Incubation Period of Anthrax, Ron Brookmeyer, Elizabeth Johnson, and Sarah Barry
Unification of Variance Components and Haseman-Elston Regression for Quantitative Trait Linkage Analysis, Wei-Min Chen, Karl W. Broman, and Kung-Yee Liang
Kernel Estimation of Rate Function for Recurrent Event Data, Chin-Tsang Chiang, Mei-Cheng Wang, and Chiung-Yu Huang
Underestimation of Standard Errors in Multi-Site Time Series Studies, Michael Daniels, Francesca Dominici, and Scott L. Zeger
Smooth Quantile Ratio Estimation, Francesca Dominici, Leslie Cope, Daniel Q. Naiman, and Scott L. Zeger
Hierarchical Bivariate Time Series Models: A Combined Analysis of the Effects of Particulate Matter on Morbidity and Mortality, Francesca Dominici, Antonella Zanobetti, Scott L. Zeger, Joel Schwartz, and Jonathan M. Samet
Smooth Quantile Ratio Estimation with Regression: Estimating Medical Expenditures for Smoking Attributable Diseases, Francesca Dominici and Scott L. Zeger
Checking Assumptions in Latent Class Regression Models via a Markov Chain Monte Carlo Estimation Approach: An Application to Depression and Socio-Economic Status, Elizabeth Garrett, Richard Miech, Pamela Owens, William W. Eaton, and Scott L. Zeger
A Nested Unsupervised Approach to Identifying Novel Molecular Subtypes, Elizabeth Garrett and Giovanni Parmigiani
Joint Modeling and Estimation for Recurrent Event Processes and Failure Time Data, Chiung-Yu Huang and Mei-Cheng Wang
Nonparametric Estimation of the Bivariate Recurrence Time Distribution, Chiung-Yu Huang and Mei-Cheng Wang
Loss Function Based Ranking in Two-Stage, Hierarchical Models, Rongheng Lin, Thomas A. Louis, Susan M. Paddock, and Greg Ridgeway
Uncertainty and the Value of Diagnostic Information With Application to Axillary Lymph Node Dissection in Breast Cancer, Giovanni Parmigiani
Cross-Calibration of Stroke Disability Measures: Bayesian Analysis of Longitudinal Ordinal Categorical Data Using Negative Dependence, Giovanni Parmigiani, Heidi W. Ashih, Gregory P. Samsa, Pamela W. Duncan, Sue Min Lai, and David B. Matchar
Optimization of Breast Cancer Screening Modalities, Yu Shen and Giovanni Parmigiani
Stochastic Models Based on Molecular Hybridization Theory for Short Oligonucleotide Microarrays, Zhijin Wu, Richard LeBlanc, and Rafael A. Irizarry
Papers from 2002
Estimating the Number of Essential Genes in a Genome by Random Transposon Mutagenesis, Natalie J. Blades and Karl W. Broman