Biostatistics creates and applies methods for quantitative research in the health sciences. Our faculty conduct research across the spectrum of statistical science from foundations of inference to the discovery of new methodology to health applications. Our designs and analytic methods enable health scientists and professionals in academia, government, pharmaceutical companies, medical research organizations and elsewhere to efficiently acquire knowledge and draw valid conclusions from their ever-expanding sources of information.

A collection of working papers and related research documents from the department faculty may be found here.

Further information about the department may be found at www.biostat.jhsph.edu.

Papers from 2008

PDF

A Method for Visualizing Multivariate Time Series Data, Roger D. Peng

PDF

Caching and Distributing Statistical Analyses in R, Roger D. Peng

PDF

Spatial Misalignment in Time Series Studies of Air Pollution and Health Data, Roger D. Peng and Michelle L. Bell

PDF

ANALYSIS OF SUBGROUP EFFECTS IN RANDOMIZED TRIALS WHEN SUBGROUP MEMBERSHIP IS INFORMATIVELY MISSING: APPLICATION TO THE MADIT II STUDY, Daniel O. Scharfstein, Georgiana Onicescu, and Steven Goodman

PDF

ON THE MERITS OF VOXEL-BASED MORPHOMETRIC PATH-ANALYSIS FOR INVESTIGATING VOLUMETRIC MEDIATION OF A TOXICANT'S INFLUENCE ON COGNITIVE FUNCTION, Shu-chih Su, Brian S. Caffo, Lynn E. Eberly, Elizabeth Garrett-Mayer, Walter F. Stewart, Sining Chen, David Yousem, Christos Davatzikos, and Brian Schwartz

PDF

A BAYESIAN APPROACH TO EFFECT ESTIMATION ACCOUNTING FOR ADJUSTMENT UNCERTAINTY, Chi Wang, Giovanni Parmigiani, Ciprian Crainiceanu, and Francesca Dominici

PDF

Estimating the Causal Effect of Lower Tidal Volume Ventilation on Survival in Patients with Acute Lung Injury, Weiwei Wang, Daniel Scharfstein, Roy Brower, and Dale Needham

PDF

Causal Inference in Observational Studies with Outcome-Dependent Sampling, Weiwei Wang, Daniel Scharfstein, Zhiqiang Tan, and Ellen J. MacKenzie

PDF

STATISTICAL METHODS FOR AUTOMATED DRUG SUSCEPTIBILITY TESTING: BAYESIAN MINIMUM INHIBITORY CONCENTRATION PREDICTION FROM GROWTH CURVES, Xi Zhou, Merlise A. Clyde, James Garrett, Viridiana Lourdes, Michael O'Connell, Giovanni Parmigiani, David J. Turner, and Tim Wiles

Papers from 2007

PDF

A BAYESIAN HIERARCHICAL FRAMEWORK FOR SPATIAL MODELING OF fMRI DATA, F. DuBois Bowman, Brian S. Caffo, Susan Spear Bassett, and Clinton Kilts

PDF

FORECASTING THE GLOBAL BURDEN OF ALZHEIMER'S DISEASE, Ron Brookmeyer, Elizabeth Johnson, Kathryn Ziegler-Graham, and H. Michael Arrighi

PDF

IS MRI-BASED VOLUME A MEDIATOR OF THE ASSOCIATION OF CUMULATIVE LEAD DOSE WITH COGNITIVE FUNCTION?, Brian S. Caffo, Sining Chen, Walter Stewart, Karen Bolla, David Yousem, Christos Davatzikos, and Brian S. Schwartz

PDF

A CASE STUDY IN PHARMACOLOGIC IMAGING USING PRINCIPAL CURVES IN SINGLE PHOTON EMISSION COMPUTED TOMOGRAPHY, Brian S. Caffo, Ciprian M. Crainiceanu, Lijuan Deng, and Craig W. Hendrix

PDF

A SURVEY OF THE LIKELIHOOD APPROACH TO BIOEQUIVALENCE TRIALS, Leena Choi, Brian S. Caffo, and Charles Rohde

PDF

RANDOM EFFECTS MODELS IN A META-ANALYSIS OF THE ACCURACY OF DIAGNOSTIC TESTS WITHOUT A GOLD STANDARD IN THE PRESENCE OF MISSING DATA, Haitao Chu, Sining Chen, and Thomas A. Louis

PDF

The Integrative Correlation Coefficient: a Measure of Cross-study Reproducibility for Gene Expression Array Data, Leslie M. Cope, Liz Garrett-Mayer, Edward Gabrielson, and Giovanni Parmigiani

PDF

Bayesian Analysis for Penalized Spline Regression Using WinBUGS, Ciprian M. Crainiceanu, David Ruppert, and M.P. Wand

PDF

IDENTIFYING EFFECT MODIFIERS IN AIR POLLUTION TIME-SERIES STUDIES USING A TWO-STAGE ANALYSIS, Sandrah P. Eckel and Thomas A. Louis

PDF

ASSESSING THE UNRELIABILITY OF THE MEDICAL LITERATURE: A RESPONSE TO "WHY MOST PUBLISHED RESEARCH FINDINGS ARE FALSE", Steven Goodman and Sander Greenland

PDF

MULTIPLE MODEL EVALUATION ABSENT THE GOLD STANDARD VIA MODEL COMBINATION, Edwin J. Iversen, Jr.; Giovanni Parmigiani; and Sining Chen

PDF

TRENDS IN PARTICULATE MATTER AND MORTALITY: AN APPROACH TO THE ASSESSMENT OF UNMEASURED CONFOUNDING, Holly Janes, Francesca Dominici, and Scott Zeger

PDF

MULTIPLE DISEASES IN CARRIER PROBABILITY ESTIMATION: ACCOUNTING FOR SURVIVING ALL CANCERS OTHER THAN BREAST AND OVARY IN BRCAPRO, Hormuzd A. Katki, Amanda Blackford, Sining Chen, and Giovanni Parmigiani

PDF

FAST ADAPTIVE PENALIZED SPLINES, Tatyana Krivobokova, Ciprian M. Crainiceanu, and Goran Kauermann

PDF

EFFECTIVE COMMUNICATION OF STANDARD ERRORS AND CONFIDENCE INTERVALS, Thomas A. Louis and Scott L. Zeger

PDF

DECOMPOSITION OF REGRESSION ESTIMATORS TO EXPLORE THE INFLUENCE OF "UNMEASURED" TIME-VARYING CONFOUNDERS, Yun Lu and Scott L. Zeger

PDF

OPTIMAL PROPENSITY SCORE STRATIFICATION, Jessica A. Myers and Thomas A. Louis

PDF

TRAB: TESTING WHETHER MUTATION FREQUENCIES ARE ABOVE AN UNKNOWN BACKGROUND, Giovanni Parmigiani, Sining Chen, and Victor E. Velculescu

PDF

STATISTICAL METHODS FOR THE ANALYSIS OF CANCER GENOME SEQUENCING DATA, Giovanni Parmigiani, J. Lin, Simina Boca, T. Sjoblom, K.W. Kinzler, V.E. Velculescu, and B. Vogelstein

PDF

A REPRODUCIBLE RESEARCH TOOLKIT FOR R, Roger Peng

PDF

A BAYESIAN HIERARCHICAL MODEL FOR CONSTRAINED DISTRIBUTED LAG FUNCTIONS: ESTIMATING THE TIME COURSE OF HOSPITALIZATION ASSOCIATED WITH AIR POLLUTION EXPOSURE, Roger Peng, Francesca Dominici, and Leah J. Welty

PDF

DISTRIBUTED REPRODUCIBLE RESEARCH USING CACHED COMPUTATIONS, Roger Peng and Sandrah P. Eckel

PDF

SEMIPARAMETRIC BIVARIATE QUANTILE-QUANTILE REGRESSION FOR ANALYZING SEMI-COMPETING RISKS DATA, Daniel O. Scharfstein, James M. Robins, and Mark van der Laan

PDF

A HIDDEN MARKOV MODEL FOR JOINT ESTIMATION OF GENOTYPE AND COPY NUMBER IN HIGH-THROUGHPUT SNP CHIPS, Robert B. Scharpf, Giovanni Parmigiani, Jonathan Pevsner, and Ingo Ruczinski

PDF

A BAYESIAN MODEL FOR CROSS-STUDY DIFFERENTIAL GENE EXPRESSION, Robert B. Scharpf, Hakon Tjelmeland, Giovanni Parmigiani, and Andrew B. Nobel

PDF

INFERENCE FOR SURVIVAL CURVES WITH INFORMATIVELY COARSENED DISCRETE EVENT-TIME DATA: APPLICATION TO ALIVE, Michelle Shardell, Daniel O. Scharfstein, David Vlahov, and Noya Galai

PDF

MODIFIED TEST STATISTICS BY INTER-VOXEL VARIANCE SHRINKAGE WITH AN APPLICATION TO fMRI, Shu-chih Su, Brian Caffo, Elizabeth Garrett-Mayer, and Susan Bassett

PDF

MORTALITY IN THE MEDICARE POPULATION AND CHRONIC EXPOSURE TO FINE PARTICULATE AIR POLLUTION, Scott L. Zeger, Francesca Dominici, Aidan McDermott, and Jonathan M. Samet

PDF

OPTIMIZED CROSS-STUDY ANALYSIS OF MICROARRAY-BASED PREDICTORS, Xiaogang Zhong, Luigi Marchionni, Leslie Cope, Edwin S. Iversen, Elizabeth S. Garrett-Mayer, Edward Gabrielson, and Giovanni Parmigiani

PDF

A SMOOTHING APPROACH TO DATA MASKING, Yijie Zhou, Francesca Dominici, and Thomas A. Louis

PDF

RACIAL DISPARITIES IN MORTALITY RISKS IN A SAMPLE OF THE U.S. MEDICARE POPULATION, Yijie Zhou, Francesca Dominici, and Thomas A. Louis

Papers from 2006

PDF

USE OF HIDDEN MARKOV MODELS FOR QTL MAPPING, Karl W. Broman

PDF

A FLEXIBLE GENERAL CLASS OF MARGINAL AND CONDITIONAL RANDOM INTERCEPT MODELS FOR BINARY OUTCOMES USING MIXTURES OF NORMALS, Brian Caffo, Ming-Wen An, and Charles A. Rohde

PDF

EXPLORATION, NORMALIZATION, AND GENOTYPE CALLS OF HIGH DENSITY OLIGONUCLEOTIDE SNP ARRAY DATA, Benilton Carvalho, Terence P. Speed, and Rafael A. Irizarry

PDF

BIVARIATE BINOMIAL SPATIAL MODELLING OF LOA LOA PREVALENCE IN TROPICAL AFRICA, Ciprian M. Crainiceanu, Peter J. Diggle, and Barry Rowlingson

PDF

Adjustment Uncertainty in Effect Estimation, Ciprian M. Crainiceanu, Francesca Dominici, and Giovanni Parmigiani

PDF

COX MODELS WITH NONLINEAR EFFECT OF COVARIATES MEASURED WITH ERROR: A CASE STUDY OF CHRONIC KIDNEY DISEASE INCIDENCE, Ciprian M. Crainiceanu, David Ruppert, and Josef Coresh

PDF

PENALIZED LIKELIHOOD AND BAYESIAN METHODS FOR SPARSE CONTINGENCY TABLES: AN ANALYSIS OF ALTERNATIVE SPLICING IN FULL-LENGTH cDNA LIBRARIES, Corinne Dahinden, Giovanni Parmigiani, Mark C. Emerick, and Peter Buhlmann

PDF

INTERACTING WITH LOCAL AND REMOTE DATA REPOSITORIES USING THE stashR PACKAGE, Sandrah P. Eckel and Roger Peng

PDF

A Comparative Analysis of the Chronic Effects of Fine Particulate Matter, Sorina E. Eftim, Holly Janes, Aidan McDermott, Jonathan M. Samet, and Francesca Dominici

PDF

INVESTIGATING MEDIATION WHEN COUNTERFACTUALS ARE NOT METAPHYSICAL: DOES SUNLIGHT UVB EXPOSURE MEDIATE THE EFFECT OF EYEGLASSES ON CATARACTS?, Brian Egleston, Daniel O. Scharfstein, Beatriz Munoz, and Sheila West

PDF

MULTIVARIATE ANALYSIS AND VISUALIZATION OF SPLICING CORRELATIONS IN SINGLE-GENE TRANSCRIPTOMES, Mark C. Emerick, Giovanni Parmigiani, and William S. Agnew

PDF

PRINCIPAL STRATIFICATION DESIGNS TO ESTIMATE INPUT DATA MISSING DUE TO DEATH, Constantine E. Frangakis, Donald B. Rubin, Ming-Wen An, and Ellen MacKenzie

PDF

FEATURE-LEVEL EXPLORATION OF THE CHOE ET AL. AFFYMETRIX GENECHIP CONTROL DATASET, Rafael A. Irizarry, Leslie Cope, and Zhijin Wu

PDF

ON THE POTENTIAL FOR ILL-LOGIC WITH LOGICALLY DEFINED OUTCOMES, Xianbin Li, Brian S. Caffo, and Daniel O. Scharfstein

PDF

RECURRENT EVENT MODELS IN THE PRESENCE OF A TERMINAL EVENT: COMPARISON, INFERENCE AND DATA ANALYSIS, Xianghua Luo and Mei-Cheng Wang

PDF

ON THE EQUIVALENCE OF CASE-CROSSOVER AND TIME SERIES METHODS IN ENVIRONMENTAL EPIDEMIOLOGY, Yun Lu and Scott L. Zeger

PDF

POOR PERFORMANCE OF BOOTSTRAP CONFIDENCE INTERVALS FOR THE LOCATION OF A QUANTITATIVE TRAIT LOCUS, Ani Manichaikul, Josee Dupuis, Saunak Sen, and Karl W. Broman

PDF

FDR and Bayesian Multiple Comparisons Rules, Peter Muller, Giovanni Parmigiani, and Kenneth Rice

PDF

INTERACTING WITH DATA USING THE FILEHASH PACKAGE FOR R, Roger Peng

PDF

GAMMA SHAPE MIXTURES FOR HEAVY-TAILED DISTRIBUTIONS, Sergio Venturini, Francesca Dominici, and Giovanni Parmigiani

PDF

ESTIMATING GENOME-WIDE COPY NUMBER USING ALLELE SPECIFIC MIXTURE MODELS, Wenyi Wang, Benilton Carvalho, Nate Miller, Jonathan Pevsner, Aravinda Chakravarti, and Rafael A. Irizarry

Papers from 2005

PDF

NONPARAMETRIC ESTIMATION OF BIVARIATE FAILURE TIME ASSOCIATIONS IN THE PRESENCE OF A COMPETING RISK, Karen Bandeen-Roche and Jing Ning

PDF

A User-Friendly Introduction to Link-Probit-Normal Models, Brian S. Caffo and Michael Griswold

PDF

Additive Hazards Models with Latent Treatment Effectiveness Lag Time, Ying Qing Chen, Charles A. Rohde, and Mei-Cheng Wang

PDF

A Mechanistic Latent Variable Model for Estimating Drug Concentrations in the Male Genital Tract, Leena Choi, Brian Caffo, Charles A. Rohde, Themba T. Ndovi, and Craig W. Hendrix

PDF

Analysis of Affymetrix GeneChip Data Using Amplified RNA, Leslie Cope, Scott M. Hartman, Hinrich W.H. Gohlmann, Jay P. Tiesman, and Rafael A. Irizarry

PDF

ON THE USE OF NON-EUCLIDEAN ISOTROPY IN GEOSTATISTICS, Frank C. Curriero

PDF

Searching for Differentially Expressed Gene Combinations, Marcel Dettling, Edward Gabrielson, and Giovanni Parmigiani

PDF

A Partial Likelihood for Spatio-temporal Point Processes, Peter J. Diggle

PDF

Spatio-temporal Point Processes: Methods and Applications, Peter J. Diggle

PDF

Does the Effect of Micronutrient Supplementation on Neonatal Survival Vary with Respect to the Percentiles of the Birth Weight Distribution?, Francesca Dominici, Scott L. Zeger, Giovanni Parmigiani, Joanne Katz, and Parul Christian

PDF

THE ROLE OF AN EXPLICIT CAUSAL FRAMEWORK IN AFFECTED SIB PAIR DESIGNS WITH COVARIATES, Constantine E. Frangakis, Fan Li, and Betty Q. Doan

PDF

Understanding the Continual Reassessment Method for Dose Finding Studies: An Overview for Non-Statisticians, Elizabeth Garrett-Mayer

PDF

MODELING DIFFERENTIATED TREATMENT EFFECTS FOR MULTIPLE OUTCOMES DATA, Hongfei Guo and Karen Bandeen-Roche

PDF

Analyzing Panel Count Data with Informative Observation Times, Chiung-Yu Huang, Mei-Cheng Wang, and Ying Zhang

PDF

Comparison of Affymetrix GeneChip Expression Measures, Rafael A. Irizarry, Zhijin Wu, and Harris A. Jaffee

PDF

Fixed-Width Output Analysis for Markov Chain Monte Carlo, Galin L. Jones, Murali Haran, Brian S. Caffo, and Ronald Neath

PDF

Designs in Partially Controlled Studies: Messages from a Review, Fan Li and Constantine E. Frangakis

PDF

Polydesigns and Causal Inference, Fan Li and Constantine E. Frangakis

PDF

Model Choice in Time Series Studies of Air Pollution and Mortality, Roger D. Peng, Francesca Dominici, and Thomas A. Louis

PDF

When Should One Subtract Background Fluorescence in Two Color Microarrays?, Robert B. Scharpf, Christine A. Iacobuzio-Donahue, Julie B. Sneddon, and Giovanni Parmigiani

PDF

Estimation and Projection of Incidence and Prevalence Based on Doubly Truncated Data with Application to Pharmacoepidemiological Databases, Henrik Stovring and Mei-Cheng Wang

PDF

A Statistical Framework for the Analysis of Microarray Probe-Level Data, Zhijin Wu and Rafael A. Irizarry

Papers from 2004

PDF

Quantitative Methods for Tracking Cognitive Change 3 Years After Coronary Artery Bypass Surgery, Sarah Barry; Scott L. Zeger; Ola A. Selnes; Maura A. Grega; Louis M. Borowicz, Jr.; and Guy M. McKhann

PDF

Ozone and Mortality: A Meta-Analysis of Time-Series Studies and Comparison to a Multi-City Study (The National Morbidity, Mortality, and Air Pollution Study), Michelle L. Bell, Jonathan M. Samet, and Francesca Dominici

PDF

The Genomes of Recombinant Inbred Lines: The Gory Details, Karl W. Broman

PDF

A Hypothesis Test for the End of a Common Source Outbreak, Ron Brookmeyer and Xiaojun You

PDF

BayesMendel: An R Environment for Mendelian Risk Prediction, Sining Chen, Wenyi Wang, Karl Broman, Hormuzd A. Katki, and Giovanni Parmigiani

PDF

Accuracy of MSI Testing in Predicting Germline Mutations of MSH2 and MLH1: A Case Study in Bayesian Meta-Analysis of Diagnostic Tests Without a Gold Standard, Sining Chen, Patrice Watson, and Giovanni Parmigiani

PDF

Power and Robustness of Linkage Tests for Quantitative Traits in General Pedigrees, Weimin Chen, Karl Broman, and Kung-Yee Liang

PDF

Optimal Sampling Times in Bioequivalence Studies Using a Simulated Annealing Algorithm, Leena Choi, Brian Caffo, and Charles Rohde

PDF

MergeMaid: R Tools for Merging and Cross-Study Validation of Gene Expression Data, Leslie Cope, Xiaogang Zhong, Elizabeth S. Garrett-Mayer, and Giovanni Parmigiani

PDF

Spatially Adaptive Bayesian P-Splines with Heteroscedastic Errors, Ciprian M. Crainiceanu, David Ruppert, and Raymond J. Carroll

PDF

Bayesian Geostatistical Design, Peter J. Diggle and Soren Lophaven

PDF

Point Process Methodology for On-line Spatio-temporal Disease Surveillance, Peter J. Diggle, Barry Rowlingson, and Ting-li Su

PDF

Estimating Percentile-Specific Causal Effects: A Case Study of Micronutrient Supplementation, Birth Weight, and Infant Mortality, Francesca Dominici, Scott L. Zeger, Giovanni Parmigiani, Joanne Katz, and Parul Christian

PDF

The Proportional Odds Model for Assessing Rater Agreement with Multiple Modalities, Elizabeth Garrett-Mayer, Steven N. Goodman, and Ralph H. Hruban

PDF

Clustering and Classification Methods for Gene Expression Data Analysis, Elizabeth Garrett-Mayer and Giovanni Parmigiani

PDF

Cross-study Validation and Combined Analysis of Gene Expression Microarray Data, Elizabeth Garrett-Mayer, Giovanni Parmigiani, Xiaogang Zhong, Leslie Cope, and Edward Gabrielson

PDF

Semiparametric Regression in Capture-Recapture Modelling, O. Gimenez, C. Barbraud, Ciprian M. Crainiceanu, S. Jenouvrier, and B.T. Morgan