Biostatistics creates and applies methods for quantitative research in the health sciences. Our faculty conduct research across the spectrum of statistical science from foundations of inference to the discovery of new methodology to health applications. Our designs and analytic methods enable health scientists and professionals in academia, government, pharmaceutical companies, medical research organizations and elsewhere to efficiently acquire knowledge and draw valid conclusions from their ever-expanding sources of information.

A collection of working papers and related research documents from the department faculty is listed below.

Further information about the department may be found at www.biostat.jhsph.edu.

Papers from 2009

A MULTILEVEL MODEL TO ADDRESS BATCH EFFECTS IN COPY NUMBER ESTIMATION USING SNP ARRAYS, Robert B. Scharpf, Ingo Ruczinski, Benilton Carvalho, Betty Doan, Aravinda Chakravarti, and Rafael A. Irizarry

A MULTILEVEL MODEL TO ADDRESS BATCH EFFECTS IN COPY NUMBER USING SNP ARRAYS, Robert B. Scharpf, Ingo Ruczinski, Benilton Carvalho, Betty Doan, Aravinda Chakravarti, and Rafael A. Irizarry

Estimating effects by combining instrumental variables with case-control designs: the role of principal stratification, Russell T. Shinohara, Constantine E. Frangakis, Elizabeth Platz, and Konstantinos Tsilidis

LASAGNA PLOTS: A SAUCY ALTERNATIVE TO SPAGHETTI PLOTS, Bruce Swihart, Brian Caffo, Bryan D. James, Matthew Strand, Brian S. Schwartz, and Naresh M. Punjabi

Modeling multilevel sleep transitional data via Poisson log-linear multilevel models, Bruce J. Swihart, Brian Caffo, Ciprian Crainiceanu, and Naresh M. Punjabi

A BAYESIAN SHRINKAGE MODEL FOR INCOMPLETE LONGITUDINAL BINARY DATA WITH APPLICATION TO THE BREAST CANCER PREVENTION TRIAL, C. Wang, M.J. Daniels, Daniel O. Scharfstein, and S. Land

REDEFINING CpG ISLANDS USING A HIDDEN MARKOV MODEL, Hao Wu, Brian Caffo, Harris A. Jaffee, Andrew P. Feinberg, and Rafael A. Irizarry

Subset Quantile Normalization using Negative Control Features, Zhijin Wu

Analyzing Bivariate Survival Data with Interval Sampling and Application to Cancer Epidemiology, Hong Zhu and Mei-Cheng Wang

Papers from 2008

LIKELIHOOD ESTIMATION OF CONJUGACY RELATIONSHIPS IN LINEAR MODELS WITH APPLICATIONS TO HIGH-THROUGHPUT GENOMICS, Brian S. Caffo, Liu Dongmei, Robert Scharpf, and Giovanni Parmigiani

AN OVERVIEW OF OBSERVATIONAL SLEEP RESEARCH WITH APPLICATION TO SLEEP STAGE TRANSITIONING, Brian S. Caffo, B. Swihart, A. Laffan, C. Crainiceanu, and N. Punjabi

Bayesian Model Averaging for Clustered Data: Imputing Missing Daily Air Pollution Concentration, Howard H. Chang, Francesca Dominici, and Roger D. Peng

GENERALIZED MULTILEVEL FUNCTIONAL REGRESSION, Ciprian M. Crainiceanu, Ana-Maria Staicu, and Chongzhi Di

Multilevel Latent Class Models with Dirichlet Mixing Distribution, Chongzhi Di and Karen Bandeen-Roche

GEOSTATISTICAL INFERENCE UNDER PREFERENTIAL SAMPLING, Peter J. Diggle, Raquel Menezes, and Ting-li Su

MODEL SELECTION AND HEALTH EFFECT ESTIMATION IN ENVIRONMENTAL EPIDEMIOLOGY, Francesca Dominici, Chi Wang, Ciprian Crainiceanu, and Giovanni Parmigiani

A NOVEL AND SIMPLE RULE OF THUMB FOR MULTIPLICITY CONTROL IN EQUIVALENCE TESTING USING TWO ONE-SIDED TESTS, Carolyn Lauzon and Brian S. Caffo

JOINTLY MODELING CONTINUOUS AND BINARY OUTCOMES FOR BOOLEAN OUTCOMES: AN APPLICATION TO MODELING HYPERTENSION, Xianbin Li, Brian S. Caffo, and Elizabeth Stuart

BAYESIAN INFERENCE FOR SMOKING CESSATION WITH A LATENT CURE STATE, Sheng Luo, Ciprian M. Crainiceanu, Thomas A. Louis, and Nilanjan Chatterjee

LEARNING FROM NEAR MISSES IN MEDICATION ERRORS: A BAYESIAN APPROACH, Jessica A. Myers, Francesca Dominici, and Laura Morlock

DESIGN AND ANALYSIS ISSUES IN GENOME-WIDE SOMATIC MUTATION STUDIES OF CANCER, Giovanni Parmigiani, Simina Boca, Jimmy Lin, Kenneth W. Kinzler, Victor E. Velculescu, and Bert Vogelstein

A Method for Visualizing Multivariate Time Series Data, Roger D. Peng

Caching and Distributing Statistical Analyses in R, Roger D. Peng

Spatial Misalignment in time series studies of air pollution and health data, Roger D. Peng and Michelle L. Bell

ANALYSIS OF SUBGROUP EFFECTS IN RANDOMIZED TRIALS WHEN SUBGROUP MEMBERSHIP IS INFORMATIVELY MISSING: APPLICATION TO THE MADIT II STUDY, Daniel O. Scharfstein, Georgiana Onicescu, and Steven Goodman

ON THE MERITS OF VOXEL-BASED MORPHOMETRIC PATH-ANALYSIS FOR INVESTIGATING VOLUMETRIC MEDIATION OF A TOXICANT'S INFLUENCE ON COGNITIVE FUNCTION, Shu-chih Su, Brian S. Caffo, Lynn E. Eberly, Elizabeth Garrett-Mayer, Walter F. Stewart, Sining Chen, David Yousem, Christos Davatzikos, and Brian Schwartz

A BAYESIAN APPROACH TO EFFECT ESTIMATION ACCOUNTING FOR ADJUSTMENT UNCERTAINTY, Chi Wang, Giovanni Parmigiani, Ciprian Crainiceanu, and Francesca Dominici

Estimating the Causal Effect of Lower Tidal Volume Ventilation on Survival in Patients with Acute Lung Injury, Weiwei Wang, Daniel Scharfstein, Roy Brower, and Dale Needham

Causal Inference in Observational Studies with Outcome-Dependent Sampling, Weiwei Wang, Daniel Scharfstein, Zhiqiang Tan, and Ellen J. MacKenzie

STATISTICAL METHODS FOR AUTOMATED DRUG SUSCEPTIBILITY TESTING: BAYESIAN MINIMUM INHIBITORY CONCENTRATION PREDICTION FROM GROWTH CURVES, Xi Zhou, Merlise A. Clyde, James Garrett, Viridiana Lourdes, Michael O'Connell, Giovanni Parmigiani, David J. Turner, and Tim Wiles

Papers from 2007

A BAYESIAN HIERARCHICAL FRAMEWORK FOR SPATIAL MODELING OF fMRI DATA, F. DuBois Bowman, Brian S. Caffo, Susan Spear Bassett, and Clinton Kilts

FORECASTING THE GLOBAL BURDEN OF ALZHEIMER'S DISEASE, Ron Brookmeyer, Elizabeth Johnson, Kathryn Ziegler-Graham, and H. Michael Arrighi

IS MRI-BASED VOLUME A MEDIATOR OF THE ASSOCIATION OF CUMULATIVE LEAD DOSE WITH COGNITIVE FUNCTION?, Brian S. Caffo, Sining Chen, Walter Stewart, Karen Bolla, David Yousem, Christos Davatzikos, and Brian S. Schwartz

A CASE STUDY IN PHARMACOLOGIC IMAGING USING PRINCIPAL CURVES IN SINGLE PHOTON EMISSION COMPUTED TOMOGRAPHY, Brian S. Caffo, Ciprian M. Crainiceanu, Lijuan Deng, and Craig W. Hendrix

A SURVEY OF THE LIKELIHOOD APPROACH TO BIOEQUIVALENCE TRIALS, Leena Choi, Brian S. Caffo, and Charles Rohde

RANDOM EFFECTS MODELS IN A META-ANALYSIS OF THE ACCURACY OF DIAGNOSTIC TESTS WITHOUT A GOLD STANDARD IN THE PRESENCE OF MISSING DATA, Haitao Chu, Sining Chen, and Thomas A. Louis

The Integrative Correlation Coefficient: a Measure of Cross-study Reproducibility for Gene Expression Array Data, Leslie M. Cope, Liz Garrett-Mayer, Edward Gabrielson, and Giovanni Parmigiani

Bayesian Analysis for Penalized Spline Regression Using WinBUGS, Ciprian M. Crainiceanu, David Ruppert, and M.P. Wand

IDENTIFYING EFFECT MODIFIERS IN AIR POLLUTION TIME-SERIES STUDIES USING A TWO-STAGE ANALYSIS, Sandrah P. Eckel and Thomas A. Louis

ASSESSING THE UNRELIABILITY OF THE MEDICAL LITERATURE: A RESPONSE TO "WHY MOST PUBLISHED RESEARCH FINDINGS ARE FALSE", Steven Goodman and Sander Greenland

MULTIPLE MODEL EVALUATION ABSENT THE GOLD STANDARD VIA MODEL COMBINATION, Edwin J. Iversen, Jr.; Giovanni Parmigiani; and Sining Chen

TRENDS IN PARTICULATE MATTER AND MORTALITY: AN APPROACH TO THE ASSESSMENT OF UNMEASURED CONFOUNDING, Holly Janes, Francesca Dominici, and Scott Zeger

MULTIPLE DISEASES IN CARRIER PROBABILITY ESTIMATION: ACCOUNTING FOR SURVIVING ALL CANCERS OTHER THAN BREAST AND OVARY IN BRCAPRO, Hormuzd A. Katki, Amanda Blackford, Sining Chen, and Giovanni Parmigiani

FAST ADAPTIVE PENALIZED SPLINES, Tatyana Krivobokova, Ciprian M. Crainiceanu, and Goran Kauermann

EFFECTIVE COMMUNICATION OF STANDARD ERRORS AND CONFIDENCE INTERVALS, Thomas A. Louis and Scott L. Zeger

DECOMPOSITION OF REGRESSION ESTIMATORS TO EXPLORE THE INFLUENCE OF "UNMEASURED" TIME-VARYING CONFOUNDERS, Yun Lu and Scott L. Zeger

OPTIMAL PROPENSITY SCORE STRATIFICATION, Jessica A. Myers and Thomas A. Louis

TRAB: TESTING WHETHER MUTATION FREQUENCIES ARE ABOVE AN UNKNOWN BACKGROUND, Giovanni Parmigiani, Sining Chen, and Victor E. Velculescu

STATISTICAL METHODS FOR THE ANALYSIS OF CANCER GENOME SEQUENCING DATA, Giovanni Parmigiani, J. Lin, Simina Boca, T. Sjoblom, K.W. Kinzler, V.E. Velculescu, and B. Vogelstein

A REPRODUCIBLE RESEARCH TOOLKIT FOR R, Roger Peng

A BAYESIAN HIERARCHICAL MODEL FOR CONSTRAINED DISTRIBUTED LAG FUNCTIONS: ESTIMATING THE TIME COURSE OF HOSPITALIZATION ASSOCIATED WITH AIR POLLUTION EXPOSURE, Roger Peng, Francesca Dominici, and Leah J. Welty

DISTRIBUTED REPRODUCIBLE RESEARCH USING CACHED COMPUTATIONS, Roger Peng and Sandrah P. Eckel

SEMIPARAMETRIC BIVARIATE QUANTILE-QUANTILE REGRESSION FOR ANALYZING SEMI-COMPETING RISKS DATA, Daniel O. Scharfstein, James M. Robins, and Mark van der Laan

A HIDDEN MARKOV MODEL FOR JOINT ESTIMATION OF GENOTYPE AND COPY NUMBER IN HIGH-THROUGHPUT SNP CHIPS, Robert B. Scharpf, Giovanni Parmigiani, Jonathan Pevsner, and Ingo Ruczinski

A BAYESIAN MODEL FOR CROSS-STUDY DIFFERENTIAL GENE EXPRESSION, Robert B. Scharpf, Hakon Tjelmeland, Giovanni Parmigiani, and Andrew B. Nobel

INFERENCE FOR SURVIVAL CURVES WITH INFORMATIVELY COARSENED DISCRETE EVENT-TIME DATA: APPLICATION TO ALIVE, Michelle Shardell, Daniel O. Scharfstein, David Vlahov, and Noya Galai

MODIFIED TEST STATISTICS BY INTER-VOXEL VARIANCE SHRINKAGE WITH AN APPLICATION TO fMRI, Shu-chih Su, Brian Caffo, Elizabeth Garrett-Mayer, and Susan Bassett

MORTALITY IN THE MEDICARE POPULATION AND CHRONIC EXPOSURE TO FINE PARTICULATE AIR POLLUTION, Scott L. Zeger, Francesca Dominici, Aidan McDermott, and Jonathan M. Samet

OPTIMIZED CROSS-STUDY ANALYSIS OF MICROARRAY-BASED PREDICTORS, Xiaogang Zhong, Luigi Marchionni, Leslie Cope, Edwin S. Iversen, Elizabeth S. Garrett-Mayer, Edward Gabrielson, and Giovanni Parmigiani

A SMOOTHING APPROACH TO DATA MASKING, Yijie Zhou, Francesca Dominici, and Thomas A. Louis

RACIAL DISPARITIES IN MORTALITY RISKS IN A SAMPLE OF THE U.S. MEDICARE POPULATION, Yijie Zhou, Francesca Dominici, and Thomas A. Louis

Papers from 2006

USE OF HIDDEN MARKOV MODELS FOR QTL MAPPING, Karl W. Broman

A FLEXIBLE GENERAL CLASS OF MARGINAL AND CONDITIONAL RANDOM INTERCEPT MODELS FOR BINARY OUTCOMES USING MIXTURES OF NORMALS, Brian Caffo, Ming-Wen An, and Charles A. Rohde

EXPLORATION, NORMALIZATION, AND GENOTYPE CALLS OF HIGH DENSITY OLIGONUCLEOTIDE SNP ARRAY DATA, Benilton Carvalho, Terence P. Speed, and Rafael A. Irizarry

BIVARIATE BINOMIAL SPATIAL MODELLING OF LOA LOA PREVALENCE IN TROPICAL AFRICA, Ciprian M. Crainiceanu, Peter J. Diggle, and Barry Rowlingson

Adjustment Uncertainty in Effect Estimation, Ciprian M. Crainiceanu, Francesca Dominici, and Giovanni Parmigiani

COX MODELS WITH NONLINEAR EFFECT OF COVARIATES MEASURED WITH ERROR: A CASE STUDY OF CHRONIC KIDNEY DISEASE INCIDENCE, Ciprian M. Crainiceanu, David Ruppert, and Josef Coresh

PENALIZED LIKELIHOOD AND BAYESIAN METHODS FOR SPARSE CONTINGENCY TABLES: AN ANALYSIS OF ALTERNATIVE SPLICING IN FULL-LENGTH cDNA LIBRARIES, Corinne Dahinden, Giovanni Parmigiani, Mark C. Emerick, and Peter Buhlmann

INTERACTING WITH LOCAL AND REMOTE DATA REPOSITORIES USING THE stashR PACKAGE, Sandrah P. Eckel and Roger Peng

A Comparative Analysis of the Chronic Effects of Fine Particulate Matter, Sorina E. Eftim, Holly Janes, Aidan McDermott, Jonathan M. Samet, and Francesca Dominici

INVESTIGATING MEDIATION WHEN COUNTERFACTUALS ARE NOT METAPHYSICAL: DOES SUNLIGHT UVB EXPOSURE MEDIATE THE EFFECT OF EYEGLASSES ON CATARACTS?, Brian Egleston, Daniel O. Scharfstein, Beatriz Munoz, and Sheila West

MULTIVARIATE ANALYSIS AND VISUALIZATION OF SPLICING CORRELATIONS IN SINGLE-GENE TRANSCRIPTOMES, Mark C. Emerick, Giovanni Parmigiani, and William S. Agnew

PRINCIPAL STRATIFICATION DESIGNS TO ESTIMATE INPUT DATA MISSING DUE TO DEATH, Constantine E. Frangakis, Donald B. Rubin, Ming-Wen An, and Ellen MacKenzie

FEATURE-LEVEL EXPLORATION OF THE CHOE ET AL. AFFYMETRIX GENECHIP CONTROL DATASET, Rafael A. Irizarry, Leslie Cope, and Zhijin Wu

ON THE POTENTIAL FOR ILL-LOGIC WITH LOGICALLY DEFINED OUTCOMES, Xianbin Li, Brian S. Caffo, and Daniel O. Scharfstein

RECURRENT EVENT MODELS IN THE PRESENCE OF A TERMINAL EVENT: COMPARISON, INFERENCE AND DATA ANALYSIS, Xianghua Luo and Mei-Cheng Wang

ON THE EQUIVALENCE OF CASE-CROSSOVER AND TIME SERIES METHODS IN ENVIRONMENTAL EPIDEMIOLOGY, Yun Lu and Scott L. Zeger

POOR PERFORMANCE OF BOOTSTRAP CONFIDENCE INTERVALS FOR THE LOCATION OF A QUANTITATIVE TRAIT LOCUS, Ani Manichaikul, Josee Dupuis, Saunak Sen, and Karl W. Broman

FDR and Bayesian Multiple Comparisons Rules, Peter Muller, Giovanni Parmigiani, and Kenneth Rice

INTERACTING WITH DATA USING THE FILEHASH PACKAGE FOR R, Roger Peng

GAMMA SHAPE MIXTURES FOR HEAVY-TAILED DISTRIBUTIONS, Sergio Venturini, Francesca Dominici, and Giovanni Parmigiani

ESTIMATING GENOME-WIDE COPY NUMBER USING ALLELE SPECIFIC MIXTURE MODELS, Wenyi Wang, Benilton Carvalho, Nate Miller, Jonathan Pevsner, Aravinda Chakravarti, and Rafael A. Irizarry

Papers from 2005

NONPARAMETRIC ESTIMATION OF BIVARIATE FAILURE TIME ASSOCIATIONS IN THE PRESENCE OF A COMPETING RISK, Karen Bandeen-Roche and Jing Ning

A User-Friendly Introduction to Link-Probit-Normal Models, Brian S. Caffo and Michael Griswold

Additive Hazards Models with Latent Treatment Effectiveness Lag Time, Ying Qing Chen, Charles A. Rohde, and Mei-Cheng Wang

A Mechanistic Latent Variable Model for Estimating Drug Concentrations in the Male Genital Tract, Leena Choi, Brian Caffo, Charles A. Rohde, Themba T. Ndovi, and Craig W. Hendrix

Analysis of Affymetrix GeneChip Data Using Amplified RNA, Leslie Cope, Scott M. Hartman, Hinrich W.H. Gohlmann, Jay P. Tiesman, and Rafael A. Irizarry

ON THE USE OF NON-EUCLIDEAN ISOTROPY IN GEOSTATISTICS, Frank C. Curriero

Searching for Differentially Expressed Gene Combinations, Marcel Dettling, Edward Gabrielson, and Giovanni Parmigiani

A Partial Likelihood for Spatio-temporal Point Processes, Peter J. Diggle

Spatio-temporal Point Processes: Methods and Applications, Peter J. Diggle

Does the Effect of Micronutrient Supplementation on Neonatal Survival Vary with Respect to the Percentiles of the Birth Weight Distribution?, Francesca Dominici, Scott L. Zeger, Giovanni Parmigiani, Joanne Katz, and Parul Christian

THE ROLE OF AN EXPLICIT CAUSAL FRAMEWORK IN AFFECTED SIB PAIR DESIGNS WITH COVARIATES, Constantine E. Frangakis, Fan Li, and Betty Q. Doan

Understanding the Continual Reassessment Method for Dose Finding Studies: An Overview for Non-Statisticians, Elizabeth Garrett-Mayer

MODELING DIFFERENTIATED TREATMENT EFFECTS FOR MULTIPLE OUTCOMES DATA, Hongfei Guo and Karen Bandeen-Roche

Analyzing Panel Count Data with Informative Observation Times, Chiung-Yu Huang, Mei-Cheng Wang, and Ying Zhang

Comparison of Affymetrix GeneChip Expression Measures, Rafael A. Irizarry, Zhijin Wu, and Harris A. Jaffee

Fixed-Width Output Analysis for Markov Chain Monte Carlo, Galin L. Jones, Murali Haran, Brian S. Caffo, and Ronald Neath

Designs in Partially Controlled Studies: Messages from a Review, Fan Li and Constantine E. Frangakis

Polydesigns and Causal Inference, Fan Li and Constantine E. Frangakis