Welcome to the University of Pennsylvania Biostatistics Working Papers site. The papers available on this site are a selection of recent research conducted by the Penn Biostatistics faculty. Formed in 1997, the Division of Biostatistics in the Department of Biostatistics and Epidemiology within the Penn School of Medicine has grown to its current size of 25 faculty, with further expansion expected. In addition to conducting methodological research, the Penn Biostatistics faculty plays a crucial role in the collaborative research process, from design of the research plan through analysis and publication of study results. Both our methodological and collaborative research efforts are funded primarily through grants from the National Institutes of Health (NIH) and private industry. The Division also holds several training grants from NIH to prepare the next generation of biostatisticians. Further information about the department may be found at dbe.med.upenn.edu/biostat-research/.


Submissions from 2016


Maximum Likelihood Based Analysis of Equally Spaced Longitudinal Count Data with Specified Marginal Means, First-order Antedependence, and Linear Conditional Expectations, Victoria Gamerman, Matthew Guerra, and Justine Shults


Distance-Based Analysis of Variance for Brain Connectivity, Russell T. Shinohara, Haochang Shou, Marco Carone, Robert Schultz, Birkan Tunc, Drew Parker, and Ragini Verma


Simulating Longer Vectors of Correlated Binary Random Variables via Multinomial Sampling, Justine Shults


Interpretable High-Dimensional Inference Via Score Maximization with an Application in Neuroimaging, Simon N. Vandekar, Philip T. Reiss, and Russell T. Shinohara

Submissions from 2015


Removing inter-subject technical variability in magnetic resonance imaging studies, Jean-Philippe Fortin, Elizabeth M. Sweeney, John Muschelli, Ciprian M. Crainiceanu, Russell T. Shinohara, and Alzheimer’s Disease Neuroimaging Initiative


Nonparametric methods for doubly robust estimation of continuous treatment effects, Edward H. Kennedy, Zongming Ma, Matthew D. McHugh, and Dylan S. Small


Addressing Confounding in Predictive Models with an Application to Neuroimaging, Kristin A. Linn, Bilwaj Gaonkar, Jimit Doshi, Christos Davatzikos, and Russell T. Shinohara


Control-Group Feature Normalization for Multivariate Pattern Analysis Using the Support Vector Machine, Kristin A. Linn, Bilwaj Gaonkar, Jimit Doshi, Christos Davatzikos, and Russell T. Shinohara


Statistical Estimation of T1 Relaxation Time Using Conventional Magnetic Resonance Imaging, Amanda Mejia, Elizabeth M. Sweeney, Blake Dewey, Govind Nair, Pascal Sati, Colin Shea, Daniel S. Reich, and Russell T. Shinohara


Statistical estimation of white matter microstructure from conventional MRI, Leah Suttner, Amanda Mejia, Blake Dewey, Pascal Sati, Daniel S. Reich, and Russell T. Shinohara

Submissions from 2014


Regression modeling of longitudinal binary outcomes with outcome-dependent observation times, Kay See Tan, Andrea B. Troxel, Stephen E. Kimmel, Kevin G. Volpp, and Benjamin French

Submissions from 2013


On the Simulation of Longitudinal Discrete Data with Specified Marginal Means and First-Order Antedependence, Matthew Guerra and Justine Shults


Normalization Techniques for Statistical Inference from Magnetic Resonance Imaging, Russell T. Shinohara, Elizabeth M. Sweeney, Jeff Goldsmith, Navid Shiee, Farrah J. Mateen, Peter A. Calabresi, Samson Jarso, Dzung L. Pham, Daniel S. Reich, and Ciprian M. Crainiceanu

Submissions from 2010


Bayesian Methods for Network-Structured Genomics Data, Stefano Monni and Hongzhe Li

Submissions from 2009


A Hidden Markov Random Field Model for Genome-wide Association Studies, Hongzhe Li, Zhi Wei, and J M. Maris


"Implementation of quasi-least squares With the R package qlspack", Jichun Xie and Justine Shults


Quasi-Least Squares with Mixed Linear Correlation Structures, Jichun Xie, Justine Shults, Jon Peet, Dwight Stambolian, and Mary F. Cotch

Submissions from 2008


On the designation of the patterned associations for longitudinal Bernoulli data: weight matrix versus true correlation structure?, Hanjoo Kim, Joseph M. Hilbe, and Justine Shults


"%QLS SAS Macro: A SAS macro for Analysis of Longitudinal Data Using Quasi-Least Squares"., Hanjoo Kim and Justine Shults


Analysis of Adverse Events in Drug Safety: A Multivariate Approach Using Stratified Quasi-least Squares, Hanjoo Kim, Justine Shults, Scott Patterson, and Robert Goldberg-Alberts


A Network-constrained Empirical Bayes Method for Analysis of Genomic Data, Caiyan Li, Zhi Wei, and Hongzhe Li


U-Statistics-based Tests for Multiple Genes in Genetic Association Studies, Zhi Wei, Mingyao Li, Timothy Rebbeck, and Hongzhe Li


Incorporation of Genetic Pathway Information into Analysis of Multivariate Gene Expression Data, Zhi Wei, Jane E. Minturn, Eric Rappaport, Garrett Brodeur, and Hongzhe Li

Submissions from 2007


Methodological Issues in the Study of the Effects of Hemoglobin Variability, Marshall Joffe, Wei Yang, Steve Brunelli, and Harold Feldman


Network-constrained Regularization and Variable Selection for Analysis of Genomic Data, Caiyan Li and Hongzhe Li


Statistical Methods for Inference of Genetic Networks and Regulatory Modules, Hongzhe Li


Vertex Clustering in Random Graphs via Reversible Jump Markov Chain Monte Carlo, Stefano Monni and Hongzhe Li


Analysis of multi-level correlated data in the framework of generalized estimating equations via xtmultcorr procedures in Stata and qls functions in Matlab, Justine Shults and Sarah J. Ratcliffe


Group SCAD Regression Analysis for Microarray Time Course Gene Expression Data, Lifeng Wang, Guang Chen, and Hongzhe Li


Variable Selection for Nonparametric Varying-Coefficient Models for Analysis of Repeated Measurements, Lifeng Wang and Hongzhe Li


A Hidden Spatial-temporal Markov Random Field Model for Network-based Analysis of Time Course Gene Expression Data, Zhi Wei and Hongzhe Li


A Markov Random Field Model for Network-based Analysis of Genomic Data, Zhi Wei and Hongzhe Li

Submissions from 2006


Conditional Likelihood Methods for Haplotype-based Association Analysis Using Matched Case-Control Data, Jinbo Chen and Carmen Rodriguez


Censored Data Regression in High-Dimension and Low-Sample Size Settings For Genomic Applications, Hongzhe Li


Survival Analysis Methods in Genetic Epidemiology, Hongzhe Li


Longitudinal Nested Compliance Class Model in the Presence of Time-Varying Noncompliance, Julia Y. Lin, Thomas R. TenHave, and Michael R. Elliott


Nested Markov Compliance Class Model in the Presence of Time-Varying Noncompliance, Julia Y. Lin, Thomas R. TenHave, and Michael R. Elliott


Group Additive Regression Models for Genomic Data Analysis, Yihui Luan and Hongzhe Li


Sensitivity of the Hazard Ratio to Non-Ignorable Treatment Assignment in an Observational Study, Nandita Mitra and Daniel F. Heitjan


Improved generalized estimating equation analysis via xtqls for implementation of quasi-least squares in Stata, Justine Shults, Sarah J. Ratcliffe, and Mary Leonard


On The Violation Of Bounds For The Correlation In Generalized Estimating Equation Analyses Of Binary Data From Longitudinal Trials, Justine Shults, Wenguang Sun, Xin Tu, and Jay Amsterdam


Use of Unbiased Estimating Equations to Estimate Correlation in Generalized Estimating Equation Analysis of Longitudinal Trials, Wenguang Sun, Justine Shults, and Mary Leonard


Nonparametric Pathway-Based Regression Models for Analysis of Genomic Data, Zhi Wei and Hongzhe Li

Submissions from 2005


Gradient Directed Regularization for Sparse Gaussian Concentration Graphs, with Applications to Inference of Genetic Networks, Hongzhe Li and Jiang Gui


Causal Mediation Analyses with Structural Mean Models, Thomas R. TenHave, Marshall Joffe, Kevin Lynch, Greg Brown, and Stephen Maisto