The COBRA Preprint Series contains manuscripts by top researchers, many of which will eventually appear in leading journals. For a look ahead to tomorrow's cutting-edge research, browse the papers below.

#### Submissions from 2016

hpcNMF: A high-performance toolbox for non-negative matrix factorization, Karthik Devarajan and Guoli Wang

#### Submissions from 2015

Distance Correlation Measures Applied to Analyze Relation between Variables in Liver Cirrhosis Marker Data, Atanu Bhattacharjee

A Simple Method to Estimate the Time-dependent ROC Curve Under Right Censoring, Liang Li, Bo Hu, and Tom Greene

#### Submissions from 2014

Bayesian Model Averaging:- An Application in Cancer Clinical Trial, Atanu Bhattacharjee

Confidence intervals for the treatment effect on the treated, José A. Ferreira

Pre-maceration, Saignée and Temperature affect Daily Evolution of Pigment Extraction During Vinification, Ottorino L. Pantani, Federico M. Stefanini, Irene Lozzi, Luca Calamai, Alessandra Biondi Bartolini, and Stefano Di Blasi

Computational model for survey and trend analysis of patients with endometriosis: a decision aid tool for EBM, Salvo Reina, Vito Reina, Franco Ameglio, Mauro Costa, and Alessandro Fasciani

Methods for Exploring Treatment Effect Heterogeneity in Subgroup Analysis: An Application to Global Clinical Trials, I. Manjula Schou and Ian C. Marschner

Variable selection for zero-inflated and overdispersed data with application to health care demand in Germany, Zhu Wang, Shuangge Ma, and Ching-Yun Wang

#### Submissions from 2013

Augmentation of Propensity Scores for Medical Records-based Research, Mikel Aickin

Regression Trees for Longitudinal Data, Madan Gopal Kundu and Jaroslaw Harezlak

A Bayesian regression tree approach to identify the effect of nanoparticles properties on toxicity profiles, Cecile Low-Kam, Haiyuan Zhang, Zhaoxia Ji, Tian Xia, Jeffrey I. Zinc, Andre Nel, and Donatello Telesca

#### Submissions from 2012

Estimating HIV prevalence in the presence of spatial variation, Matthew D. Austin, Christopher D. Barr, and Victor DeGruttola

A prior-free framework of coherent inference and its derivation of simple shrinkage estimators, David R. Bickel

Relating Nanoparticle Properties to Biological Outcomes in Exposure Escalation Experiments, Trina Patel, Cecile Low-Kam, Zhaoxia Ji, Haiyuan Zhang, Tian Xia, Andre E. Nel, Jeffrey I. Zinc, and Donatello Telesca

Hierarchical Rank Aggregation with Applications to Nanotoxicology, Trina Patel, Donatello Telesca, Robert Rallo, Saji George, Tian Xia, and Andre Nel

Quantifying alternative splicing from paired-end RNA-sequencing data, David Rossell, Camille Stephan-Otto Attolini, Manuel Kroiss, and Almond Stöcker

Robust Estimation of Pure/Natural Direct Effects with Mediator Measurement Error, Eric J. Tchetgen Tchetgen and Sheng Hsuan Lin

On Identification of Natural Direct Effects when a Confounder of the Mediator is Directly Affected by Exposure, Eric J. Tchetgen Tchetgen and Tyler J. VanderWeele

Robustness of Measures of Interaction to Unmeasured Confounding, Eric J. Tchetgen Tchetgen and Tyler J. VanderWeele

Differential Patterns of Interaction and Gaussian Graphical Models, Masanao Yajima, Donatello Telesca, Yuan Ji, and Peter Muller

PLS-ROG: Partial least squares with rank order of groups, Hiroyuki Yamamoto

Statistical hypothesis test of factor loading in principal component analysis and its application to metabolite set enrichment analysis, Hiroyuki Yamamoto, Tamaki Fujimori, Hajime Sato, Gen Ishikawa, Kenjiro Kami, and Yoshiaki Ohashi

Why odds ratio estimates of GWAS are almost always close to 1.0, Yutaka Yasui

#### Submissions from 2011

A unified approach to non-negative matrix factorization and probabilistic latent semantic indexing, Karthik Devarajan, Guoli Wang, and Nader Ebrahimi

Evaluation of Flexible Regression for Non-unimodal Hazard Functions, Marco Fornili, Patrizia Boracchi, Federico Ambrogi, and Elia Biganzoli

Propensity Score Analysis with Matching Weights, Liang Li

On the Nondifferential Misclassification of a Binary Confounder, Elizabeth L. Ogburn and Tyler J. VanderWeele

Toxicity Profiling of Engineered Nanomaterials via Multivariate Dose Response Surface Modeling, Trina Patel, Donatello Telesca, Saji George, and Andre Nel

A proof of Bell's inequality in quantum mechanics using causal interactions, James M. Robins, Tyler J. VanderWeele, and Richard D. Gill

Modeling Criminal Careers as Departures from a Unimodal Population Age-Crime Curve: The Case of Marijuana Use, Donatello Telesca, Elena Erosheva, Derek Kreager, and Ross Matsueda

Modeling Protein Expression and Protein Signaling Pathways, Donatello Telesca, Peter Muller, Steven Kornblau, Marc Suchard, and Yuan Ji

Causal inference under multiple versions of treatment, Tyler J. VanderWeele and Miguel A. Hernan

On the definition of a confounder, Tyler J. VanderWeele and Ilya Shpitser

Components of the indirect effect in vaccine trials: identification of contagion and infectiousness effects, Tyler J. VanderWeele, Eric J. Tchetgen Tchetgen, and M. Elizabeth Halloran

A Bayesian Model Averaging Approach for Observational Gene Expression Studies, Xi Kathy Zhou, Fei Liu, and Andrew J. Dannenberg

#### Submissions from 2010

A Bayesian shared component model for genetic association studies, Juan J. Abellan, Carlos Abellan, and Juan R. Gonzalez

Recovery of the Baseline Incidence Density in Censored Time-to-Event Analysis, Mikel Aickin

The Linkset Model for 2^n Contingency Tables, Mikel Aickin

The Strength of Statistical Evidence for Composite Hypotheses: Inference to the Best Explanation, David R. Bickel

Efficient Design and Inference for Multi-stage Randomized Trials of Individualized Treatment Policies, Ree Dawson and Philip W. Lavori

The use of multiple imputation in molecular epidemiologic studies assessing interaction effects, Manisha Desai, Denise Esserman, Marilie Gammon, and Mary Beth Terry

The handling of missing data in molecular epidemiologic studies, Manisha Desai, Jessica Kubo, Denise Esserman, and Mary Beth Terry

Improving statistical analysis of prospective clinical trials in stem cell transplantation. An inventory of new approaches in survival analysis, Aurelien Latouche

Minimum Description Length and Empirical Bayes Methods of Identifying SNPs Associated with Disease, Ye Yang and David R. Bickel

Minimum Description Length Measures of Evidence for Enrichment, Zhenyu Yang and David R. Bickel

#### Submissions from 2009

Simple, Defensible Sample Sizes Based on Cost Efficiency -- With Discussion and Rejoinder, Peter Bacchetti, Charles E. McCulloch, Mark R. Segal, Richard Simon, Peter Muller, Gary L. Rosner, James A. Hanley, and Stan Shapiro

Reliability of the Model for Clustering of Longitudinal datasets of Infant Mortality Rate in India, Ajay Kumar Bansal and S D. Sharma

Validation of Differential Gene Expression Algorithms: Application Comparing Fold Change Estimation to Hypothesis Testing, David R. Bickel and Corey M. Yanofsky

Two-stage Decompositions for the Analysis of Functional Connectivity for fMRI With Application to Alzheimer's Disease Risk, Brian S. Caffo, Ciprian M. Crainiceanu, Guillermo Verduzco, Stewart H. Mostofsky, Susan Spear-Bassett, and James J. Pekar

Nonparametric Incidence Estimation From Prevalent Cohort Survival Data, Marco Carone, Masoud Asgharian, and Mei-Cheng Wang

Composite Likelihood EM Algorithm with Applications to Multivariate Hidden Markov Model, Xin Gao and Peter Xuekun Song

Fitting ACE Structural Equation Models to Case-Control Family Data, Kristin N. Javaras, James I. Hudson, and Nan M. Laird

Targeted Genomic signature profiling with Quasi-alignment statistics, Rao Mallik Kotamarti, Douglas W. Raiford, Michael Hahsler, Yuhang Wang, Monnie McGee, and Maggie Dunham

Shrinkage Estimation of Expression Fold Change As an Alternative to Testing Hypotheses of Equivalent Expression, Zahra Montazeri, Corey M. Yanofsky, and David R. Bickel

A Novel Topology for Representing Protein Folds, Mark R. Segal

Modeling Multilevel Sleep Transitional Data Via Poisson Log-Linear Multilevel Models, Bruce J. Swihart

Mean Survival Time from Right Censored Data, Ming Zhong and Kenneth R. Hess

Correlated Binary Regression Using Orthogonalized Residuals, Richard C. Zink and Bahjat F. Qaqish

#### Submissions from 2008

The Strength of Statistical Evidence for Composite Hypotheses with an Application to Multiple Comparisons, David R. Bickel

Reversal in declining trend of adult mortality in many states of India, 1970-2001: Is it due to AIDS?, Abhaya Indrayan and Ajay Kumar Bansal

A Simple Index of Smoking, Abhaya Indrayan, Rajeev Kumar, and Shridhar Dwivedi

Change-point Problem and Regression: An Annotated Bibliography, Ahmad Khodadadi and Masoud Asgharian

The Calculation of the 97.5% Upper Confidence Bound: Application to Clustered Binary Data in a Binomial Non-Inferiority Two-Sample Trial., William F. McCarthy

The Design and Sample Size Requirement for a Cluster Randomized Non-Inferiority Trial with Two Binary Co-Primary Outcomes., William F. McCarthy

Joint Spatial Modeling of Recurrent Infection and Growth with Processes Under Intermittent Observation, Farouk S. Nathoo

Space-Time Regression Modeling of Tree Growth Using the Skew-t Distribution, Farouk S. Nathoo

Detection of Recurrent Copy Number Alterations in the Genome: a Probabilistic Approach, Oscar M. Rueda and Ramon Diaz-Uriarte

Finding Recurrent Regions of Copy Number Variation: A Review, Oscar M. Rueda and Ramon Diaz-Uriarte

A New Method for Constructing Exact Tests without Making any Assumptions, Karl H. Schlag

Bringing Game Theory to Hypothesis Testing: Establishing Finite Sample Bounds on Inference, Karl H. Schlag

Properties of Monotonic Effects on Directed Acyclic Graphs, Tyler J. VanderWeele and James M. Robins

#### Submissions from 2007

Improving GSEA for Analysis of Biologic Pathways for Differential Gene Expression across a Binary Phenotype, Irina Dinu, John D. Potter, Thomas Mueller, Qi Liu, Adeniyi J. Adewale, Gian S. Jhangri, Gunilla Einecke, Konrad S. Famulski, Philip Halloran, and Yutaka Yasui

Bootstrap Confidence Regions for Optimal Operating Conditions in Response Surface Methodology, Roger D. Gibb, I-Li Lu, and Walter H. Carter Jr

Estimating the Prevalence of Disease Using Relatives of Case and Control Probands, Kristin N. Javaras, Nan M. Laird, James I. Hudson, and Brian D. Ripley

Adjustment to the McNemar’s Test for the Analysis of Clustered Matched-Pair Data, William F. McCarthy

An Example of How to Write the Statistical Section of a Bioequivalence Study Protocol for FDA Review, William F. McCarthy

Assessment of Sample Size and Power for the Analysis of Clustered Matched-Pair Data, William F. McCarthy

Lachenbruch’s Method for Determining the Sample Size Required for Testing Interactions: How It Compares to nQuery Advisor and O’Brien’s SAS UnifyPow., William F. McCarthy

The Existence of Maximum Likelihood Estimates for the Binary Response Logistic Regression Model, William F. McCarthy

The Assessment of the Degree of Concordance Between the Observed Values and the Predicted Values of a Mixed-Effect Model Using “Method of Comparison” Techniques, William F. McCarthy and Nan Guo

The Analysis of Pixel Intensity (Myocardial Signal Density) Data: The Quantification of Myocardial Perfusion by Imaging Methods., William F. McCarthy and Douglas R. Thompson

Coronary Evaluation Using Multi-detector Spiral Computed Tomography Angiography: Statistical Design and Analysis, William F. McCarthy, Douglas R. Thompson, and Bruce A. Barton

Estimation of Dose-Response Functions for Longitudinal Data, Erica E M Moodie and David A. Stephens

Review of the Maximum Likelihood Functions for Right Censored Data. A New Elementary Derivation., Stefano Patti, Elia Biganzoli, and Patrizia Boracchi

False Discovery Rate Analysis of Brain Diffusion Direction Maps, Armin Schwartzman, Robert F. Dougherty, and Jonathan E. Taylor

A Bayesian hierarchical model for spot fluorescence in microarrays, Federico Mattia Stefanini

A Flexible Semi-Parametric Approach to Estimating a Dose-Response Relationship: the Treatment of Childhood Amblyopia., David A. Stephens and Erica E M Moodie

#### Submissions from 2006

Crude Cumulative Incidence in the form of a Horvitz-Thompson like and Kaplan-Meier like Estimator, Laura Antolini, Elia Mario Biganzoli, and Patrizia Boracchi

Exploration of distributional models for a novel intensity-dependent normalization, Nicola Lama, Patrizia Boracchi, and Elia Mario Biganzoli

New Spiked-In Probe Sets for the Affymetrix HGU-133A Latin Square Experiment, Monnie McGee and Zhongxue Chen

Survival Analysis of Longitudinal Microarrays, Natasa Rajicic, Dianne M. Finkelstein, and David A. Schoenfeld

Simple Records Matching Method for diagnostic and clinical datasets of patient’s records, Salvo Reina, Vito M. Reina, and Eugenio A. Debbia

Causal Comparisons in Randomized Trials of Two Active Treatments: The Effect of Supervised Exercise to Promote Smoking Cessation, Jason Roy and Joseph W. Hogan

A Flexible Statistical Method for Detecting Genomic Copy-Number Changes Using Hidden Markov Models with Reversible Jump MCMC, Oscar M. Rueda and Ramon Diaz-Uriarte

Biologic Interaction and Their Identification, Tyler J. VanderWeele and James Robins

Properties of Monotonic Effects, Tyler J. VanderWeele and James M. Robins

A Theory of Sufficient Cause Interactions, Tyler J. VanderWeele and James M. Robins

A unifying approach for haplotype analysis of quantitative traits in family-based association studies: Testing and estimating gene-environment interactions with complex exposure variables, Stijn Vansteelandt and Christoph Lange