"DISTRIBUTED REPRODUCIBLE RESEARCH USING CACHED COMPUTATIONS" by Roger Peng and Sandrah P. Eckel

Johns Hopkins University, Dept. of Biostatistics Working Papers

Title

DISTRIBUTED REPRODUCIBLE RESEARCH USING CACHED COMPUTATIONS

Authors

Roger Peng, Johns Hopkins Bloomberg School of Public Health, Department of Biostatistics
Sandrah P. Eckel, Johns Hopkins Bloomberg School of Public Health, Department of Biostatistics

Abstract

The ability to make scientific findings reproducible is increasingly important in areas where substantive results are the product of complex statistical computations. Reproducibility can allow others to verify the published findings and conduct alternate analyses of the same data. A question that arises naturally is how can one conduct and distribute reproducible research? This question is relevant from the point of view of both the authors who want to make their research reproducible and readers who want to reproduce relevant findings reported in the scientific literature. We present a framework in which reproducible research can be conducted and distributed via cached computations and describe specific tools for both authors and readers. As a prototype implementation we introduce three software packages written in the R language. The cacheSweave and stashR packages together provide tools for caching computational results in a key-value style database which can be published to a public repository for readers to download. The SRPM package provides tools for generating and interacting with "shared reproducibility packages" (SRPs) which can facilitate the distribution of the data and code. As a case study we demonstrate the use of the toolkit on a national study of air pollution exposure and mortality.
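As a rough illustration of the caching idea described in the abstract, the following minimal Python sketch stores the result of an expression in a key-value store keyed by a digest of the expression's text, so that a later run reads the stored value instead of recomputing it. This is not the cacheSweave or stashR implementation (those are R packages and their interfaces are not shown here). The class `ComputationCache`, its methods, and the file-per-key layout are all assumptions made for illustration.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


class ComputationCache:
    """Hypothetical key-value cache: key = digest of the code text,
    value = pickled result, one file per key."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def key_for(self, code):
        # Identical code text always maps to the same key.
        return hashlib.sha1(code.encode("utf-8")).hexdigest()

    def evaluate(self, code, env):
        """Evaluate `code` with the names in `env`, reusing a stored result
        when one exists. Returns (value, was_cached)."""
        path = self.directory / self.key_for(code)
        if path.exists():
            with path.open("rb") as fh:
                return pickle.load(fh), True
        value = eval(code, {}, env)  # illustration only: trusted input assumed
        with path.open("wb") as fh:
            pickle.dump(value, fh)
        return value, False


def demo():
    # The temporary directory stands in for a cache that an author
    # might later copy to a public repository.
    with tempfile.TemporaryDirectory() as tmp:
        cache = ComputationCache(tmp)
        data = {"x": [1, 2, 3, 4]}
        first = cache.evaluate("sum(x) / len(x)", data)   # computed and stored
        second = cache.evaluate("sum(x) / len(x)", data)  # read back from disk
        return first, second
```

Calling `demo()` evaluates the expression once and serves the second request from the stored file. The sketch keys only on the code text, so it would not notice a change in the input data; a real system would need some way to account for that.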

Disciplines

Numerical Analysis and Computation

Suggested Citation

Peng, Roger and Eckel, Sandrah P., "DISTRIBUTED REPRODUCIBLE RESEARCH USING CACHED COMPUTATIONS" (June 2007). Johns Hopkins University, Dept. of Biostatistics Working Papers. Working Paper 147.
https://biostats.bepress.com/jhubiostat/paper147

Included in

Numerical Analysis and Computation Commons

Collection of Biostatistics Research Archive
