The predictive capacity of a marker in a population can be described using the population distribution of risk (Huang et al., 2007; Pepe et al., 2008a; Stern, 2008). Virtually all standard statistical summaries of predictability and discrimination can be derived from it (Gail and Pfeiffer, 2005). The goal of this paper is to develop methods for making inference about risk prediction markers using summary measures derived from the risk distribution. We describe some new clinically motivated summary measures and give new interpretations to some existing statistical measures. Methods for estimating these summary measures are described, along with distribution theory that facilitates construction of confidence intervals from data. We show how markers and, more generally, risk prediction models can be compared using clinically relevant measures of predictability. The methods are illustrated by application to markers of lung function and nutritional status for predicting subsequent onset of major pulmonary infection in children suffering from cystic fibrosis. Simulation studies show that the methods for inference are valid for use in practice.


Statistical Methodology | Statistical Theory