In constructing predictive models, investigators frequently assess the incremental value of a predictive marker by comparing the ROC curve generated from the predictive model that includes the new marker with the ROC curve from the model that excludes it. Many commentators have noticed empirically that a test of the two ROC areas often produces a non-significant result when the corresponding Wald test from the underlying regression model is significant. A recent article showed, using simulations, that the widely used ROC area test [1] has an exceptionally conservative test size and extremely low power [2]. In this article we show why the ROC area test is invalid in this context. We demonstrate how a valid test of the ROC areas can be constructed that has statistical properties comparable to those of the Wald test. We conclude that using the Wald test to assess the incremental contribution of a marker remains the best strategy. We also examine the use of derived markers from non-nested models and the use of validation samples, and show that comparing ROC areas is invalid in these contexts as well.


