A two-channel microarray measures the relative expression levels of thousands of genes in a pair of biological samples. To compare gene expression levels reliably between and within arrays, systematic errors that distort the biological signal of interest must be removed. The standard approach is to smooth "MA-plots" to remove intensity-dependent dye bias and array-specific effects. However, MA methods rely on strong assumptions. We review these assumptions and derive several practical scenarios in which they fail. The "dye-swap" normalization method has been used far less often because it requires two arrays per pair of samples. We show that a dye-swap is accurate under general assumptions, even in the presence of intensity-dependent dye bias, and that, in general, a dye-swap provides the minimal information required to remove dye bias from a pair of samples. Based on a flexible model of the relationship between mRNA amount and single-channel fluorescence intensity, we demonstrate the general applicability of a dye-swap approach. We then propose a common array dye-swap (CADS) method for the normalization of two-channel microarrays. We show that CADS removes both dye bias and array-specific effects and preserves the true differential expression signal for every gene. Finally, we discuss possible extensions of CADS that circumvent the need to use two arrays per pair of samples.
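
To make the basic dye-swap idea concrete, the following minimal Python sketch shows how a per-gene dye bias cancels when the log ratios from a dye-swapped pair of arrays are combined. It illustrates only the textbook dye-swap average, not the CADS method itself. It assumes a simplified model in which each gene's dye bias is a multiplicative gain on the red channel that is identical on both arrays, with no array-specific effects and no noise. The function names `dye_swap_log_ratios` and `simulate_pair` and all numeric values are hypothetical and chosen for illustration.

```python
import math
import random


def dye_swap_log_ratios(array1, array2):
    """Combine a dye-swap pair into per-gene log2 ratios (sample A vs. sample B).

    array1: list of (red, green) intensities with A labeled red, B labeled green.
    array2: list of (red, green) intensities with B labeled red, A labeled green.
    """
    estimates = []
    for (r1, g1), (r2, g2) in zip(array1, array2):
        m1 = math.log2(r1 / g1)  # true log ratio + dye bias (under the assumed model)
        m2 = math.log2(r2 / g2)  # -(true log ratio) + dye bias
        estimates.append((m1 - m2) / 2.0)  # the shared dye bias cancels
    return estimates


def simulate_pair(true_log_ratios, dye_bias, seed=0):
    """Simulate noise-free intensities for a dye-swap pair (illustrative model only)."""
    rng = random.Random(seed)
    array1, array2 = [], []
    for d, b in zip(true_log_ratios, dye_bias):
        base = rng.uniform(6.0, 12.0)   # log2 abundance of the gene in sample B
        amount_a = 2.0 ** (base + d)    # sample A abundance
        amount_b = 2.0 ** base          # sample B abundance
        red_gain = 2.0 ** b             # assumed gene-specific red-channel gain
        array1.append((amount_a * red_gain, amount_b))  # A in red, B in green
        array2.append((amount_b * red_gain, amount_a))  # B in red, A in green
    return array1, array2


true_d = [1.0, 0.0, -2.0, 0.5]   # hypothetical true log2 ratios (A vs. B)
bias = [0.3, -0.4, 0.8, 0.1]     # hypothetical per-gene dye bias on the log2 scale
arr1, arr2 = simulate_pair(true_d, bias)

estimates = dye_swap_log_ratios(arr1, arr2)           # dye-swap estimate
naive = [math.log2(r / g) for r, g in arr1]           # single-array estimate
```

Under these assumptions, `estimates` recovers `true_d` up to floating-point error, whereas `naive` is shifted by the dye bias of each gene. Real data would also involve measurement noise and array-specific effects, which this sketch deliberately omits.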


Bioinformatics | Computational Biology | Microarrays | Statistical Methodology | Statistical Theory