Background: Comparison of mRNA expression levels across biological samples is a widely used approach in genomics. Available data-analytic tools for deriving comprehensive lists of differentially expressed genes rely on data summaries formed using each gene in isolation from others. These approaches ignore biological relationships among genes and may miss important biological insight provided by genomics data.

Methods: We propose a fast, easily interpretable and scalable approach for identifying pairs of genes that are differentially expressed across phenotypes or experimental conditions. These are defined as pairs for which phenotype discrimination is detectable from the joint distribution of the two genes, but not from either of their marginal distributions. Our approach is based on comparing the phenotype-specific gene correlation matrices to the overall gene correlation matrix.
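As a rough illustration of the idea of comparing phenotype-specific correlation matrices to the overall correlation matrix, the following Python sketch scores each gene pair by the largest absolute gap between any within-phenotype correlation and the pooled correlation. The function names `correlation_score` and `top_pairs`, the samples-by-genes input layout, and the particular score are assumptions made for illustration only; the abstract does not specify the statistic actually used.

```python
import numpy as np


def correlation_score(X, y):
    """Illustrative pairwise score (an assumption, not the published statistic).

    X : array of shape (n_samples, n_genes) with expression values.
    y : array of shape (n_samples,) with phenotype labels.
    Returns an (n_genes, n_genes) matrix whose entry (i, j) is the largest
    absolute difference between a phenotype-specific correlation of genes
    i and j and their overall (pooled) correlation.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Overall gene-gene correlation matrix, ignoring phenotype.
    overall = np.corrcoef(X, rowvar=False)
    score = np.zeros_like(overall)
    for label in np.unique(y):
        # Correlation matrix restricted to samples of one phenotype.
        within = np.corrcoef(X[y == label], rowvar=False)
        score = np.maximum(score, np.abs(within - overall))
    # A gene paired with itself carries no information.
    np.fill_diagonal(score, 0.0)
    return score


def top_pairs(score, k=5):
    """Return the k highest-scoring gene pairs as (i, j, score) with i < j."""
    i, j = np.triu_indices_from(score, k=1)
    order = np.argsort(score[i, j])[::-1][:k]
    return [(int(i[o]), int(j[o]), float(score[i[o], j[o]])) for o in order]
```

Because the score is built from full correlation matrices, all pairs are evaluated with a handful of matrix operations rather than an explicit loop over pairs, which is one plausible reading of the "fast" and "scalable" claims. Genes with zero variance within a phenotype would yield undefined correlations and would need to be filtered out beforehand.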

Results: Application of our approach to two cancer datasets demonstrates that these experiments include gene pairs that show a detectable relationship with phenotype only when considered jointly. Also, the gene pairs identified by our method have a tendency to share biological relationships, as evidenced by further investigation of available information on gene function.

Conclusions: Important information on gene function, phenotype-related dependencies, and interactions among genes can be gleaned by systematic searches that compare the joint distributions of all possible gene pairs across conditions.


