This article focuses on the design and analysis of mRNA-Seq experiments, with the aim of inferring transcript levels and identifying differentially expressed genes. We investigate two mRNA-Seq datasets, obtained using Illumina's Genome Analyzer platform, that measure transcript levels in the reference samples considered in the MicroArray Quality Control (MAQC) Project. We address four main issues: (1) exploratory data analysis for mapped reads, relating read counts to variables describing input samples and genomic regions of interest; (2) assessment and quantitation of biological effects (e.g., expression levels in Brain vs. UHR) and nuisance experimental effects (e.g., library preparation, flow-cell, and lane effects); (3) evaluation and comparison of methods for identifying differentially expressed genes; and (4) the impact of the base-calling calibration method (phi X vs. auto-calibration).


Bioinformatics | Computational Biology