Motivation: MicroRNAs (miRNAs) are short single-stranded non-coding molecules that usually function as negative regulators to silence or suppress gene expression. Owing to interest in the dynamic nature of miRNAs and to reduced microarray and sequencing costs, a growing number of researchers are now measuring high-dimensional miRNA expression data using repeated or multiple measures, in which each individual has more than one sample collected and measured over time. However, the commonly used site-by-site multiple testing may impair the value of repeated or multiple measures data by ignoring the inherent dependence structure, which leads to problems including underpowered results after multiple comparison correction using false discovery rate (FDR) estimation and less biologically meaningful results. Hence, new methods are needed to tackle these issues.

Results: We propose a penalized regression model incorporating a grid search method (PGS) for association analysis of high-dimensional miRNA expression data with repeated measures. The development of this analytical framework was motivated by a real-world miRNA dataset. Comparisons between PGS and site-by-site testing revealed that PGS provided smaller phenotype prediction errors and higher enrichment of phenotype-related biological pathways. A simulation study showed that PGS provided more accurate estimates and higher sensitivity than site-by-site testing, with comparable specificity.

Availability: R source code for the PGS algorithm, an implementation example, and the simulation study is available for download at https://github.com/feizhe/PGS.


