"Comparison of Haplotype-based and Tree-based SNP Imputation in Associa" by James Y. Dai, Ingo Ruczinski et al.

UW Biostatistics Working Paper Series

Title

Comparison of Haplotype-based and Tree-based SNP Imputation in Association Studies

Authors

James Y. Dai, University of Washington
Ingo Ruczinski, Johns Hopkins University
Michael LeBlanc, Fred Hutchinson Cancer Research Center
Charles Kooperberg, Fred Hutchinson Cancer Research Center

Comments

James Y. Dai and Ingo Ruczinski contributed equally to this work.

Abstract

Missing single nucleotide polymorphisms (SNPs) are quite common in genetic association studies. Subjects with missing SNPs are often discarded in analyses, which may seriously undermine the inference of SNP-disease association. In this article, we compare two haplotype-based imputation approaches and one regression tree-based imputation approach for association studies. The goal is to assess the imputation accuracy, and to evaluate the impact of imputation on parameter estimation. Haplotype-based approaches build on haplotype reconstruction by the expectation-maximization (EM) algorithm or a weighted EM (WEM) algorithm, depending on whether case-control status is taken into account. The tree-based approach uses a Gibbs sampler to iteratively sample from a full conditional distribution, which is obtained from the classification and regression tree (CART) algorithm. We employ a standard multiple imputation procedure to account for the uncertainty of imputation. We apply the methods to simulated data as well as a case-control study on developmental dyslexia. Our results suggest that imputation generally improves over the standard practice of ignoring missing data in terms of bias and efficiency. The haplotype-based approaches slightly outperform the tree-based approach when there are a small number of SNPs in linkage disequilibrium (LD), but the latter has a computational advantage. Finally, we demonstrate that utilizing the disease status in imputation helps to reduce the bias in the subsequent parameter estimation.
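
The abstract mentions a standard multiple imputation procedure but does not spell out how the analyses of the imputed data sets are combined. A common choice is Rubin's rules, and the sketch below assumes that is what is meant. The function name `pool_rubin` and the example numbers are hypothetical illustrations, not taken from the paper.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m completed-data results with Rubin's rules (assumed procedure).

    estimates: one parameter estimate (e.g. a log odds ratio) per imputed data set
    variances: the squared standard error of each of those estimates
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")
    q_bar = mean(estimates)      # pooled point estimate
    w = mean(variances)          # within-imputation variance
    b = variance(estimates)      # between-imputation variance (divisor m - 1)
    t = w + (1 + 1 / m) * b      # total variance
    return q_bar, t


# Hypothetical results from three imputed data sets.
pooled, total_var = pool_rubin([0.40, 0.50, 0.60], [0.04, 0.04, 0.04])
```

The between-imputation term is what carries the uncertainty of imputation into the final standard error; analysing a single imputed data set would omit it and understate the variance.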

Disciplines

Biostatistics

Suggested Citation

Dai, James Y.; Ruczinski, Ingo; LeBlanc, Michael; and Kooperberg, Charles, "Comparison of Haplotype-based and Tree-based SNP Imputation in Association Studies" (January 2006). UW Biostatistics Working Paper Series. Working Paper 278.
https://biostats.bepress.com/uwbiostat/paper278

Included in

Biostatistics Commons

Collection of Biostatistics Research Archive
