Published in The American Journal of Human Genetics, Vol. 78, April 2006, pp. 615-628.


High-throughput genotyping technologies for single nucleotide polymorphisms (SNPs) enabled the recent completion of Phase I of the International HapMap Project, which has stimulated much interest in studying genome-wide linkage disequilibrium (LD) patterns. Conventional LD measures, such as D' and r², are two-point measurements, and their relationship with physical distance is highly noisy. We propose a new LD measure, defined as the correlation coefficient of shared haplotype lengths around two loci, which thereby borrows information from multiple loci. We develop a U-statistic-based estimator of the new LD measure that accounts for the dependence structure of the observed data, and we compare it with a naive estimator based on the usual empirical correlation coefficient. Furthermore, we propose methods for inferring LD decay rates based on the new LD measure. Results from coalescent simulation studies and from analysis of HapMap SNP data demonstrate that the proposed LD measure and its estimators are superior to the two most popular conventional LD measures in three respects: a clearer relationship with physical distance and recombination rate, smaller variability, and greater robustness to marker allele frequencies. These merits may offer new opportunities for mapping complex disease genes and investigating recombination mechanisms based on better-quantified LD.
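For readers unfamiliar with the two conventional two-point measures named above, the following minimal Python sketch computes |D'| and r² from phased haplotypes at a pair of biallelic loci, using the standard textbook definitions. It is an illustration only and is not taken from the article: the function name `two_locus_ld`, the 0/1 allele coding, and the toy data are assumptions made here, and the sketch does not implement the proposed haplotype-sharing measure or its U-statistic estimator.

```python
def two_locus_ld(haplotypes):
    """Return (|D'|, r^2) for two biallelic loci.

    `haplotypes` is a list of (a, b) pairs, one per phased haplotype,
    with alleles coded 0/1 at locus A and locus B (a coding assumed
    for this illustration).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    # Both measures are undefined when either locus is monomorphic.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")

    # Raw disequilibrium coefficient D = P(AB) - P(A) P(B).
    d = p_ab - p_a * p_b

    # D' rescales D by its maximum attainable magnitude given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))

    d_prime = abs(d) / d_max
    r2 = d * d / denom
    return d_prime, r2


# Toy data (hypothetical): three of the four possible haplotypes observed.
toy = [(1, 1)] * 3 + [(1, 0)] * 1 + [(0, 0)] * 4
d_prime, r2 = two_locus_ld(toy)
```

On this toy sample |D'| equals 1 while r² is about 0.6, because the two loci have different allele frequencies. This is a small instance of the allele-frequency dependence of the conventional measures that the abstract contrasts with the proposed measure.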


Statistical Methodology | Statistical Models | Statistical Theory