An important step in building a multiple regression model is the selection of predictors. In genomic and epidemiologic studies, datasets with a small sample size and a large number of predictors are common. In such settings, most standard methods for identifying a good subset of predictors are unstable. Furthermore, there is an increasing emphasis on the identification of interactions, a problem that has received little attention in the statistical literature. We propose a method, called BSI (Bayesian Selection of Interactions), for selecting predictors in a regression setting when the number of predictors is considerably larger than the sample size, with a focus on selecting interactions. Latent variables are used to infer subset choices based on the posterior distribution. Inference about interactions is implemented by a constraint on the latent variables. The posterior distribution is computed using Gibbs sampling. The finite-sample properties of the proposed method are assessed by simulation studies. We illustrate the BSI method by analyzing data from a hypertension study involving single nucleotide polymorphisms (SNPs).
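The general idea of latent-indicator selection with a constraint on interactions can be illustrated with a minimal sketch. This is not the BSI algorithm itself, whose details the abstract does not give. The sketch assumes a normal prior on the coefficients, unit noise variance, a centered response, and a hierarchy constraint under which an interaction indicator may equal 1 only when both of its main effects are included. The names `log_marginal`, `gibbs_select`, `prior_inc`, and `tau2` are hypothetical.

```python
import numpy as np
from itertools import combinations

def log_marginal(y, X, gamma, tau2=1.0):
    """Log marginal likelihood of y, up to a constant, for the subset gamma.

    Assumes beta ~ N(0, tau2 * I) and unit noise variance, so that
    y ~ N(0, I + tau2 * Xg Xg^T). Works even when the subset is empty.
    """
    n = len(y)
    Xg = X[:, gamma]
    cov = np.eye(n) + tau2 * Xg @ Xg.T
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * logdet - 0.5 * y @ np.linalg.solve(cov, y)

def gibbs_select(y, Z, n_iter=200, prior_inc=0.1, tau2=1.0, seed=0):
    """Gibbs sampler over inclusion indicators for main effects and
    pairwise interactions, restricted to the constrained model space."""
    rng = np.random.default_rng(seed)
    n, p = Z.shape
    pairs = list(combinations(range(p), 2))
    # Design matrix: p main effects followed by all pairwise products.
    X = np.hstack([Z] + [(Z[:, a] * Z[:, b])[:, None] for a, b in pairs])
    q = X.shape[1]
    gamma = np.zeros(q, dtype=bool)   # latent inclusion indicators
    counts = np.zeros(q)
    prior_log_odds = np.log(prior_inc) - np.log1p(-prior_inc)
    for _ in range(n_iter):
        for j in range(q):
            if j < p:
                # A main effect cannot be resampled while an included
                # interaction depends on it (dropping it would break the constraint).
                if any(gamma[p + k] for k, (a, b) in enumerate(pairs) if j in (a, b)):
                    continue
            else:
                # An interaction is eligible only if both parents are included.
                a, b = pairs[j - p]
                if not (gamma[a] and gamma[b]):
                    continue
            g1 = gamma.copy(); g1[j] = True
            g0 = gamma.copy(); g0[j] = False
            log_odds = (log_marginal(y, X, g1, tau2)
                        - log_marginal(y, X, g0, tau2) + prior_log_odds)
            prob_in = 1.0 / (1.0 + np.exp(-np.clip(log_odds, -700, 700)))
            gamma[j] = rng.random() < prob_in
        counts += gamma
    # Posterior inclusion frequencies (no burn-in is discarded in this sketch).
    return counts / n_iter, pairs
```

Calling `gibbs_select(y, Z)` with a centered response `y` and an `n x p` predictor matrix `Z` returns estimated inclusion frequencies for the `p` main effects followed by the pairwise interactions, together with the list of index pairs. Because every sampled state satisfies the constraint, an interaction's frequency never exceeds that of either parent main effect.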
Genetics | Numerical Analysis and Computation | Statistical Models
Chen, Wei; Ghosh, Debashis; Raghunathan, Trivellore E.; and Kardia, Sharon, "A Bayesian method for finding interactions in genomic studies" (November 2004). The University of Michigan Department of Biostatistics Working Paper Series. Working Paper 48.