ABSTRACT The traditional method for creating a gene score to predict a given outcome is to use the most statistically significant single nucleotide polymorphisms (SNPs) among all the SNPs tested. This approach has several disadvantages. It excludes SNPs that do not have strong effects when tested individually but do have strong joint effects when tested together with other SNPs. In addition, results from the traditional gene score may lack biological insight, since the functional unit of interest is often the gene rather than the single SNP. In this paper we present a new gene scoring method that overcomes these problems by generating a gene score for each gene as well as a total gene score across all available genes. First, we calculate a gene score for each gene; second, we test the association between this gene score and the outcome of interest (i.e., the trait). Only gene scores that remain significantly associated with the outcome after multiple testing correction for the number of gene tests (not the number of SNPs) are included in the total gene score calculation. This method controls false positive results caused by multiple tests within genes and between genes separately and, compared with the Bonferroni correction, false discovery rate (FDR), and permutation tests applied to all SNPs, has the advantage of identifying multi-locus genetic effects. Another main feature of the method is that, within each gene, we use adjustment in multiple regression to select SNPs that have different effects, and then combine the information from the selected SNPs to create the gene score. A simulation study was conducted to evaluate the finite-sample performance of the proposed method.
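The gene-level selection step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the per-gene scores and their association p-values have already been computed, it uses a Bonferroni threshold divided by the number of genes as one possible form of the "multiple testing correction for the number of gene tests", and it assumes the total gene score is the per-individual sum of the selected gene scores. The function names, gene labels, and toy values are all hypothetical.

```python
from typing import Dict, List


def select_genes(p_values: Dict[str, float], alpha: float = 0.05) -> List[str]:
    """Keep genes whose p-value passes a Bonferroni threshold based on the
    number of gene-level tests, not the number of SNPs (assumed correction)."""
    if not p_values:
        return []
    threshold = alpha / len(p_values)
    return sorted(gene for gene, p in p_values.items() if p < threshold)


def total_gene_score(gene_scores: Dict[str, List[float]],
                     selected: List[str]) -> List[float]:
    """Combine the selected genes' scores per individual.
    Summation is an assumption; the abstract does not state the rule."""
    n_individuals = len(next(iter(gene_scores.values())))
    return [sum(gene_scores[gene][i] for gene in selected)
            for i in range(n_individuals)]


# Hypothetical toy data: three genes, three individuals.
p_values = {"GENE_A": 0.001, "GENE_B": 0.03, "GENE_C": 0.012}
gene_scores = {
    "GENE_A": [0.5, 1.0, 0.0],
    "GENE_B": [0.2, 0.1, 0.4],
    "GENE_C": [1.0, 0.0, 0.5],
}

selected = select_genes(p_values)            # threshold is 0.05 / 3
total = total_gene_score(gene_scores, selected)
print(selected, total)
```

With three genes the threshold is 0.05 / 3, so GENE_B (p = 0.03) is dropped even though it would pass an uncorrected 0.05 cutoff. Had the correction been based on the number of SNPs instead, the threshold would be far stricter and could exclude genes whose effect is spread over several SNPs.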
Cite this paper
C. Xie, "A Gene Score Test for Disease Association with Multiple Genes," Open Journal of Statistics, Vol. 1 No. 1, 2011, pp. 15-18. doi: 10.4236/ojs.2011.11002.