World J Gastroenterol. Nov 21, 2009; 15(43): 5377-5396
Published online Nov 21, 2009. doi: 10.3748/wjg.15.5377
Table 1 Phases in the initiation and analysis of a genome-wide association study
Sample panel building
Cases and healthy controls of same ethnicity (for power estimates see figure 2)
Enrichment with early onset cases and/or familial cases
Keep variability in phenotype to a minimum
Establish replication cohort(s) following the same principles. Other, yet similar, ethnicities may be included, provided that matched healthy controls are collected
Sample preparation (DNA extraction, calibration)
Genotyping chip (cost vs number of samples)
Genetic coverage
Initial quality control
Exclude samples failing platform-specific QC measures
Exclude samples with low call-rate
Exclude SNPs with a low genotyping rate
Exclude SNPs with a low minor allele frequency and those grossly out of Hardy-Weinberg equilibrium (e.g. P < 10⁻⁴)
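
The SNP-level filters above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any particular genotyping platform: the function names are hypothetical, the call-rate and minor allele frequency cut-offs (0.95 and 0.01) are placeholder values not given in the table, and only the Hardy-Weinberg threshold of 10^-4 comes from the table's example. The Hardy-Weinberg check uses a 1-df chi-square goodness-of-fit test, which is commonly applied to control samples only; an exact test may be preferable for rare alleles.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1 - freq_a)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no departure can be tested
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))


def passes_snp_qc(counts, n_samples,
                  min_call_rate=0.95,  # assumed cut-off, not from the table
                  min_maf=0.01,        # assumed cut-off, not from the table
                  min_hwe_p=1e-4):     # example threshold given in the table
    """Return True if a SNP passes call-rate, MAF and HWE filters.

    counts    -- (AA, AB, BB) genotype counts among called samples
    n_samples -- total number of samples attempted for this SNP
    """
    n_called = sum(counts)
    if n_called == 0:
        return False
    if n_called / n_samples < min_call_rate:
        return False
    if minor_allele_frequency(*counts) < min_maf:
        return False
    return hwe_chi2_pvalue(*counts) >= min_hwe_p
```

For instance, `passes_snp_qc((250, 500, 250), n_samples=1000)` accepts a SNP whose genotype counts match Hardy-Weinberg proportions exactly, whereas a SNP with no heterozygotes at all, such as `(500, 0, 500)`, is rejected by the Hardy-Weinberg filter.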
Statistical analysis
Imputation of non-genotyped SNPs using HapMap as the reference
Single-point association analysis; if needed, include covariates of interest in the present study (e.g. sex, smoking, imputation uncertainty)
Manually inspect cluster plots for highly significant SNPs that should be followed up
Select 1-2 SNPs from each associated locus to take forward in replication
Genotype (preferably with an independent technology) in a panel of cases and healthy controls that is properly sized to detect effects in the same range as seen in the discovery panel
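
As an illustration of what a single-point association analysis computes for one SNP, the sketch below runs an allelic chi-square test on a 2x2 table of allele counts (cases vs controls) and reports the odds ratio. This is only one common form of single-point test and is an assumption here; the table does not specify which test is used. The function names are hypothetical, and the sketch does not handle covariates, which would typically require a regression model such as logistic regression.

```python
import math


def allele_counts(n_aa, n_ab, n_bb):
    """Convert genotype counts (AA, AB, BB) into allele counts (A, B)."""
    return 2 * n_aa + n_ab, 2 * n_bb + n_ab


def allelic_association(case_genotypes, control_genotypes):
    """Allelic chi-square test (1 df) for a single SNP.

    Both arguments are (AA, AB, BB) genotype counts.
    Returns (chi2, p_value, odds_ratio) for allele A in cases vs controls.
    """
    a, b = allele_counts(*case_genotypes)      # case alleles A, B
    c, d = allele_counts(*control_genotypes)   # control alleles A, B
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("empty row or column: SNP is not testable")
    # Pearson chi-square for a 2x2 table, without continuity correction
    chi2 = n * (a * d - b * c) ** 2 / math.prod(margins)
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi2, p_value, odds_ratio
```

With 1000 cases carrying genotype counts `(300, 500, 200)` and 1000 controls carrying `(200, 500, 300)`, the function returns a chi-square statistic of 40 and an odds ratio of about 1.49 for allele A. In a genome-wide setting the resulting P-value would still have to be judged against a threshold corrected for the number of SNPs tested.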
Follow-up experiments
Depend heavily on the results, i.e. the nature of the genetic finding, and are normally not part of the GWAS design