Genetic variations in colorectal cancer risk and clinical outcome
Colorectal cancer (CRC) has an apparent hereditary component, as evidenced by the well-characterized genetic syndromes and family history associated with the increased risk of this disease. However, in a large fraction of CRC cases, no known genetic syndrome or family history can be identified, suggesting the presence of “missing heritability” in CRC etiology. The genome-wide association study (GWAS) platform has led to the identification of multiple replicable common genetic variants associated with CRC risk. These newly discovered genetic variations might account for a portion of the missing heritability. Here, we summarize the recent GWASs related to newly identified genetic variants associated with CRC risk and clinical outcome. The findings from these studies suggest that there is a lack of understanding of the mechanism of many single nucleotide polymorphisms (SNPs) that are associated with CRC. In addition, the utility of SNPs as prognostic markers of CRC in clinical settings remains to be further assessed. Finally, the currently validated SNPs explain only a small fraction of total heritability in complex-trait diseases like CRC. Thus, the “missing heritability” still needs to be explored further. Future epidemiological and functional investigations of these variants will add to our understanding of CRC pathogenesis, and may ultimately lead to individualized strategies for prevention and treatment of CRC.

Key Words: Colorectal cancer; Genome-wide association study; Single nucleotide polymorphism; Signal transduction pathways; Cell cycle control; Gene desert; Genome instability

Core tip: This review covers the recent advances in genome-wide association studies (GWASs) that have identified genetic variants associated with an altered risk of colorectal cancer (CRC). In this review, we summarize single nucleotide polymorphisms (SNPs) located in or near genes that play crucial roles in signal transduction pathways, genome stability, cell cycle control, and gene expression and regulation. SNPs that are found in gene desert regions are also discussed. The relationship between genetic variations and clinical outcomes in CRC is presented from epidemiological studies that have identified SNPs with methods other than GWASs.


It is estimated that 35% of colorectal cancer (CRC) risk may be explained by heritable factors[1]. Heritable factors include well-characterized genetic syndromes inherited in a straightforward Mendelian manner, such as familial adenomatous polyposis and hereditary non-polyposis colorectal cancer, also known as Lynch Syndrome[2]. It is estimated that cumulatively, these and other well-characterized genetic syndromes with Mendelian mode of inheritance account for up to 10% of all CRC cases. In an estimated further 25% of cases, family history contributes to CRC risk in the absence of one of these identifiable genetic syndromes. The important role of family history in CRC risk is reflected in the guidelines published by the American College of Gastroenterology and the American Cancer Society, which recommend starting screening colonoscopies at an age cutoff that is a function of family history[3].

The combined effect of genetic syndromes and family history may explain up to 30% of CRC susceptibility, whereas the remaining genetic risk of CRC may be accounted for by a combination of common genetic variants with high prevalence and low penetrance. Recent advances in genome-wide association studies (GWASs) have enabled the identification of multiple CRC-related single nucleotide polymorphisms (SNPs)[4-10]. These genetic variants can be broadly classified into two categories: those that affect the risk of developing CRC, and those that influence the clinical course of CRC once established. In this review, we summarize these GWAS-identified genetic variants - including functional characterizations and implications for clinical applications - and discuss some of the limitations and challenges of these studies.


By comparing the distributions of millions of tagged SNPs between CRC patients (cases) and cancer-free populations (controls), GWASs have identified a large number of common genetic variants under the “common disease-common variant” premise. To date, more than 40 chromosome regions harboring common variants conferring altered CRC risk have been identified by the GWAS approach. These variants are dispersed across almost every human chromosome, and the vast majority of them exhibit a small effect size (Table 1). Most of these loci confer a modest increase in CRC risk, typically with an odds ratio (OR) of less than 1.20. Among the 48 SNPs listed in Table 1, eight had an OR of more than 1.20, of which only three exhibited an OR of more than 1.30. A higher effect size (OR = 2.64) was reported for rs6038071, located upstream of the CSNK2A1 gene and validated in familial CRC populations, although only under a recessive genetic model with the least statistical power[11]. The majority of GWAS-identified CRC risk variants are involved in known biological pathways; however, a few highly significant ones reside in gene desert regions, and the mechanism by which these variants contribute to colorectal carcinogenesis remains unclear. Here, we summarize these variants in relation to their implications in pathways of signal transduction, genome instability, cell cycle control, and gene expression and regulation (Table 1). These pathways and related significant SNPs identified by GWASs are also depicted in Figure 1. This figure was produced by combining pathways from various studies[12-15].
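To make the effect sizes discussed above concrete, the following is a minimal sketch of how an allelic OR and its 95% confidence interval can be computed from a 2×2 table of allele counts in cases and controls. It uses the standard log-OR (Woolf) approximation; the function name and the allele counts are hypothetical and are not taken from any of the cited studies.

```python
import math

def allelic_odds_ratio(case_risk, case_other, ctrl_risk, ctrl_other):
    """Return (OR, lower, upper) for a 2x2 allele-count table.

    The 95% confidence interval uses the normal approximation on the
    log odds ratio (Woolf's method).
    """
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    se_log_or = math.sqrt(
        1 / case_risk + 1 / case_other + 1 / ctrl_risk + 1 / ctrl_other
    )
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - 1.96 * se_log_or)
    upper = math.exp(log_or + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical allele counts (2 alleles per person): risk vs. other allele.
odds_ratio, ci_low, ci_high = allelic_odds_ratio(
    case_risk=1100, case_other=2900, ctrl_risk=1000, ctrl_other=3000
)
print(f"OR = {odds_ratio:.2f} (95% CI {ci_low:.2f}-{ci_high:.2f})")
```

With counts of this size the interval is narrow even though the OR is small, which is why modest effects of the kind listed in Table 1 can only be detected reliably in large case-control series.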

Table 1 Genome-wide association study-identified common genetic variants associated with colorectal cancer risk.
SNP | Loci | Gene | Full name of gene | OR | P value | Pathway/function | Ref. | Method
Common biological pathway-related
rs12701937 | 7p14.1 | GLI3 and INHBA | GLI family zinc finger 3 and inhibin, beta A | 1.36 | 3.50E-05 | MAPK signaling pathways | [11] | G
rs11014993 | 10p12.1 | MYO3A | Myosin IIIA | 1.22 | 2.00E-03 | MAPK signaling pathways | [11] | G
rs59336 | 12q24.21 | TBX3 | T-box 3 | 1.09 | 2.46E-06 | Wnt pathway | [10] | G + M
rs4444235 | 14q22.2 | BMP4 | Bone morphogenetic protein 4 | 1.09 | 1.95E-11 | BMP pathway | [21] | G
rs1957636 | 14q22.2 | BMP4 | Bone morphogenetic protein 4 | 1.08 | 1.36E-09 | BMP pathway | [21] | G
rs16969681 | 15q13.3 | GREM1 | DAN family BMP antagonist | - | 5.33E-08 | BMP pathway | [21] | G
rs4779584 | 15q13.3 | GREM1 | DAN family BMP antagonist | - | 5.27E-03 | BMP pathway | [21] | G
rs11632715 | 15q13.3 | GREM1 | DAN family BMP antagonist | - | 2.30E-10 | BMP pathway | [21] | G
rs4939827 | 18q21 | SMAD7 | SMAD family member 7 | 1.2 | 7.80E-28 | TGF-β1 pathway, cell arrest, cell proliferation | [5,6,19] | G
rs12953717 | 18q21 | SMAD7 | SMAD family member 7 | 1.17 | 9.10E-12 | TGF-β and Wnt signaling | [5] | G
rs4464148 | 18q21 | SMAD7 | SMAD family member 7 | 1.15 | 6.66E-08 | TGF-β and Wnt signaling | [5] | G
rs961253 | 20p12.3 | BMP2 | Bone morphogenetic protein 2 | 1.12 | 4.45E-16 | BMP pathway | [21] | G
rs4813802 | 20p12.3 | BMP2 | Bone morphogenetic protein 2 | 1.09 | 7.52E-11 | BMP pathway | [21] | G
rs6038071 | 20p13 | CSNK2A1 | Casein kinase 2, alpha 1 polypeptide | 2.64 | 3.00E-04 | MAPK signaling pathways | [11] | G
Genome instability-related
rs11903757 | 2q32.3 | NABP1 | Nucleic acid binding protein 1 | 1.16 | 9.50E-08 | DNA repair, genomic stability | [10] | G + M
rs647161 | 5q31.1 | PITX1 | Paired-like homeodomain transcription factor 1 | 1.11 | 1.22E-10 | RAS pathway; activate TP53; telomerase activity | [29] | G + M
rs1321311 | 6p21 | CDKN1A | Cyclin-dependent kinase inhibitor 1A | 1.1 | 1.14E-10 | Microsatellite instability, DNA repair, genomic instability | [26] | G + M
rs3824999 | 11q13.4 | POLD3 | Polymerase DNA-directed δ3 | 1.08 | 3.65E-10 | DNA mismatch and base-excision repair | [26] | G + M
rs78378222 | 17p13 | TP53 | Promoter region of TP53 gene | 1.39 | 1.60E-04 | TP53 | [63] | G
Cell cycle control-related
rs10911251 | 1q25.3 | LAMC1 | Laminin gamma 1 | 1.09 | 5.90E-08 | Gene transcription | [10] | G + M
rs6691170 | 1q41 | DUSP10 | Dual-specificity phosphatase | 1.06 | 9.55E-10 | Inactivates p38 and SAPK | [9] | M
rs6687758 | 1q41 | DUSP10 | Dual-specificity phosphatase | 1.09 | 2.27E-09 | Inactivates p38 and SAPK | [9] | M
rs886774 | 7q31 | LAMB1 | Laminin β1 | 1.17 | 3.00E-08 | Anchoring the single-layered epithelium, ulcerative colitis | [33] | G
rs3802842 | 11q23 | POU2AF1 | POU class 2 associating factor 1 | 1.1 | 5.80E-10 | Growth of multiple myeloma cells | [6] | G
rs10774214 | 12p13.32 | CCND2 | Cyclin D2 | 1.09 | 3.06E-08 | Cell-cycle transition | [29] | G + M
rs3217810 | 12p13.32 | CCND2 | Cyclin D2 | 1.2 | 3.70E-07 | Cell-cycle transition | [10] | G + M
rs3217901 | 12p13.32 | CCND2 | Cyclin D2 | 1.1 | < 5.0E-7 | Cell-cycle transition | [10] | G + M
rs11169552 | 12q13.13 | DIP2 | Disco-interacting protein 2B | 1.09 | 1.89E-10 | Cell morphogenesis | [9] | M
rs1728785 | 16q22 | CDH1 | E-cadherin | 1.17 | 2.80E-08 | Epithelial restitution, repair following mucosal damage, active colitis | [33] | G
rs10411210 | 19q13.33 | RHPN2 | Rho GTPase binding protein 2 | 1.15 | 5.00E-09 | Actin cytoskeleton | [20] | G + M
rs4925386 | 20q13.33 | LAMA5 | Large laminin A5 | 1.08 | 1.89E-10 | BMP pathway | [9] | M
rs5934683 | Xp22.2 | SHROOM2 | Shroom family member 2 | 1.07 | 7.30E-10 | Cell morphogenesis | [26] | G + M
Gene expression and regulation-related
rs16892766 | 8q23.3 | EIF3H | Eukaryotic translation initiation factor 3, subunit H | 1.25 | 3.30E-18 | Translation initiation | [8] | G
rs7014346 | 8q24 | POU5F1P1 | POU class 5 homeobox 1B | 1.19 | 8.60E-26 | Weak transcriptional activator | [6] | G
rs7136702 | 12q13.13 | ATF1 | Activating transcription factor 1 | 1.06 | 4.02E-08 | Transcription | [9] | M
rs6017342 | 20q13.12 | HNF4A | Transcription factor hepatocyte nuclear factor 4α | 1.11 | 3.20E-17 | Transcription | [33] | G
Gene desert and others
rs4574118 | 2q12 | PLGLA | Plasminogen-like A, non-coding RNA | - | 1.80E-07 | - | [11] | G
rs10936599 | 3q26.2 | MYNN | Myoneurin gene | 1.08 | 3.39E-08 | Unknown | [9] | M
rs4140904 | 4p15.3 | NCAPG | Non-SMC condensin I complex, subunit G | - | 1.40E-07 | - | [11] | G
rs7758229 | 6q26-27 | SLC22A3 | Organic cation transporter | 1.28 | 7.92E-09 | Transport of cationic drugs, toxins, and endogenous metabolism | [19] | G
rs2209907 | 9q21.3 | TLE4 | Transducin-like enhancer of split 4 | - | 3.40E-08 | - | [11] | G
rs2423279 | 20p12.3 | PLCB1 | Phospholipase C-beta 1 | 1.1 | 6.64E-09 | Unknown | [29] | G + M
Figure 1 Major pathways with significant genetic variants implicated in the development of colorectal cancer. Several pathways and related genes involved in the progression of colorectal cancer are illustrated. Genes with significant single nucleotide polymorphisms that are associated with colorectal cancer risk are represented with gray color. TGF-β: Transforming growth factor-β; BMP: Bone morphogenetic protein; CIMP: CpG island methylator phenotype; MSI: Microsatellite instability; CIN: Chromosomal instability.
Genetic variants in signal transduction pathways

CRC GWASs have identified significant variants in signal transduction pathways such as those mediated by WNT/β-catenin, transforming growth factor (TGF)-β/bone morphogenetic protein (BMP), and mitogen-activated protein kinase (MAPK). Somatic mutations in the WNT/β-catenin signaling pathway were discovered in more than 95% of CRC patient tissues[16], suggesting that abnormalities of genes in this pathway may play an important role in colorectal carcinogenesis. The risk allele of rs59336, located in an intron of the TBX3 gene, a downstream target of the WNT/β-catenin pathway, has been associated with a significantly higher risk of developing CRC[10]. Changes in β-catenin and SMAD7 expression can influence WNT/β-catenin pathway signaling[17]. Moreover, perturbation of SMAD7 expression has been documented to affect CRC progression[18]. Three genetic variants of SMAD7 on chromosome 18q21 - rs4939827, rs12953717 and rs4464148 - confer an increased CRC risk[5]. These findings and other WNT/β-catenin variants were further independently identified and validated[6,19]. BMPs are closely related to signal transduction mediated by TGF-β. Two independent GWASs[9,20] identified 14 CRC risk loci, of which three were adjacent to genes involved in BMP-mediated signal transduction, including rs4444235 near BMP4, rs961253 near BMP2, and rs4779584 near the DAN family BMP antagonist GREM1. BMP-related variants were further confirmed in another independent CRC population[21]. The MAPK-mediated signaling pathway is known to be crucial for several cellular mechanisms such as cell proliferation, survival, and resistance to apoptosis. A GWAS of German familial CRC patients[11] observed that CRC risk increases significantly with an increase in the number of risk alleles in seven genes involved in MAPK signaling. The molecular basis of these observed associations remains undetermined.
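The idea that risk rises with the number of risk alleles carried can be illustrated with a small sketch. The per-allele ORs below are the Table 1 values for three of the SNPs named in this section; combining them multiplicatively (a log-additive model with independent SNPs) is an assumption made here for illustration only and is not the method of the cited studies. The genotype and function names are hypothetical.

```python
import math

# Per-allele odds ratios as listed in Table 1.
# ASSUMPTION: independent SNPs with multiplicative (log-additive) effects.
PER_ALLELE_OR = {"rs4444235": 1.09, "rs961253": 1.12, "rs4939827": 1.20}

def risk_allele_count(genotype):
    """Total number of risk alleles carried (0, 1, or 2 per SNP)."""
    return sum(genotype[snp] for snp in PER_ALLELE_OR)

def relative_odds(genotype):
    """Odds relative to a person carrying no risk alleles at these SNPs."""
    log_odds = sum(
        genotype[snp] * math.log(per_allele_or)
        for snp, per_allele_or in PER_ALLELE_OR.items()
    )
    return math.exp(log_odds)

# Hypothetical individual: risk-allele copies carried at each SNP.
person = {"rs4444235": 2, "rs961253": 1, "rs4939827": 0}
count = risk_allele_count(person)
odds = relative_odds(person)
print(f"{count} risk alleles, relative odds = {odds:.2f}")
```

Under these assumptions, even several risk alleles together change the odds only moderately, which is consistent with the small individual effect sizes reported for these loci.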

Genetic variants related to genome instability

Genome instability is known to be both a contributor to and a consequence of colorectal carcinogenesis. There are several major genomic instability-related mechanisms in colorectal carcinogenesis, such as chromosomal instability, microsatellite instability, and the CpG island methylator phenotype[22]. Several loci involving these mechanisms were identified recently by GWASs. For example, Peters et al[10] identified rs11903757, a significant SNP in an intergenic locus on chromosome 2q32.3, proximal to NABP1, which encodes human single-stranded DNA binding protein 2 and plays a role in a diverse array of cellular processes such as DNA replication, recombination, transcription, and maintenance of genomic stability[23-25]. Another variant, rs1321311, is in linkage disequilibrium with a region that encompasses the CDKN1A gene[26], which encodes the p21 protein that mediates p53-dependent growth arrest and affects multiple tumor suppressor pathways. The p21 protein also interferes with the activity of proliferating cell nuclear antigen (PCNA)-dependent DNA polymerase, thereby regulating DNA replication and repair. It has been demonstrated that down-regulation of p21 inversely correlates with microsatellite instability status[27,28]. Two additional CRC risk variants - rs3824999 and rs647161 - could also potentially interact with p21[26,29]. The former, rs3824999, is located in an intron of the POLD3 gene, which encodes a component of the DNA polymerase-δ complex of PCNA; the latter, rs647161, lies in the putative tumor suppressor gene PITX1 (paired-like homeodomain 1), which has been reported to encode a protein that activates p53 and maintains genome stability[30,31].

Genetic variants related to cell cycle control

Genetic pathways mediating cell-cycle control are commonly implicated in colorectal carcinogenesis. Polymorphisms of several cell cycle-related genes have been reported to be associated with CRC risk in recent GWASs, including two independent SNPs (rs3217810 and rs3217901) located in the introns of CCND2. Jia et al[29] identified another SNP, rs10774214, located in 12p13.32, proximal to CCND2, in Asian populations. CCND2 encodes cyclin D2, a member of the D-type cyclin family, which plays a critical role in cell cycle control, specifically at the G1/S boundary, by activating cyclin-dependent kinases (CDKs), primarily CDK4 and CDK6[32]. Two significant SNPs, rs7136702 and rs11169552, lie about 275 kb apart within a large, poorly defined haplotype block covering the DIP2 gene, which encodes a protein with a putative role in epithelial cell fate determination[9]. Another SNP, rs10911251, is proximal to the promoter of the gene encoding laminin gamma 1 (LAMC1) and confers a significantly increased CRC risk by virtue of influencing LAMC1 gene expression[10]. SNPs in two additional laminin genes (laminin beta 1 in 7q31 and laminin alpha 5 in 20q13) were also identified in recent CRC GWASs[9,21,33]. Laminins are known to be involved in a variety of cellular mechanisms such as regulation of cell adhesion, differentiation, and migration[34,35]. Another important cell cycle-related SNP was reported by Dunlop et al[26] using five GWAS datasets. This SNP, rs5934683, is on chromosome Xp22.2 and proximal to the gene encoding shroom family member 2 (SHROOM2), a human homolog of the Xenopus laevis APX gene that is known to have broad functions in cell morphogenesis during endothelial and epithelial tissue development[36]. Missense mutations in this gene have been detected in a large-scale screening for recurrent mutations in cancer cell lines[37]. The relationship between Xp22.2 and CRC risk represents the first evidence for the role of X-chromosome variation in the predisposition to a non-sex-specific cancer.

Genetic variants related to gene expression and regulation

Thousands of transcription factors, cofactors, and chromatin regulators establish gene expression patterns and maintain specific cell stages in humans. Barrett et al[33] identified a significant association between CRC risk and SNP rs6017342, which maps to a recombination hot spot on chromosome 20q13 containing the 3’-untranslated region of the HNF4A gene. HNF4A encodes the transcription factor hepatocyte nuclear factor 4α, which regulates the expression of multiple organ development-related genes. In addition, HNF4A has also been shown to interact with β-catenin to regulate cell-cell adhesion and gene transcription[38]. Another significant SNP, rs11169552, is close to activating transcription factor 1 (ATF1)[9], which belongs to the ATF subfamily and the basic-region leucine zipper family. The protein product of ATF1 influences cellular processes by regulating the expression of many downstream target genes involved in cellular growth and survival. Previous studies have demonstrated that ATF1 protein interacts with EWSR1 to form a unique chimeric fusion protein complex which is important in the development of clear cell sarcoma[39,40]; however, its role in colorectal carcinogenesis remains to be established. Moreover, ATF1 may also form a cyclin-dependent kinase 3-mediated activating transcription factor 1 complex that is critical in cellular proliferation and malignant transformation[41].

Genetic variants in the gene desert regions and others

Although the majority of CRC risk variants are related to well-established biological pathways, the functions of some reported loci remain elusive. Various independent studies have reported that multiple SNPs in chromosome region 8q24 are associated with altered risk of several solid tumor malignancies, including CRC. Three SNPs in this region, namely rs7014346, rs6983267, and rs7837328, have been significantly associated with CRC risk in recent GWASs[4,6,19,42]. In addition, variants in the 8q24 region have also been associated with cancers of the breast, prostate, ovary, bladder, pancreas, and brain[6,42-49]. Nonetheless, the majority of the significant SNPs identified in this region are not located in, or close to, any well-annotated genes because the 8q24 region is largely a gene desert. Therefore, details of the molecular mechanisms underlying the observed effect of these SNPs remain largely unknown. It has been speculated that these SNPs may function through their long-range linkage with causal variants within other oncogenes or tumor suppressor genes. Others have conjectured that some SNPs may influence gene expression through long-range cis-regulatory elements. Wasserman et al[50] used an in vivo bacterial artificial chromosome enhancer-trapping strategy to scan the 8q24 gene desert region and found that a highly significant CRC risk variant, rs6983267, resides within an in vivo prostate enhancer whose expression mimics that of the nearby MYC oncogene[51]. Another study identified a gene encoding a novel non-coding RNA, CCAT2, that also maps to the 8q24 gene desert region. Encompassing the rs6983267 SNP, this long non-coding RNA transcript is highly overexpressed in microsatellite-stable CRC, promoting tumor growth, metastasis, and chromosomal instability[52].
Another 8q24 locus, rs7014346, in high linkage disequilibrium with rs6983267, resides within 3 kb upstream of POU5F1P1, a pseudogene of POU5F1 that encodes an important stem cell-related protein regulating cellular pluripotency and self-renewal[53]. However, no functional implication of this SNP has been reported, and it remains to be assessed whether it influences the development of CRC stem cells, a suspected small portion of cancer cells that are responsible for tumor progression and drug resistance[54]. In all, the identification of a large number of bona fide risk variants in gene desert regions indicates that candidate-gene and pathway-based strategies may not be adequate to capture and understand the complete spectrum of common risk variants of CRC. Unbiased genome-wide interrogation in adequately powered studies, combined with meta-analysis and functional characterization, is more likely to help us understand how common genetic variations play a role in colorectal carcinogenesis.


There have been reports of genetic variants associated with the clinical outcome of CRC patients, which can be used to categorize patients with different survival patterns or responses to specific treatments. However, the majority of reported outcome-related SNPs come from candidate gene or pathway-based studies. As yet, no GWAS has been reported to examine a direct relationship between genetic variations and CRC clinical outcome. The findings of some recently published studies are summarized in Table 2.

Table 2 Common genetic variants associated with colorectal cancer clinical outcome.
Genes/loci | SNP | Patient population | Clinical outcome | HR (95%CI) | P value | Ref.
MTHFR | glu429ala | Mixed colorectal cancer (CRC) patients | OS | 1.71 (1.18-2.49) | 0.005 | [64]
ESR2 | rs2987983 | Postmenopausal women with CRC | OS | 0.77 (0.60-0.99) | 0.002 | [65]
SCN1A | rs3812718 | Stage II/III patients with adjuvant 5-fluorouracil (5-FU) based chemotherapy | TTR | 2.26 (0.89-5.70) | 0.039 | [66]
SMAD7 | rs4939827 | Mixed CRC patients | OS | 1.16 (1.06-1.27) | 0.002 | [55]
mir608 | rs4919510 | Stage III patients with 5-FU based chemotherapy | RE | 1.65 (1.13-2.41) | 0.01 | [67]
 | rs4919510 | | OS | 1.96 (1.19-3.21) | 0.008 |
15q13.3 | rs10318 | Stage II patients with 5-FU based adjuvant chemotherapy | ER | 2.98 (1.27-6.99) | 0.012 | [56]
11q23.1 | rs10749971 | Stage III patients with 5-FU based adjuvant chemotherapy | ER | 0.46 (0.27-0.80) | 0.006 |
20p12.3 | rs961253 | | ER | 0.46 (0.22-0.96) | 0.038 |
 | | | OS | 0.24 (0.09-0.68) | 0.007 |
20p12.3 | rs355527 | | ER | 0.48 (0.23-0.99) | 0.048 |
 | | | OS | 0.29 (0.10-0.81) | 0.019 |
18q21.1 | rs4464148 | | OS | 4.34 (1.46-12.89) | 0.008 |
8q24.21 | rs6983267 | | OS | 4.20 (1.13-15.64) | 0.032 |
 | rs10505477 | | OS | 4.20 (1.13-15.64) | 0.032 |
15q13 | rs4779584 | Chinese CRC patients | OS | 0.33 (0.15-0.72) | 0.007 | [57]
10p14 | rs10795668 | | RE | 0.55 (0.30-1.00) | 0.05 |
pre-mi-423 | rs6505162 | Mixed CRC patients | OS | 2.12 (1.34-3.34) | 0.001 | [68]
 | rs6505162 | | RFS | 1.59 (1.08-2.36) | 0.019 |
pre-mi-608 | rs4919510 | | RFS | 0.61 (0.41-0.92) | 0.017 |
CLOCK | rs3749474 | Resected CRC patients | OS | 0.55 (0.37-0.81) | 0.003 | [69]
 | rs1801260 | | OS | 0.31 (0.11-0.88) | 0.03 |
SCD | rs7849 | Stage II patients with 5-FU based adjuvant chemotherapy | RE | 2.89 (1.54-5.41) | 0.001 | [70]
VEGF | -2578 | Stage II | TTR | 2.01 (1.13-3.56) | 0.02 | [71]
 | -460 | | TTR | 0.50 (0.29-0.89) | 0.02 |
KDR | rs10013228 | Resected CRC patients | RE | 0.53 (0.30-0.95) | 0.032 | [72]
CD44 | rs8193 | Stage III and high risk stage II patients with 5-FU based chemotherapy | TTR | 0.51 (0.35-0.93) | 0.022 | [73]
ALCAM | rs1157 | | TTR | 0.56 (0.33-0.94) | 0.024 |
LGR5 | rs17109924 | | TTR | 0.33 (0.12-0.90) | 0.023 |

Three recent studies have examined the relationship between GWAS-identified CRC risk variants and the clinical outcome of the disease[55-57]. Based on the data from five GWAS populations of 2611 CRC patients, Phipps et al[55] assessed 16 SNPs and found rs4939827, a SNP in the SMAD7 gene, to be significantly associated with reduced overall survival (HR = 1.16, P = 0.002) and disease-specific survival (HR = 1.17, P = 0.005). Dai et al[56] used a Caucasian population of 285 stage II or III CRC patients receiving fluorouracil-based chemotherapy to evaluate 26 CRC risk variants derived from 10 GWAS-identified chromosome loci. Although no SNP was found to be associated with the survival of all patients, they found that different SNPs might be associated with the clinical outcome of patients in specific stages. In another study of 380 Chinese CRC patients, Xing et al[57] reported that two GWAS-identified CRC risk variants - rs4779584 on chromosome 15q13 and rs10795668 on 10p14 - were associated with reduced risk of both death and recurrence. Moreover, stratified analysis indicated that the beneficial effect of chemotherapy in this patient cohort was evident only in patients with the variant rs10795668, but not in those with the wild-type genotype of this SNP. This indicates that rs10795668 may potentially be useful in selecting patients for chemotherapy. Taken together, these findings suggest that genetic variants associated with CRC risk may also predict the clinical outcome of CRC patients. However, these studies are limited by small sample sizes and heterogeneous patient populations and treatments. Therefore, their findings need to be interpreted with caution and warrant further validation.
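As a rough aid to reading the hazard ratios (HRs) quoted above, the sketch below shows how a reported HR and its 95% confidence interval can be converted into an approximate Wald z statistic and two-sided P value. The inputs are the rounded SMAD7 rs4939827 values from Table 2; the function name is hypothetical, and because the inputs are rounded, the result only approximates the published P value.

```python
import math

def wald_p_from_ci(hr, lower, upper):
    """Approximate Wald z and two-sided P from an HR and its 95% CI.

    The standard error of log(HR) is recovered from the CI width,
    assuming the interval is symmetric on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# rs4939827 and overall survival, as listed in Table 2 (rounded values).
z, p = wald_p_from_ci(hr=1.16, lower=1.06, upper=1.27)
print(f"z = {z:.2f}, approximate two-sided P = {p:.4f}")
```

The same check applied to a wide interval from a small cohort gives a much larger standard error, which is one way to see why the smaller outcome studies need independent validation.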

In addition to those GWAS-identified CRC risk loci, other epidemiological studies have also identified genetic variations associated with clinical outcome of CRC (Table 2). All of these studies are based on candidate gene or pathway-based approaches instead of GWAS. This is largely because compared to case-control studies, clinical outcome studies are generally based on cancer patients with highly heterogeneous characteristics and treatments that confound the very modest effect of genetic variants on patient outcomes. This issue could be partly resolved by the use of clinical trial patients that have more homogeneous characteristics and treatments, or consortium studies with much larger number of patients.


Findings from the first wave of GWASs seem to promise greater understanding of the genetic component of CRC pathogenesis on a molecular level. However, there are several major limitations in current GWAS approaches which may also pose challenges for future studies. First, the vast majority of currently identified SNPs lack known functional significance. Thus, whether they are causal variants or just surrogates that are in linkage disequilibrium with the functional loci remains largely unknown. Therefore, a major task ahead is to conduct fine-mapping in the immediate regions surrounding these loci, and narrow down the regions of association to pinpoint the causal variants[58]. Second, although the statistical significance of most GWAS-identified SNPs is high, the utility of these bona fide variants in a clinical setting to predict the risk of developing cancer remains to be assessed. This is largely due to the modest effect size associated with most of the specific individual variants. Wacholder et al[59] reported a very modest increase in the power to predict breast cancer risk by adding 10 highly significant GWAS-identified breast cancer risk variants to the commonly recognized self-reported risk factors. Moreover, they found that the level of predicted breast cancer risk among most women barely changes with the addition of the GWAS-generated genetic information. Similarly, Park et al[60] reported that the combined use of all current genetic information derived from GWASs has only modest discriminative power (an area under the curve of about 63.5%) in breast, prostate, and colorectal cancers. Therefore, further identification of additional low-penetrance common variants, especially the causal variants, is necessary to improve the clinical utility of GWAS-generated genetic information. Third, it is estimated that the currently validated SNPs in aggregate still explain only a small fraction of total heritability in most complex-trait diseases[61].
Several possible reasons may further account for this “missing heritability”. These include unidentified common variants, the unexplored territory of rare genetic variants that have high penetrance but low prevalence, and the largely unassessed gene-gene and gene-environment interactions[21,62]. All of these issues could be partially addressed by increasing population size and thus statistical power. Meta-analysis and combined analysis of multiple study populations are therefore effective means to tackle these issues in the near future. In addition, using novel technologies such as next-generation sequencing to identify rare causal variants may also help address the missing heritability.
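The gain in statistical power from pooling study populations can be sketched with a fixed-effect, inverse-variance meta-analysis of per-study ORs. This is one common pooling approach, shown here as an assumption rather than as the method used in the cited studies; the three study estimates and the function name are hypothetical.

```python
import math

def fixed_effect_meta(estimates):
    """Pool (OR, lower95, upper95) tuples by inverse-variance weighting.

    Each study's standard error of log(OR) is recovered from its 95% CI.
    Returns the pooled (OR, lower95, upper95).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for odds_ratio, lower, upper in estimates:
        se = (math.log(upper) - math.log(lower)) / (2 * 1.96)
        weight = 1 / se ** 2
        total_weight += weight
        weighted_sum += weight * math.log(odds_ratio)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - 1.96 * pooled_se),
        math.exp(pooled_log + 1.96 * pooled_se),
    )

# Hypothetical studies: each interval includes 1 when taken alone.
studies = [(1.10, 0.98, 1.23), (1.12, 0.99, 1.27), (1.08, 0.97, 1.20)]
pooled_or, pooled_lo, pooled_hi = fixed_effect_meta(studies)
print(f"Pooled OR = {pooled_or:.2f} (95% CI {pooled_lo:.2f}-{pooled_hi:.2f})")
```

In this hypothetical case none of the individual studies excludes an OR of 1, but the pooled interval does, which illustrates why combined analyses are well suited to variants with modest effects. A fixed-effect model assumes a common underlying effect across studies; heterogeneous populations would call for a different model.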

Despite these limitations, the identification of a host of specific genetic variants associated with elevated CRC risk through the GWAS approach does suggest the possibility of tailoring colorectal screening strategies, such as the age at first colonoscopy and the interval between surveillance colonoscopies. A better understanding of the mechanisms by which these genetic variants alter CRC risk could reduce morbidity and mortality in higher-risk subgroups through more aggressive surveillance, and could reduce cost in low-risk groups that require less intensive testing.


P- Reviewers: Camacho J, Tong WD, Yu B S- Editor: Gou SX L- Editor: A E- Editor: Ma S

