INTRODUCTION
Chronic idiopathic inflammatory bowel diseases (IBD), made up predominantly of Crohn’s disease (CD) and ulcerative colitis, remain fertile ground for both clinical and basic science research. These conditions have been held up as classic examples of complex polygenic disorders, with the working hypothesis that a genetically susceptible host manifests the disease phenotype after exposure to one or a series of environmental triggers. Study of the genetics of IBD has expanded rapidly over the past 10 years since publication of the first genome-wide scan in 1996[1]. Technical advances have aided this progress, culminating in the discovery of the NOD2/CARD15 gene within IBD1 on chromosome 16[2,3]. The case for further identification of IBD susceptibility genes has been put forward, specifically regarding the organic cation transporters, OCTN1 and OCTN2, on chromosome 5 (IBD5) and DLG5 on chromosome 10[4,5]. These genes have not received universal support from independent studies and remain topics for discussion and research, specifically including further functional studies.
These rapid advances in IBD genetics, fuelled by the interest and enthusiasm of both clinicians and geneticists, have set the pace of IBD research over the past 8-10 years, succeeding the “IBD immunology” era of the 1990s. While the latter gave us a plethora of cytokine data, including the important role of TNF-α leading to anti-TNF strategies in CD, the discovery of NOD2 highlighted the significance of the gut’s innate immune system including pathogen-associated molecular patterns (PAMPs)[6]. This progress has been particularly challenging for clinician researchers as they are expected to translate these critical discoveries into meaningful changes and advances in clinical practice, including assessment and treatment of IBD patients. In this review, we will assess the clinical interpretations of major breakthroughs in CD genetics focusing on the NOD2/CARD15 gene, including both the strengths and limitations of such studies and an analysis of significant differences in phenotypes and genotypes, and we will offer suggestions for future studies[7,8].
CLASSIFICATION OF PHENOTYPE
Clinical classifications of CD seem to be used synonymously with terms for phenotypes, and a number of papers have described “genotype-phenotype correlations”. However, genotypes and phenotypes are derived in fundamentally different ways. While genotype is based upon a limited number of options that can be determined mechanically and thus objectively, derivation of phenotype is subjective and influenced by potentially multiple variables, with many yet to be identified. A number of variables may not be included in multivariate analysis when genotype-phenotype analyses are performed.
The clinical classification systems that have been put forward have been purposefully kept simple. This may have been done to encourage individuals and groups to adopt them and hence maximize opportunities to compare and combine datasets[9-11]. However, given this intention, it does not appear to have been successful for genotype-phenotype studies in CD. The updated Vienna classification for Crohn’s disease, now known as the Montreal classification, has maintained a simple approach including age at diagnosis, location, and behavior as the major parameters, with perianal disease as a subgroup of behavior[11]. It has appropriately split age at diagnosis and perianal penetrating from internal penetrating, but has not clarified the issue of stricturing disease working independently of penetrating disease[12]. There is no mention of the need to demonstrate the presence of transmural inflammation, assuming that this is a prerequisite for the diagnosis of CD. This is relevant to subsequent study of those patients with inflammatory CD who do not evolve into a stricturing and/or penetrating pattern. Do these patients fit into the definition of CD, given that they do not appear to show any association with NOD2/CARD15 after correction for disease duration?[13] Other variables that may well influence natural history, including smoking, prior appendectomy, and body mass index, have thus far not been included in this classification[14-16]. The latter may influence development of penetrating disease, the level of inflammation and hence CRP and the risk of metabolic bone disease[17,18].
Clinical classification systems for ulcerative colitis are few. There was no such system included in the Vienna initiative. The Montreal paper presented a simple classification based upon distribution and activity, but both of these parameters are subjective, particularly when the former is based upon macroscopic and not microscopic observations. There is again no inclusion of smoking or prior appendectomy despite strong evidence supporting a role for appendectomy in (re)shaping the natural history of ulcerative colitis.
No classification system or phenotypic approach has attempted to embrace the microscopic pattern of inflammation, damage and repair seen in both CD and ulcerative colitis. Limited work has been done on granulomata and their association with both clinical outcomes and with some genetic polymorphisms[19-21]. This is in contrast to several other fields of research, including liver disease, where histological appearances are used as important aids in determining the risk of fibrosis, and colorectal cancer, in which significant heterogeneity has been identified among cancers based both upon histological appearances and molecular markers[22,23].
STRENGTHS AND LIMITATIONS OF STUDIES TO DATE
Disease susceptibility and phenotype-genotype analyses have arrived in two “waves” since the seminal publications on NOD2/CARD15 and CD in 2001. The first wave of papers identified the key CD-causing variants in this gene and went on to establish an association with ileal location[12,24-29]. Some studies in this group supported a stronger association with stricturing disease behavior and not ileal location, after multivariate analysis[12,24,29]. Two studies carried out haplotype analysis[27,28]. The second wave included studies from smaller centers, predominantly across Europe, focusing on phenotype-genotype correlations[30,40]. There have been a limited number of studies that have investigated response to therapy[41,42].
The limitations of these studies center around two points: case-control ascertainment and phenotypic classification. Thus far, all of these studies have been sourced from specialist-based cohorts and often from large, tertiary referral centers[12,24-29]. Ascertainment of cases has also been influenced by earlier genetic studies in which the primary aim was “gene discovery”, and hence CD multiplex families have been over-represented[12,24]. Many of the studies combine the resources of multiple sites, which has contributed to the collection of these multiplex families[12,24,25]. The method of ascertainment of controls is not always clear, and in the majority of cases they do not represent true population controls. Specifically, some studies have excluded controls with a family history of IBD, thus biasing the population even further.
Since the majority of studies lack power to carry out detailed phenotype-genotype analyses, it is tempting to combine datasets in a meta-analysis. However, this has been hampered by a plethora of different approaches to clinical classification of the case population. Disease location is variously described as “any ileal” and “any colonic”, or “ileal and right colon” with “left colon and rectum”, without prior validation of these systems[24,26,28]. Similarly, disease behavior, which clearly evolves with time, was modified from the Vienna classification in several studies, again making it difficult to combine datasets. Disease duration has often been omitted. There have been inconsistencies in defining disease location, including the inclusion or exclusion of perianal abscess, combining ileal and ileocolonic locations without giving raw data on these locations individually and providing data on very specific subgroups, such as UC-like CD, where the number in this group (n = 56) is greater than the “colon-only” location as a whole (n = 29). More recent studies have concentrated on using the Vienna classification despite its known limitations, but have in some cases tried to address these limitations by carrying out supplemental analyses including investigation of an association between NOD2/CARD15 and stricturing disease independent of penetrating disease and internal penetrating disease independent of perianal disease.
In summary, studies thus far have provided us with some important information on the strength of the NOD2/CARD15 association with CD as a susceptibility gene but these have not been based on population-based cohorts. Population-based studies are awaited and may be best coordinated through an international effort using agreed upon methods of case-control ascertainment, genotyping and clinical classification. Associations with ileal location have provided us with important clues to disease pathogenesis, including the role of Paneth cells and defensins. However, extensive further work is required using much larger datasets that include other key variables, such as treatment received, to determine the mechanisms by which NOD2/CARD15 variants may increase the risk of stricturing disease, and whether these variants are also implicated in the development of internal penetrating disease.
RESULTS OF STUDIES ON NOD2/CARD15 TO DATE
Multiple studies have now investigated the contribution of variants in the LRR domain of the NOD2/CARD15 gene to development of CD. These have been summarized in a recent meta-analysis. Individuals carrying only one high-risk allele had a 2.39-fold (95% CI: 2.00-2.86) increased risk of the disease, while those with two or more high-risk alleles carried a 17.1-fold (95% CI: 10.7-27.2) increased risk of CD compared to individuals without any high-risk alleles. The greatest relative risk was identified for the SNP13 variant (OR 3.76, 95% CI: 3.22-4.38 for one variant), but significant heterogeneity existed among studies (P = 0.01)[40].
For the purposes of the rest of this paper, we have selected 15 studies that each provide both genotype and (some) phenotype data on a minimum dataset of 200 CD cases[12,24-36,38]. We will discuss the evidence for associations between NOD2/CARD15 and key clinical variables including age at diagnosis, disease location and behavior, need for surgery and the presence of granulomata. We have carried out further analyses on these studies as a combined dataset and as individual series, where sufficient clinical and genetic data are available in the original publication. We focus specifically on differences in phenotype and genotype among studies, and have carried out meta-analysis for NOD2 associations with disease location and behavior.
DISEASE LOCATION
Of the 15 studies selected, 13 provide clearly interpretable data on disease location. Of these, 10/13 support a significant association with ileal disease (L1)[12,25-27,31-34,36,38], while some also demonstrate an association with absence of colonic location (L2)[24]. As indicated above, some studies have combined ileal (L1) with ileocolonic (L3) patients to investigate this association. This may be due to a lack of power in subgroup analysis. Clinically, this may be inappropriate. There are limited data on the natural history of ileocolonic disease compared with ileal disease but associations have been made between L3 and younger age at diagnosis and an increased risk of surgical recurrence compared to other disease locations[43,44]. In addition, some patients with L3 may have their major disease burden in the colon, thus weakening the association with NOD2/CARD15 compared to pure ileal disease or L1. Once again, the interpretation of results depends heavily on the clarity of describing phenotypes, and in this case how the presence or absence of ileal disease was determined, whether by endoscopy, histology, radiology or a combination of these.
DISEASE BEHAVIOUR
Of 13 studies that provided data for the variable disease behaviour, 7 showed a significant association with stricturing behaviour[12,24,29,30,34,36,38]. Two studies show significant positive associations with both stricturing and penetrating disease behaviour[12,34], but one of these excluded perianal penetrating disease from the analysis. If included, this association with NOD2 variants is lost[12]. One study described a negative association with penetrating behaviour[26]. Penetrating disease is defined according to the Vienna classification in 8/13 studies and in another study it is defined as “internal penetrating” behaviour, separating it from perianal penetrating disease[12]. The rest of the studies describe their own “in house” definition of penetrating disease. Given the overlap between these forms of disease behaviour and the length of time some patients may take to develop “complex” CD, supplementary analyses seem necessary to clarify these observations. Specifically, stricturing CD will often occur in combination with penetrating disease, both being hallmarks of transmural inflammation. One may therefore predict that NOD2/CARD15 should be associated with both complications. If this is not the case, it may relate to a lack of association of NOD2/CARD15 with perianal penetrating disease that influences the relationship with penetrating disease overall, as suggested by Brant et al[12]. Alternatively, NOD2/CARD15 variants may promote the development of fibrosis over fissuring ulceration. These questions clearly need to be addressed in future studies.
AGE AT DIAGNOSIS
Five of thirteen studies that looked for an association between NOD2/CARD15 and age of diagnosis found a significant association with early onset of the disease[12,24,26,32,38]. Of these, 4 found association only for NOD2/CARD15 homozygotes and compound heterozygotes and 1 found association with these groups as well as with the frameshift variant alone. Importantly, these findings remained significant after multivariate analysis when important confounders such as disease location and stricturing behaviour were included. These data support the results of previous linkage studies that demonstrated an association between IBD1 and earlier age at diagnosis[45]. They also support the concept of pediatric CD being a “more genetic” disease, consistent with other polygenic disease models.
OTHER CLINICAL ASSOCIATIONS
Eleven studies provided data on surgery for CD and 8 of these provided NOD2/CARD15 genotype data on the “surgical” cohort. Two groups have identified an association between NOD2/CARD15 and increased risk of surgery. Lakatos et al demonstrated this to have an independent effect on need for surgery after logistic regression (OR 1.71, 95% CI: 1.13-2.62, P = 0.01) together with presence of stricturing disease behaviour. This cohort had the largest complete dataset of 527 patients and provided the most comprehensive analysis of NOD2/CARD15 genotype-phenotype relation[38]. The second study showing this association also carried out similar statistical analysis, finding a higher risk for surgery with those carrying two variant alleles (OR 17.8, 95% CI: 4.9-64.3)[36].
FURTHER ANALYSIS OF CLINICAL CHARACTERISTICS AND GENETIC ASSOCIATIONS
Statistical methods
Differences in proportions among studies were tested using a simple $\chi^2$ test, with the statistic obtained as the sum over all cells of $(\text{Observed} - \text{Expected})^2 / \text{Expected}$, with $(R-1)(C-1)$ degrees of freedom, where R is the number of rows (studies) and C is the number of columns (locations, genotypes). Higher contributions to the $\chi^2$ from individual studies suggested a higher degree of variation in the distribution of reported location.
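As an illustration only, the test described above can be sketched in a few lines of Python. The function name and the counts are hypothetical and are not taken from any of the reviewed studies; the sketch returns the statistic, the degrees of freedom, and the contribution of each study (row) to the total, and leaves the P value to a statistical package.

```python
def chi_square_by_study(table):
    """Pearson chi-square for an R x C table of counts (rows = studies).

    Returns (statistic, degrees_of_freedom, per_study_contributions).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    contributions = []
    for row, row_total in zip(table, row_totals):
        contribution = 0.0
        for observed, col_total in zip(row, col_totals):
            # Expected count under the hypothesis of equal proportions
            expected = row_total * col_total / grand_total
            contribution += (observed - expected) ** 2 / expected
        contributions.append(contribution)

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return sum(contributions), dof, contributions


# Hypothetical counts (NOT from the reviewed studies):
# rows = studies, columns = ileal, colonic, ileocolonic.
counts = [
    [40, 30, 30],
    [45, 25, 30],
    [20, 50, 30],
]
statistic, dof, per_study = chi_square_by_study(counts)
print(statistic, dof, per_study)
```

In this made-up table the third study contributes most of the statistic, which is the pattern that would flag it as differing from the others in its reported distribution of location.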
Meta-analysis was performed by calculating the odds ratio and confidence interval from the frequency distribution of NOD2 variants and the variable of interest presented by each study included in the analysis. Fixed-effect analysis used an inverse variance method to calculate study weights, and the weighted average was taken as the pooled estimate. Heterogeneity among studies was assessed using Cochran’s Q (the weighted sum of squared differences between individual study effects and the pooled effect across studies). Analysis was performed using the Meta command in STATA 9[46].
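The pooling step can be sketched as follows. This is a minimal illustration and not the STATA routine itself. It assumes that each study is summarized as a 2 × 2 table and that the variance of each log odds ratio is estimated by Woolf’s formula (1/a + 1/b + 1/c + 1/d), a detail that is not stated above. The function name and the counts are hypothetical.

```python
import math


def fixed_effect_pooled_or(tables, z=1.96):
    """Inverse-variance fixed-effect pooling of odds ratios.

    Each table is (a, b, c, d):
      a = variant carriers with the feature (e.g. ileal disease)
      b = non-carriers with the feature
      c = variant carriers in the reference group
      d = non-carriers in the reference group
    All cells are assumed to be non-zero.
    """
    log_ors, weights = [], []
    for a, b, c, d in tables:
        log_ors.append(math.log((a * d) / (b * c)))
        variance = 1 / a + 1 / b + 1 / c + 1 / d  # Woolf's variance (assumed)
        weights.append(1 / variance)

    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, log_ors)) / total_weight
    se = math.sqrt(1 / total_weight)

    # Cochran's Q: weighted squared deviations from the pooled log OR
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, log_ors))

    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - z * se), math.exp(pooled + z * se)),
        "q": q,
        "df": len(tables) - 1,
    }


# Hypothetical 2 x 2 tables (NOT from the reviewed studies)
example = [(30, 40, 15, 50), (25, 35, 12, 45), (40, 60, 20, 70)]
result = fixed_effect_pooled_or(example)
print(result)
```

Under the hypothesis of homogeneity, Q is referred to a χ2 distribution with (number of studies − 1) degrees of freedom.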
Differences among studies
Ten studies gave clear data on numbers of patients with pure ileal CD and their NOD2 variant frequencies. The majority of studies had small numbers in this important subgroup of CD (mean, 85 patients, range 23-136) with an overall NOD2 variant frequency of 41.1% (23.4%-53.6%). Comparisons between studies just failed to show a significant difference (P = 0.06). However, exclusion of one study from Finland, which is known to have lower NOD2 variant frequencies compared to the majority of other European populations, demonstrated similar proportions in the majority of these studies (P = 0.53).
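The effect of excluding a single study, as with the Finnish series here, can be examined with a leave-one-out check. The sketch below is an illustration under stated assumptions: the study labels and the carrier and non-carrier counts are invented, and only the χ2 statistic is recomputed after each exclusion.

```python
def chi_square(table):
    """Pearson chi-square statistic for an R x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum(
        (obs - rt * ct / n) ** 2 / (rt * ct / n)
        for row, rt in zip(table, row_totals)
        for obs, ct in zip(row, col_totals)
    )


def leave_one_out(table, labels):
    """Recompute the statistic with each study removed in turn."""
    return {
        label: chi_square(table[:i] + table[i + 1:])
        for i, label in enumerate(labels)
    }


# Hypothetical (carriers, non-carriers) counts among ileal CD patients;
# study "D" is given a deliberately lower carrier frequency.
counts = [[40, 60], [42, 58], [38, 62], [20, 80]]
labels = ["A", "B", "C", "D"]

full = chi_square(counts)
loo = leave_one_out(counts, labels)
print(full, loo)
```

A marked fall in the statistic when one study is omitted suggests that this study accounts for most of the between-study difference.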
In contrast, a comparison of disease location among 12 studies that provided data on this variable as per the Vienna and Montreal systems demonstrated highly significant differences (P < 0.0001). Unlike the genotype data for NOD2, exclusion of any one study from this analysis did not influence heterogeneity. Proportions among studies did not appear to correlate with geographical region, but two groups emerged from the data: those with high rates of ileal CD (40%-50%) and those with lower rates of ileal disease (20%-30%). Does disease location for CD vary substantially among Caucasian populations? There are no population-based data to answer this question. Clearly, other factors may play a role, including rate of familial disease, smoking, type of center (medical or surgical bias), and methods of ascertainment. Interestingly, of the four studies with high rates of ileal disease, two had a high rate of familial CD (0.47, 0.86), one had a relatively low rate (0.11), and the fourth did not include data on familiality. Similarly, figures for surgery did not show consistency within this subgroup, ranging from 0.33 to 0.67, and the rate of stricturing disease was surprisingly low (0.17-0.23). Alternatively, are there major differences in clinical assessment of patients and a corresponding reduction in interobserver agreement among centers? Studies have recently indicated that this may be so for both CD behavior and location[47,48]. The NOD2 analysis given above indicates that, at least for NOD2 variant ileal CD, there are limited differences among populations. As indicated at the outset of this review, clinical characterization of CD, and IBD in general, remains a significant challenge.
META-ANALYSIS OF NOD2 ASSOCIATIONS WITH DISEASE LOCATION AND BEHAVIOUR
Location
These analyses confirmed the strong association between NOD2 variants and pure ileal CD compared to pure colonic disease, which was used as the reference group (Figure 1). Similar results were obtained when patients with ileocolonic disease were compared with the reference group (Figure 2). Both meta-analyses achieved significance (P < 0.0001), and no significant heterogeneity was demonstrated between studies.
Figure 1 Meta-analysis comparing odds of having a NOD2 variant (SNPs 8,12 and/or 13) among Ileal versus Colonic CD patients in 10 studies.
Pooled estimate = 2.50 (95% CI 1.97-3.25), P < 0.0001, test of heterogeneity among studies, P = 0.49.
Figure 2 Meta-analysis comparing odds of having a NOD2 variant among Ileocolonic versus Colonic CD patients in 10 studies.
Pooled estimate = 2.13 (95% CI 1.7-2.7), P < 0.0001, test of heterogeneity among studies, P = 0.927.
Behaviour
Nine studies provided adequate data for inclusion in these meta-analyses, of which four provided data on disease duration (range, 8.2-17.6 years). A strong association was confirmed between NOD2 variants and stricturing CD (Figure 3), with inflammatory disease (non-stricturing, non-penetrating) as the reference group. Significant heterogeneity was detected among studies (P = 0.02), attributable to one “outlier”. Importantly, and not surprisingly, penetrating disease, the other form of complex disease behavior, also showed a significant association with NOD2 variants (Figure 4). This association would likely be strengthened if data were available for internal penetrating disease independent of perianal penetrating disease for all studies.
Figure 3 Meta-analysis comparing odds of having a NOD2 variant among Stricturing versus Inflammatory CD patients in nine studies.
Pooled estimate = 2.06 (95% CI 1.42-2.98), P < 0.0001, test of heterogeneity among studies, P = 0.02.
Figure 4 Meta-analysis comparing odds of having a NOD2 variant among Penetrating versus Inflammatory CD patients in nine studies.
Pooled estimate = 1.47 (95% CI 1.19-1.82), P < 0.0001, test of heterogeneity among studies, P = 0.15.
CONCLUSION
The discovery of NOD2 as the first susceptibility gene for CD has resulted in a leap forward in our understanding of disease pathogenesis. It has also highlighted the heterogeneity of the disease. NOD2 represents the “low-lying fruit” of CD genetics; other susceptibility genes for this disease and for ulcerative colitis may not carry the same relative risk. The phenotype-genotype studies carried out to date have provided us with the critical observations that link NOD2 variants to ileal location, and this has highlighted the role of Paneth cells and antimicrobial peptides. However, a number of questions remain. All these studies have been drawn from specialist cohorts, leaving no accurate figure for NOD2 population attributable risk. There is significant heterogeneity in phenotype among studies, raising concerns with respect to clinical characterization of patients at different centers and thus the utility of current classification systems and interobserver agreement in this field. Our meta-analysis has demonstrated a highly significant association between penetrating CD and NOD2 variants, and has confirmed the associations with stricturing behaviour and with both pure ileal and ileocolonic locations. These observations highlight the need for further clinical and genetic characterization of phenotypes not associated with NOD2, including inflammatory CD, perianal (penetrating) disease, and pure colonic disease.
S- Editor Wang J L- Editor Lutze M E- Editor Bai SH