Kastenberg D, Bertiger G, Brogadir S. Bowel preparation quality scales for colonoscopy. World J Gastroenterol 2018; 24(26): 2833-2843 [PMID: 30018478 DOI: 10.3748/wjg.v24.i26.2833]
Corresponding Author of This Article
David Kastenberg, MD, FACP, FACG, AGAF, Professor, Division of Gastroenterology and Hepatology, Department of Medicine, Thomas Jefferson University, 132 South 10th Street, Suite 480, Philadelphia, PA 19107, United States. david.kastenberg@jefferson.edu
Research Domain of This Article
Gastroenterology & Hepatology
Article-Type of This Article
Minireviews
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
David Kastenberg, Division of Gastroenterology and Hepatology, Department of Medicine, Thomas Jefferson University, Philadelphia, PA 19107, United States
Gerald Bertiger, Hillmont GI, Flourtown, PA 19031, United States
Stuart Brogadir, Medical Affairs, Ferring Pharmaceuticals, Inc, Parsippany, NJ 07054, United States
Author contributions: Kastenberg D planned the first draft of the manuscript; all authors contributed to writing of the manuscript and approved the final version for submission.
Conflict-of-interest statement: David Kastenberg has received research support from and served as a consultant for Medtronic, and is on the advisory boards of Ferring Pharmaceuticals Inc. and Salix Pharmaceuticals. Gerald Bertiger has served as a consultant and has been a part of the speakers bureau for Ferring Pharmaceuticals Inc. Stuart Brogadir is an employee of Ferring Pharmaceuticals Inc. Editorial support was provided by The Curry Rockefeller Group, LLC, which was funded by Ferring Pharmaceuticals.
Correspondence to: David Kastenberg, MD, FACP, FACG, AGAF, Professor, Division of Gastroenterology and Hepatology, Department of Medicine, Thomas Jefferson University, 132 South 10th Street, Suite 480, Philadelphia, PA 19107, United States. david.kastenberg@jefferson.edu
Telephone: +1-215-9558900 Fax: +1-215-5032578
Received: March 13, 2018 Peer-review started: March 14, 2018 First decision: April 18, 2018 Revised: May 16, 2018 Accepted: May 26, 2018 Article in press: May 26, 2018 Published online: July 14, 2018 Processing time: 121 Days and 8 Hours
Abstract
Colorectal cancer (CRC) is the third most common cancer and second leading cause of cancer-related death in the United States. Colonoscopy is widely preferred for CRC screening and is the most commonly used method in the United States. Adequate bowel preparation is essential for successful colonoscopy CRC screening. However, up to one-quarter of colonoscopies are associated with inadequate bowel preparation, which may result in reduced polyp and adenoma detection rates, unsuccessful screens, and an increased likelihood of repeat procedure. In addition, standardized criteria and assessment scales for bowel preparation quality are lacking. While several bowel preparation quality scales are referred to in the literature, these differ greatly in grading methodology and categorization criteria. Published reliability and validity data are available for five bowel preparation quality assessment scales, which vary in several key attributes. However, clinicians and researchers continue to use a variety of bowel preparation quality measures, including nonvalidated scales, leading to potential confusion and difficulty when comparing quality results among clinicians and across clinical trials. Optimal clinical criteria for bowel preparation quality remain controversial. The use of validated bowel preparation quality scales with stringent but simple scoring criteria would help clarify clinical trial data as well as the performance of colonoscopy in clinical practice related to quality measurements.
Core tip: Adequate bowel preparation is essential for proper visualization of the colonic mucosa to optimize lesion detection for a successful colonoscopy. Clinicians and researchers continue to use a variety of bowel preparation quality measures, including de novo, nonvalidated scales in clinical studies, leading to potential confusion, and creating difficulty when comparing bowel preparation quality results across clinical trials. Based on data evaluating different bowel preparation quality scales in the literature, and published criteria that define the most desirable measures to be used in such grading scales, the Boston Bowel Preparation Scale is currently recommended as standard.
Colorectal cancer (CRC) is the third most common cancer, with an estimated risk of occurring in 1 of 18 persons during their lifetime, and is the second most common cause of cancer-related adult deaths in the United States[1,2]. Approximately 135000 new CRC cases and 50000 CRC deaths were projected to occur in 2017 in the United States[1]. For average risk individuals, the United States Preventive Services Task Force and other public health and professional medical bodies recommend CRC screening using colonoscopy, computerized tomography colonography, sigmoidoscopy, double-contrast barium enema, high-sensitivity guaiac or immunochemical fecal occult blood testing, or stool DNA testing (which is combined with immunochemical blood testing) beginning at the age of 50 years[2-4]. Colonoscopy is a preferred and the most widely used method for CRC screening in the United States[4-6], based on data showing this procedure is correlated with decreased CRC incidence and deaths, most likely through the detection and removal of premalignant polyps[7-10].
Adequate bowel preparation is essential to ensure sufficient visualization of the colonic mucosa and to optimize lesion detection for successful colonoscopy utilized for CRC screening[4,11]. However, study data indicate that up to one-quarter of colonoscopies may be conducted with inadequate bowel preparation[12,13], which is correlated with lower detection of polyps and adenomas vs adequate preparation (typically good/excellent quality)[12,14-16]. A meta-analysis of 27 studies found that inadequate bowel preparation for colonoscopy CRC screening reduced detection of small adenomas by 47% (OR = 0.53, CI: 0.46-0.62; P < 0.001) vs adequate preparation (excellent/good/fair); this relationship was weaker but still significant for advanced adenomas (OR = 0.74, CI: 0.62-0.87; P < 0.001)[17]. Other studies have reported overall adenoma miss rates of 42%-48% for initial colonoscopies with inadequate or low-quality bowel preparation, based on findings at repeat colonoscopies[13,18]. Inadequate bowel preparation for colonoscopy may also result in prolonged procedures, more frequent repeat colonoscopies (at shorter than recommended intervals) and related increased costs, lower cecal intubation rates, and higher risk of electrocautery[6,11,19-21]. Studies in various international populations have found that inadequate cleansing is a factor in approximately 20%-70% of incomplete colonoscopies[22-25]. Professional gastroenterology societies recommend that clinical practices aim for minimum adequate bowel preparation rates of 85%-90%, and that bowel preparation quality be documented at the time of the screening[6,26].
Currently, no standard criteria or definition exists for qualitative terms such as “adequate”, “inadequate”, “excellent”, “good”, “fair”, or “poor”; in some scales, adequate cleansing is defined as a composite of “good” and “excellent”[11,26]. Physician reporting on quality of bowel preparation, as well as overall colonoscopy quality, is highly inconsistent and often missing important elements, which may be attributable to lack of clear and consistent quality assessment standards[27]. Therefore, this review was conducted to summarize and discuss currently available bowel preparation quality scales and highlight the benefits of using a reliable and validated scale in both clinical practice and clinical trials of bowel preparation agents.
COMPONENTS OF A BOWEL PREPARATION QUALITY SCALE
Essential attributes of a dependable bowel preparation quality scale include reliability and validity[11]. Scale reliability involves the degree to which an instrument yields reproducible, or consistent, results for the same investigator (intrarater reliability) or among different investigators (interrater reliability), upon repeated testing[11,28]. Validity indicates how well the scale measures what it is designed to assess, which can be determined via several methods[29]. Validity may be assessed by comparison with results of other established and accepted scales used for the same purpose (i.e., bowel preparation quality) in the same test population, referred to as construct validity. Scale validity may also be assessed by correlation with other specific criteria measuring relevant clinical outcomes, in this case, overall colonoscopy quality; this is referred to as criterion-related validity or predictive validity[29,30].
A commonly used criterion for overall quality of CRC screening colonoscopy is the adenoma detection rate (ADR), defined as the proportion of all CRC screening colonoscopies performed by a physician that reveal at least one adenoma[6,31]. Studies have shown that colonoscopy ADR is strongly and inversely associated with interval CRC rates (CRC diagnosed between the time of screening colonoscopy and the scheduled time of surveillance colonoscopy, which was up to 10 years)[32,33], and that increasing ADRs are correlated with reduced CRC incidence and mortality[34]. Some data also indicate that the polyp detection rate (PDR), the proportion of patients with at least one polyp removed during screening colonoscopy, may be a useful parameter of colonoscopy quality, particularly since it appears to correlate well with ADR[6]. However, use of the PDR raises additional questions related to the precise definition of “polyp”. Other questions include whether the detection rates of sessile serrated polyps (SSPs), advanced adenomas, and multiple adenomas (as opposed to a “one and done” approach) should be used as key indicators of colonoscopy quality in addition to the ADR and PDR[6]. However, clinical data are insufficient for resolution of these issues, and no guidelines for correlation of bowel preparation quality with detection rates for SSPs, advanced adenomas, and multiple adenomas have yet been established[6]. Thus, ADR appears to be the best criterion currently available, as it is relatively easy to measure and has been shown to correlate with interval cancer rate.
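As an illustration of the ADR definition above, the following minimal Python sketch computes the proportion of screening colonoscopies in which at least one adenoma was found. The function name and the sample data are hypothetical and are not taken from any cited study.

```python
def adenoma_detection_rate(adenoma_counts):
    """Return the proportion of screening colonoscopies with >= 1 adenoma.

    adenoma_counts: one integer per screening colonoscopy, giving the
    number of adenomas found in that examination (hypothetical input format).
    """
    if not adenoma_counts:
        raise ValueError("at least one colonoscopy is required")
    with_adenoma = sum(1 for n in adenoma_counts if n >= 1)
    return with_adenoma / len(adenoma_counts)


# Hypothetical data: eight screening colonoscopies by one physician.
counts = [0, 2, 0, 1, 0, 0, 3, 0]
print(adenoma_detection_rate(counts))  # 3 of 8 examinations -> 0.375
```

Note that an examination with several adenomas counts once, which is why the ADR alone cannot distinguish a “one and done” approach from thorough detection of multiple adenomas.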
The cecal intubation rate, an indicator of colonoscopy completion (reaching the cecum or anastomosis, if present), is another acknowledged quality measure[6,21,26]. Cecal intubation is essential for visualization of the proximal colon, including the cecum, where many colorectal neoplasms are located, in particular SSPs[6]. However, data on the independent association of cecal intubation rate with CRC risk have been mixed[32,35]. Longer withdrawal time is associated with higher ADR and higher SSP detection and is also considered a key criterion of colonoscopy quality secondary to ADR[6,36-38].
Another recommended criterion of colonoscopy quality is the level of adherence to recommended post-polypectomy and post-cancer surveillance intervals, which are based on study data[2,6,39,40]. The United States Multi-Society Task Force on Colorectal Cancer (USMSTFCC) has recommended that this criterion may serve as the overall indication of clinical adequacy of a bowel preparation[11]. Intra-procedure flushing and suctioning to remove fluid and semisolid debris is often performed during colonoscopy[11]. Therefore, the USMSTFCC recommends that bowel preparation quality should be assessed on withdrawal after washing and suctioning[11]. This criterion relates primarily to clinical adequacy, where washing and suctioning is taken into account, and is less relevant for the comparison of different bowel preparation agents, where pre-wash grading of bowel cleanse quality may better reflect preparation agent efficacy.
VALIDATED BOWEL PREPARATION SCALES
The most well established and commonly used validated bowel preparation quality scales in clinical trials include the Aronchick Scale[41,42], the Boston Bowel Preparation Scale (BBPS)[43-49], and the Ottawa Bowel Preparation Scale (OBPS)[50] (Table 1). Other instruments that have been validated, but are less commonly used, include the Harefield Cleansing Scale (HCS)[51] and the Chicago Bowel Preparation Scale (CBPS)[52] (Table 1). A summary of validation studies is found in Table 2.
Aronchick Scale (total colon)
Score 1 - Excellent: Small volume of liquid; > 95% of mucosa seen
Score 2 - Good: Clear liquid covering 5%-25% of mucosa, but > 90% of mucosa seen
Score 3 - Fair: Semisolid stool could not be suctioned or washed away, but > 90% of mucosa seen
Score 4 - Poor: Semisolid stool could not be suctioned or washed away and < 90% of mucosa seen
Score 5 - Inadequate: Repeat preparation/screening needed
Notes: Total score range: minimum 1 (excellent) to maximum 5 (inadequate). Scoring performed before washing or suctioning. No separate ratings for segments; global colon rating only. No threshold for adequate/inadequate provided.
Ottawa Bowel Preparation Scale (by colon segment)
Score 0 - Excellent: Mucosal detail clearly visible, almost no stool residue; if fluid present, it is clear, almost no stool residue
Score 1 - Good: Some turbid fluid or stool residue, but mucosal detail still visible without need for washing/suctioning
Score 2 - Fair: Some turbid fluid or stool residue obscuring mucosal detail; however, mucosal detail becomes visible with suctioning, washing not needed
Score 3 - Poor: Stool present obscuring mucosal detail and contour; a reasonable view is obtained with suctioning and washing
Score 4 - Inadequate: Solid stool obscuring mucosal detail and not cleared with washing and suctioning
Notes: Total score (obtained by adding scores for each segment + total colon fluid score) range: minimum 0 (excellent) to maximum 14 (inadequate). Scoring performed before washing or suctioning. Rates cleansing by colon segment: right colon, mid-colon, and rectosigmoid colon (Figure 1). No threshold for adequate/inadequate provided.
Ottawa Bowel Preparation Scale (total colon fluid)
Score 0 - Small amount of fluid
Score 1 - Moderate amount of fluid
Score 2 - Large amount of fluid
Notes: Total colon fluid score range: minimum 0 (small amount of fluid) to maximum 2 (large amount of fluid). Scoring performed before washing or suctioning. Single score for the total colon. No threshold for adequate/inadequate provided.
Boston Bowel Preparation Scale (by colon segment)
Score 0 - Unprepared colon segment with mucosa not seen because of solid stool that cannot be cleared
Score 1 - Portion of mucosa of the colon segment seen, but other areas of segment not well seen because of staining, residual stool, and/or opaque liquid
Score 2 - Minor amount of residual staining, small fragments of stool, and/or opaque liquid, but mucosa of colon segment is well seen
Score 3 - Entire mucosa of colon segment well seen, with no residual staining, small fragments of stool, or opaque liquid
Notes: Total score (obtained by adding scores for each segment) range: minimum 0 (very poor) to maximum 9 (excellent). Scoring performed after washing or suctioning. Segments separately rated: right colon (including cecum and ascending colon); transverse (includes hepatic and splenic flexures); and left colon (descending and sigmoid colon, and rectum). Threshold optimally is total score of ≥ 6 AND ≥ 2 per segment.
Harefield Cleansing Scale (by colon segment)
Score 0 - Irremovable, heavy, hard stools
Score 1 - Semisolid, only partially removable stools
Score 2 - Brown liquid/fully removable semi-solid stools
Score 3 - Clear liquid
Score 4 - Empty and clean
Notes: Total score (obtained by adding scores for each segment) range: minimum 0 (very bad) to maximum 20 (very good). Scoring performed after washing or suctioning. Segments separately rated: rectum, sigmoid, left, transverse, right colon. Threshold for successful cleansing: Grade A, all segments scored 3 or 4 (no segment < 3); or Grade B, ≥ 1 segment scored 2 but no segment < 2. Unsuccessful cleansing: Grade C, ≥ 1 segment scored 1 but no segment < 1; or Grade D, ≥ 1 segment scored 0.
Chicago Bowel Preparation Scale (by colon segment)
Score 0 - Unprepared colon segment with stool that cannot be cleared (> 15% of mucosa not seen)
Score 5 - Portion of mucosa in segment seen after cleaning, but up to 15% of the mucosa not seen because of retained material
Score 10 - Minor residual material after cleaning, but mucosa of segment generally well seen
Score 11 - Entire mucosa of segment well seen after washing
Score 12 - Entire mucosa of segment well seen before washing or suctioning
Notes: Total score (obtained by adding scores for each segment) range: minimum 0 (unprepared) to maximum 36 (excellent). Scoring performed before (fluid) and after (mucosal cleaning) washing or suctioning. Segments separately rated: right (cecum to mid-hepatic flexure), transverse (mid-hepatic flexure to mid-splenic flexure), and left colon (mid-splenic flexure to distal rectum). No threshold for adequate/inadequate provided.
Chicago Bowel Preparation Scale (total colon)
Score 0 - Little fluid (≤ 50 cc)
Notes: Total score range: minimum 0 (little fluid) to maximum 3 (large amount of fluid). Scoring performed before washing or suctioning. No threshold for adequate/inadequate provided. Not incorporated into total score for segments.
Figure 1 Bowel preparation quality scale segments.
Depiction of bowel segments from validation study of Ottawa Bowel Preparation Scale[50]. Before washing or suctioning, each segment is scored on a scale of 0-4 for cleansing, and the total colon is scored for fluid quantity on a scale of 0-2. The total score ranges from 0 (excellent) to 14 (inadequate).
Table 2 Reliability and validation data for bowel preparation scales.
PDR by score: 40% for scores ≥ 5 vs 24% for scores < 5 (P < 0.02). Need for repeat CSP due to inadequate bowel prep: 2% for scores ≥ 5 vs 73% for scores < 5 (P < 0.001). Correlation with colonoscope insertion time: PCC, r = -0.16 (P < 0.003). Correlation with colonoscope withdrawal time: PCC, r = -0.23 (P < 0.001).
ICC values: total colon, 0.91; right colon, 0.88; transverse colon, 0.83; left colon, 0.79.
Correlations with ability to exclude polyps > 5 mm: 100%, 88%, 82%, 33%, and 0% of physicians deemed bowel preparation adequate to exclude polyps > 5 mm at scores of ≥ 8, 7, 6, 5, and ≤ 4, respectively. Correlations with surveillance recommendations after normal CSP: score < 5, 100% recommended ≤ 1 yr; scores 5-6, mean recommended interval 4.3 (± 3.9) yr; scores ≥ 7, 100% recommended 10 yr.
ICC values: total colon, 0.90/0.63 wtd κ; right colon, 0.93/0.91 wtd κ; transverse colon, 0.88/0.86 wtd κ; left colon, 0.50/0.38 wtd κ.
PDR: scores ≥ 8 superior vs scores < 8 (44.9% vs 33.0%; P = 0.04). Colonoscope withdrawal time: PCC, r = -0.167 (P < 0.001). Colonoscope insertion time: PCC, r = 0.018 (P = 0.695).
ICC value: 0.457. Test-retest κ values: range, 0.33 to 0.85. Intrarater: 0.28 to 0.64. Internal consistency: 0.81, 0.86.
Best score cutoff for satisfactory bowel preparation: ≥ 2 for each segment (sensitivity, 99%; specificity, 83%). Correlation with Aronchick scale: PCC, r = 0.833. AUC of ROC analysis (vs Aronchick scale scores): 0.945 for total colon.
ICC values: range, 0.624 to 0.702 for all segments.
Correlations of scores with adequate cleansing: adequate, scores of 25-36 (≥ 95% of mucosa visualized); inadequate, scores of 0-24 (< 95% of mucosa visualized).
Aronchick scale
The Aronchick Scale was the first bowel preparation quality scale to be evaluated for reliability[41,42]. This scale characterizes the percentage of the total colonic mucosal surface covered by fluid or stool, without scoring for separate colon segments, and is performed before washing or suctioning (Table 1). A validation study found that interobserver reliability kappa intraclass correlation coefficients (ICCs) were high for the cecum (0.76) and the total colon (0.77), but were lower for the distal colon (0.31) and ascending colon segments[42]. The Aronchick Scale is one of the most commonly used validated bowel preparation quality scales in clinical trials and clinical practice.
Ottawa Bowel Preparation Scale
The OBPS measures mucosal cleanliness by colon segment, including the right colon, mid-colon, and rectosigmoid colon, on a scale of 0 (excellent) to 4 (inadequate) for each (Table 1 and Figure 1), and is also scored before washing or suctioning[50]. However, in contrast to the Aronchick scale, the OBPS measures fluid quantity separately, with scores ranging from 0 (small volume) to 2 (large volume) for the total colon. Additionally, the OBPS does not tie scoring to subjective estimates of the percentage of the mucosa that is visible, which the investigators suggested might improve interobserver reliability (Table 1)[50]. In a study of reliability and validity compared with the Aronchick scale, the Pearson correlation coefficients for interobserver ratings were superior for the OBPS vs the Aronchick (0.89 vs 0.62, respectively; P < 0.001)[50]. Similarly, the kappa ICCs also significantly favored the OBPS vs the Aronchick scale [0.94 (95%CI: 0.91-0.96) vs 0.77 (95%CI: 0.65-0.84), respectively; P < 0.001]. Interrater consistency was found to be stronger with the OBPS vs the Aronchick scale, and reliability and agreement of the OBPS for the three different colon segments measured were very high, and not significantly different between segments (0.92 kappa, right colon; 0.88 kappa, mid-colon; 0.89 kappa, rectosigmoid; 0.94 kappa, total colon).
A prospective study of the OBPS aimed to identify an optimal cut-off score for bowel preparation adequacy/inadequacy in 211 patients undergoing colonoscopy at a single center[53]. The receiver operating characteristic (ROC) analysis used in this study found that an OBPS score cutoff of ≥ 8 identified inadequate bowel preparation with a sensitivity of 100% and a specificity of 91%. Another study in 150 consecutive patients undergoing colonoscopy reported strong concordance between the OBPS and a visual analogue scale measuring bowel cleansing among both nurses (r = 0.8268) and physicians (r = 0.8095), P < 0.0001 for both[54]. The concordance in scoring between nurses and physicians was r = 0.6010; P < 0.0001.
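The OBPS arithmetic described above (three segment scores of 0-4 plus a total colon fluid score of 0-2, giving a total of 0-14) can be sketched as follows. This is a minimal illustration only: the function names and dictionary keys are hypothetical, and the default cutoff of ≥ 8 is the value reported in the single-center ROC study[53], not a threshold defined by the scale itself.

```python
OBPS_SEGMENTS = ("right", "mid", "rectosigmoid")


def obps_total(segment_scores, fluid_score):
    """Sum the three segment scores (0-4 each) and the fluid score (0-2)."""
    for name in OBPS_SEGMENTS:
        if not 0 <= segment_scores[name] <= 4:
            raise ValueError(f"{name} score must be between 0 and 4")
    if not 0 <= fluid_score <= 2:
        raise ValueError("fluid score must be between 0 and 2")
    return sum(segment_scores[name] for name in OBPS_SEGMENTS) + fluid_score


def obps_inadequate(total, cutoff=8):
    """Flag inadequacy using the study-reported cutoff (assumed default: >= 8)."""
    return total >= cutoff


# Hypothetical examination; higher OBPS scores mean worse cleansing.
exam = {"right": 3, "mid": 2, "rectosigmoid": 2}
total = obps_total(exam, fluid_score=1)
print(total, obps_inadequate(total))  # 8 True
```

Because lower scores indicate better cleansing in the OBPS, the comparison runs in the opposite direction from scales such as the BBPS.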
Boston Bowel Preparation Scale
The BBPS has been validated in multiple clinical studies[11,47,55]. Developed in 2009, this scale was designed to address specific issues affecting bowel preparation quality and scoring: (1) The scale stipulates that scoring is to be conducted upon withdrawal and after all flushing and suctioning of fluid have been completed; (2) scoring is applied by colon segments, as in the OBPS, based on potential for variance in bowel preparation between segments; and (3) subjective, qualitative terms, such as excellent, good, fair, or poor, are replaced by numbered scores that are correlated to more clearly described colonic conditions, including features such as staining, liquid, and stool fragments (Table 1)[47]. Each segment of the colon is scored from 0 to 3, with higher scores indicating superior cleansing, and summed for a total score that can range from 0 to 9 (Table 1).
The initial validation study for the BBPS involved 633 CRC screening colonoscopies in a single center, and was applied by endoscopists who had undergone training on how to use the scale before participating in the study[47]. The median BBPS total score was 6. The ICC for interobserver agreement of total BBPS scores was 0.74 (95% predictive interval: 0.67-0.80), and the weighted kappa value for intraobserver agreement was 0.77 (95%CI: 0.66-0.87)[47]. Validity assessment was based on the correlations of BBPS scores with relevant clinical outcomes and more traditional scale categories, including “excellent”, “good”, “fair”, “poor”, or “unsatisfactory”. Of the 633 patients who received a CRC screening colonoscopy, 243 (38%) had at least one polyp detected, and the PDR was significantly higher for patients with BBPS scores ≥ 5 than for those with BBPS scores < 5 (40% vs 24%, respectively; P < 0.02). The frequency of repeat colonoscopy attributable to inadequate bowel preparation was significantly higher in patients with scores < 5 vs those with scores ≥ 5 (73% vs 2% of cases, respectively; P < 0.001). Total BBPS scores were inversely associated with colonoscopic insertion (r = -0.16; P < 0.003) and withdrawal times (r = -0.23; P < 0.001). In addition, a significant trend in mean BBPS score correlating with excellent, good, fair, poor, or unsatisfactory, as separately scored by the raters, was observed (P < 0.001 for trend).
A follow-up study investigated interobserver reliability and clinical outcome correlations of BBPS scores for individual segments, and the relationship of scores to polyp detection, in 119 screening colonoscopies rated by nine full-time faculty and three fellows at a single center[43]. All (100%) raters judged the bowel preparation adequate to exclude polyps > 5 mm at a BBPS score ≥ 8, vs 88% of physicians when the score was 7, 82% when the score was 6, 33% when the score was 5, and 0% with a score of ≤ 4. Thus, a score of ≥ 6 was a particularly important threshold, since approximately 80% of physicians found the bowel preparation adequate at that score vs only one-third or less at BBPS scores of ≤ 5. In patients who had undergone a normal screening colonoscopy, a score of < 5 prompted all physicians to recommend repeat colonoscopy within one year, while a score of ≥ 7 was correlated with a recommendation for the next colonoscopy to occur in 10 years (among all physicians). BBPS segment scores were positively correlated with improved PDRs for the left and right colon, but no association was found for the transverse colon.
A further validation study was aimed at identifying a cut-off score for adequacy/inadequacy of bowel preparation[44]. This retrospective study of 2516 normal CRC screening colonoscopies performed by 74 endoscopists found that follow-up was recommended in 10 years for 90% of cases with a total BBPS score ≥ 6 in which all three segments had scores ≥ 2 (n = 2295), while 96% of examinations with total BBPS scores of 0-2 (n = 26) recommended follow-up within one year (Figure 2). Screenings with total scores of 3-5 (n = 167) had variable recommendations. Based on these findings, the investigators suggested that a total BBPS score of ≥ 6 and/or all segment scores ≥ 2 may serve as a standard definition of “adequate for 10-year follow-up”[44]. However, a prospective, observational study in a large, national endoscopic consortium found that inadequate single BBPS segment scores at the initial, average-risk screening colonoscopy were correlated with significantly greater risk of polyps at a second colonoscopy, suggesting that both a total score of ≥ 6 and all segment scores ≥ 2 should be required as an adequacy standard for 10-year follow-up[56]. This assessment was affirmed by a study of 438 colonoscopies in men, which found that BBPS segment scores of 2 or 3 (with 2 being noninferior to 3) were indicative of adequate bowel preparation for detection of adenomas > 5 mm, and for repeat colonoscopy at standard, guideline-recommended intervals (both parameters are USMSTFCC-recommended criteria for bowel preparation adequacy)[11,57].
Figure 2 Percentage of screening colonoscopy examinations in which 10-year follow-up was recommended after a negative colonoscopy, stratified by total Boston Bowel Preparation Scale Score[44].
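The BBPS adequacy standard discussed above (total score ≥ 6 and every segment score ≥ 2) can be expressed as a short check. The sketch below is illustrative only; the function names, dictionary keys, and example examinations are hypothetical.

```python
BBPS_SEGMENTS = ("right", "transverse", "left")


def bbps_total(scores):
    """Sum the three BBPS segment scores (each 0-3; higher is cleaner)."""
    for name in BBPS_SEGMENTS:
        if scores[name] not in (0, 1, 2, 3):
            raise ValueError(f"{name} score must be 0, 1, 2, or 3")
    return sum(scores[name] for name in BBPS_SEGMENTS)


def bbps_adequate(scores, min_total=6, min_segment=2):
    """Apply the combined threshold: total >= 6 AND every segment >= 2."""
    total = bbps_total(scores)
    segments_ok = all(scores[name] >= min_segment for name in BBPS_SEGMENTS)
    return total >= min_total and segments_ok


# Two hypothetical examinations with the same total score of 7.
even = {"right": 2, "transverse": 3, "left": 2}
uneven = {"right": 1, "transverse": 3, "left": 3}
print(bbps_total(even), bbps_adequate(even))      # 7 True
print(bbps_total(uneven), bbps_adequate(uneven))  # 7 False
```

The second examination shows why the segment criterion matters: a total of 7 can mask a poorly cleansed right colon. With three segments, requiring every segment score to be ≥ 2 already implies a total of ≥ 6, so the segment criterion is the stricter of the two.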
Harefield Cleansing Scale
The HCS, developed in the 1990s, is scored by colon segment, as are the OBPS and BBPS[51]. Like the BBPS, the HCS is also scored after washing and suctioning are completed, and replaces qualitative terms (e.g., “excellent” or “good”) with direct descriptions of cleansing quality correlated with score numbers (Table 1)[51]. Grading is performed in five colon segments and ranges from 0 to 4 (higher numbers indicating better quality of cleanse) for each. Although total scores are derived by adding the separate segment scores, an “acceptable” score is possible only when the mucosa is 100% visible in all five colon segments. A validation study of the HCS compared with the Aronchick scale in 337 colonoscopies reviewed by four gastroenterologists found that there was a high degree of Pearson correlation between the two scales (r = 0.833), and the Spearman correlation coefficient was -0.778 (correlation is negative because improved cleanse quality is represented by different directions in the HCS and Aronchick scale)[51]. The ROC curve analysis vs the Aronchick scale showed an area under the curve of 0.945, and a sensitivity of 99% and specificity of 83% at the optimum score cut-off point. Interrater reliability analysis yielded an ICC of 0.457 (95%CI: 0.366-0.539). Cohen kappa scores for individual segments between investigators showed slight-to-fair agreement, ranging from 0.15 to 0.27. Internal consistency was acceptable, based on a Cronbach alpha coefficient of 0.81, and the test-retest reliability assessment showed an overall kappa of 0.639. No analyses of correlations with relevant clinical outcomes such as the ADR or adherence to recall guidelines were performed, because the patient population was insufficient.
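The HCS grading rule summarized in Table 1 depends only on the worst-scoring segment, which the following minimal sketch makes explicit. Function names and dictionary keys are hypothetical, and the grade definitions are taken as listed in Table 1.

```python
HCS_SEGMENTS = ("rectum", "sigmoid", "left", "transverse", "right")


def hcs_total(scores):
    """Sum the five HCS segment scores (each 0-4; higher is cleaner)."""
    for name in HCS_SEGMENTS:
        if scores[name] not in (0, 1, 2, 3, 4):
            raise ValueError(f"{name} score must be an integer from 0 to 4")
    return sum(scores[name] for name in HCS_SEGMENTS)


def hcs_grade(scores):
    """Return grade A-D, determined by the lowest segment score."""
    hcs_total(scores)  # reuse the range validation
    worst = min(scores[name] for name in HCS_SEGMENTS)
    if worst >= 3:
        return "A"  # all segments scored 3 or 4
    return {2: "B", 1: "C", 0: "D"}[worst]


def hcs_successful(scores):
    """Grades A and B count as successful cleansing; C and D do not."""
    return hcs_grade(scores) in ("A", "B")


# Hypothetical examination: total 14 of 20, worst segment scored 2.
exam = {"rectum": 4, "sigmoid": 3, "left": 3, "transverse": 2, "right": 2}
print(hcs_total(exam), hcs_grade(exam), hcs_successful(exam))  # 14 B True
```

As with the BBPS segment criterion, a single poorly cleansed segment determines the outcome regardless of the total score.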
Chicago Bowel Preparation Scale
Like the HCS, the CBPS was developed to address perceived limitations in other commonly used bowel preparation scales[52]. The main features of the scale are shown in Table 1. Scoring is performed both before and after washing or suctioning, and a separate fluid score is included as a secondary measure (unlike in the OBPS, the fluid score is not incorporated into the total score). The total and fluid scoring categories were designed to measure both the quality of visualization and the intraprocedural effort required to clean the mucosa to attain adequate visualization. These parameters were intended to help clinicians assess the cleansing efficacy of different bowel preparations[52]. A CBPS validation study of 150 colonoscopies at a single center prospectively compared the results of the CBPS with those of the OBPS, the BBPS, and a theoretical, dichotomous scale that simply defined “adequate cleansing” as the ability to see ≥ 95% of the mucosa (after it was cleansed) and “inadequacy” as visibility of < 95%[52]. In this study, kappa coefficients for interrater agreement were higher for the CBPS (0.624-0.702) than for the OBPS (0.493-0.655) and the BBPS (0.545-0.661), but these differences were not significant. Kappa coefficients for the total colon fluid scores for the CBPS and OBPS, and Pearson correlation coefficients for interrater agreement, were also similar. For the OBPS, scores from 8-10 were graded inadequate; for the BBPS, a score of ≤ 4 was graded inadequate; and for the CBPS, total scores ≤ 24 were graded inadequate. No clinically relevant parameters were assessed for validation in this study.
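The CBPS total score and the adequacy cut-off used in the validation study (total scores ≤ 24 graded inadequate) can be sketched as follows. The function names and dictionary keys are hypothetical; the permitted segment scores (0, 5, 10, 11, 12) are taken as listed in Table 1, and the fluid score is deliberately left out because it is not incorporated into the total.

```python
CBPS_SEGMENTS = ("right", "transverse", "left")
CBPS_ALLOWED_SCORES = (0, 5, 10, 11, 12)  # per-segment values listed in Table 1


def cbps_total(scores):
    """Sum the three CBPS segment scores (total range 0-36)."""
    for name in CBPS_SEGMENTS:
        if scores[name] not in CBPS_ALLOWED_SCORES:
            raise ValueError(f"{name} score must be one of {CBPS_ALLOWED_SCORES}")
    return sum(scores[name] for name in CBPS_SEGMENTS)


def cbps_adequate(scores, min_total=25):
    """Study cut-off: totals of 25-36 adequate, 0-24 inadequate."""
    return cbps_total(scores) >= min_total


# Hypothetical examinations.
good = {"right": 10, "transverse": 11, "left": 12}
poor = {"right": 0, "transverse": 12, "left": 12}
print(cbps_total(good), cbps_adequate(good))  # 33 True
print(cbps_total(poor), cbps_adequate(poor))  # 24 False
```

The gap between the segment scores of 5 and 10 means that one unprepared segment (score 0) is enough to make the total inadequate, even when the other two segments receive the maximum score.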
ADDITIONAL VALIDATED SCALE COMPARISON DATA
The OBPS and the BBPS were compared in a study that reviewed prospectively collected data from patients who underwent CRC screening or surveillance colonoscopies over a two-year period between August 2013 and July 2015[58]. Among the 655 colonoscopies, overall detection rates for polyps, adenomas, right-side adenomas, and sessile serrated adenomas (SSAs) were 42.8%, 32.8%, 20.8%, and 1.2%, respectively. A significant Pearson correlation was observed between the two scales (P < 0.001). Moreover, the areas under the ROC curves for the OBPS vs the BBPS were not significantly different for detection of polyps (0.550 vs 0.513), adenomas (0.544 vs 0.519), right-side adenomas (0.469 vs 0.516), or SSAs (0.712 vs 0.790). The investigators concluded that the choice of either the OBPS or the BBPS may not strongly affect the measurement of bowel preparation quality.
DISCUSSION
Quality scales
All currently available bowel preparation quality scales have limitations and depend on subjective descriptions of luminal contents, expressed as categories (“excellent”, “good”, etc.) or numbers, depending on the scale used. A standard, fully validated, and universally accepted scale for use in clinical practice and trials has not yet been established. Among the scales, the Aronchick scale is the best known and the most widely used, both clinically and in clinical trials, to date; however, this scale rates cleanse quality of the colon as a whole and provides no details regarding differences between individual segments.
Colon segments cleansing
Guidance is somewhat vague for clinicians regarding grading of the entire colon when individual segments are suboptimally cleansed. This issue may arise more often in the proximal colon, which is harder to clean than other segments and more likely to contain flat lesions such as sessile serrated polyps/adenomas[50,51]. Segment-specific bowel preparation quality scales, such as the OBPS or BBPS, may provide a clearer distinction between cleanse quality of the proximal colon compared with other segments. Furthermore, establishing a minimum acceptable score for adequacy within each colon segment, as has been done for the BBPS, is helpful in determining overall colon cleansing adequacy. A BBPS validation study provided information used to create an “adequate cleansing” threshold score of at least 2 in each of three colon segments.
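The per-segment threshold described above can be sketched in a few lines. This is an illustrative sketch only, assuming the standard BBPS layout of three segments (right, transverse, and left colon), each scored as an integer from 0 to 3; the function name `bbps_adequate` and the example scores are hypothetical.

```python
def bbps_adequate(right, transverse, left, min_segment_score=2):
    """Return True if every colon segment meets the minimum BBPS score.

    Assumes each segment score is an integer from 0 to 3.
    """
    segments = (right, transverse, left)
    for score in segments:
        if not isinstance(score, int) or not 0 <= score <= 3:
            raise ValueError("each segment score must be an integer from 0 to 3")
    # Adequacy is decided per segment, not from the total score.
    return all(score >= min_segment_score for score in segments)


# Hypothetical examinations: a higher total does not guarantee adequacy.
high_total_but_inadequate = bbps_adequate(right=1, transverse=3, left=3)  # total 7
lower_total_but_adequate = bbps_adequate(right=2, transverse=2, left=2)   # total 6
```

The two hypothetical examinations show why a segment-level rule differs from a total-score cutoff: a poorly cleansed right colon makes the first examination inadequate despite its higher total.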
Need for washing and suctioning
Grading before or after washing and suctioning is another important factor which differs between scales. Many clinicians are using the Aronchick scale incorrectly, as they grade the bowel preparation as good or fair after washing and suctioning. While scales that grade cleanse quality after washing may correlate better with quality measures such as ADR, or the likelihood of an alteration in CRC screening follow-up recommendations, scales that grade before washing can provide a better reflection of a bowel preparation product’s efficacy independent of the endoscopist. Similarly, the OBPS gives points based on the total fluid in the colon, which leads to inaccurate grading if using water immersion/exchange.
The OBPS entails scoring by colon segments, thus accounting for variation by segment in bowel preparation quality/visibility; however, it also incorporates the presence of luminal fluid before suctioning[11,50]. The OBPS validation data are largely dependent on correlations with the Aronchick scale, which itself has limited validation and may not correlate with ADR[50]. The BBPS differs in several key aspects from the Aronchick and OBPS scales[47]. To begin, it requires washing and suctioning to be completed before the bowel preparation is graded[47]. The HCS requires rating only after completion of flushing and suctioning, providing a score for the entire colon as well as for individual segments[51,52].
Grading scales validity and reliability
The reliability and validation data for the BBPS are more extensive than those for the Aronchick scale and the OBPS, and include good supporting data correlating scores with key clinical outcomes. These validation studies have provided the information used to set a threshold for adequate cleansing: a score of at least 2 in each of the three colon segments[44,57]. It should also be noted, however, that one study found no significant difference between the BBPS and OBPS regarding key indicators of colonoscopy quality, such as the PDR and ADR, in screening or surveillance colonoscopy[58]. Concerning the HCS and CBPS, each has reported acceptable reliability data, although the CBPS validation study was based on findings from only two raters[51,52]. While the HCS validation assessment was the only one to provide test-retest and internal consistency data for reliability, its validity evaluation was based only on correlations with the Aronchick scale[51]. Although the CBPS was compared with the OBPS and BBPS, no correlations of this scale with key clinical outcomes, such as ADR and adherence to screening and surveillance colonoscopy intervals, have been reported[52]. The CBPS has more specific definitions and requires measurement of fluid suctioned (Table 1), but this complexity may make it difficult for clinicians to score correctly; thus, it may not easily translate to clinical practice. Hence, the usefulness of these scales for clinical practice or trials remains unclear.
Several unique, nonvalidated bowel preparation scales have been developed for use in trials of agents including oral sulfate solution (OSS) (Suprep®, Braintree Laboratories, Braintree, MA, United States)[59], OSS plus sulfate-free electrolyte lavage solution (Suclear®, Braintree Laboratories, Braintree, MA, United States)[60], and polyethylene glycol electrolyte solution plus ascorbic acid (MoviPrep®, Salix Pharmaceuticals, Bridgewater, NJ, United States)[59,61,62]. The grading criteria used in these study- and product-specific scales often differ greatly from validated scales.
The substantial ramifications of using nonvalidated scales are illustrated by a post hoc analysis of data from two sodium picosulfate and magnesium citrate (P/MC) clinical trials. Investigators analyzed the data from the studies after altering the definition of “adequate” in the Aronchick scale, which had been used in the original trials, to more closely resemble what has been used in some studies utilizing nonvalidated scales[55]. With this revised definition, > 98% of all P/MC patients were considered responders, compared with 79%-87% using the original OBPS and Aronchick scale categorization criteria. Multiple studies have used more than one validated scale from among the Aronchick, OBPS, and BBPS scales for assessment of bowel preparation quality, providing additional comparative data[63-70]. Generally, the results of these trials have been concordant in assessment of bowel preparation quality, with similar mean total scores being reported for overall quality, and similar comparative assessments of different bowel preparations.
While scales for assessment of bowel preparation quality for CRC screening colonoscopy have improved, establishing a standard, validated scale is essential to optimize CRC colonoscopy screening. The Boston bowel preparation scale has several limitations, but appears nonetheless to be the best available option, and is therefore recommended as the current standard for use in clinical practice. Given the importance preparation plays in multiple colonoscopy quality measures, including the need to repeat the procedure when cleansing is inadequate, it may be advantageous for clinicians to adopt one language to describe cleansing quality. The continued use of multiple scales with varying criteria may undermine the validity of study findings and the accuracy of colonoscopy for CRC screening and surveillance.
For colonoscopy clinical trials, the use of different, and sometimes nonvalidated, scales across studies is one of many reasons that comparisons between studies are fraught with difficulties. Incorporating a standard, validated grading scale would help ensure that findings are generalizable and comparable with other studies, and would facilitate progress in the development of future bowel preparations. Future developments in bowel preparation quality assessment are likely to involve establishment of an improved “gold standard” and further refinement of the accuracy of quality assessment. Continued improvement of quality standards for CRC prevention, further studies of ADR and withdrawal time, and recommended years of follow-up are also warranted.
Footnotes
Manuscript source: Unsolicited manuscript
Specialty type: Gastroenterology and hepatology
Country of origin: United States
Peer-review report classification
Grade A (Excellent): 0
Grade B (Very good): B, B
Grade C (Good): C
Grade D (Fair): 0
Grade E (Poor): 0
P- Reviewer: Bordas JM, Choi YS, Kotwal V; S- Editor: Wang XJ; L- Editor: A; E- Editor: Yin SY
Winawer SJ, Zauber AG, Fletcher RH, Stillman JS, O’Brien MJ, Levin B, Smith RA, Lieberman DA, Burt RW, Levin TR. Guidelines for colonoscopy surveillance after polypectomy: a consensus update by the US Multi-Society Task Force on Colorectal Cancer and the American Cancer Society. CA Cancer J Clin. 2006;56:143-159; quiz 184-185.
Johnson DA, Barkun AN, Cohen LB, Dominitz JA, Kaltenbach T, Martel M, Robertson DJ, Boland CR, Giardello FM, Lieberman DA. Optimizing adequacy of bowel cleansing for colonoscopy: recommendations from the US multi-society task force on colorectal cancer. Gastroenterology. 2014;147:903-924.
Adler A, Wegscheider K, Lieberman D, Aminalai A, Aschenbeck J, Drossel R, Mayr M, Mroß M, Scheel M, Schröder A. Factors determining the quality of screening colonoscopy: a prospective study on adenoma detection rates, from 12,134 examinations (Berlin colonoscopy project 3, BECOP-3). Gut. 2013;62:236-241.
Smith CL, Roy A, Kalra AP, Daskalakis C, Kastenberg D. Adenoma detection on repeat colonoscopy after previous inadequate preparation. J Gastroenterol Hepatol Res. 2013;2:911-917.
Rees CJ, Thomas Gibson S, Rutter MD, Baragwanath P, Pullan R, Feeney M, Haslam N; British Society of Gastroenterology, the Joint Advisory Group on GI Endoscopy, the Association of Coloproctology of Great Britain and Ireland. UK key performance indicators and quality assurance standards for colonoscopy. Gut. 2016;65:1923-1929.
Aronchick CA, Lipshutz WH, Wright SH, DuFrayne F, Bergman G. Validation of an instrument to assess colon cleansing. Am J Gastroenterol. 1999;94:2667.
Chan M, Birnstein E, Patel N, Chan L, Laine L, Kline M. Ottawa score of 8 or greater is an optimal cut-off score for inadequate bowel preparation. Am J Gastroenterol. 2011;106:S431-S432.
Martinato M, Krankovic I, Caccaro R, Scacchi M, Cesaro R, Marzari F, Colombara F, Compagno D, Judet S, Sturniolo GC. Assessment of bowel preparation for colonoscopy: comparison between different tools and different healthcare professionals. Dig Liver Dis. 2013;45S:S195-S196.
Lee YJ, Kim ES, Cho KB, Park KS, Lee JY, Lee YS, Choi WY, Kwon TH. SU1731 Comparison of Ottawa and Boston bowel preparation scales for adenoma detection rate. Gastrointest Endosc. 2016;83 Suppl:AB413.
Bitoun A, Ponchon T, Barthet M, Coffin B, Dugué C, Halphen M; Norcol Group. Results of a prospective randomised multicentre controlled trial comparing a new 2-L ascorbic acid plus polyethylene glycol and electrolyte solution vs. sodium phosphate solution in patients undergoing elective colonoscopy. Aliment Pharmacol Ther. 2006;24:1631-1642.
Gweon TG, Kim SW, Noh YS, Hwang S, Kim NY, Lee Y, Lee SW, Lee SW, Lee JY, Lim CH. Prospective, randomized comparison of same-day dose of 2 different bowel cleanser for afternoon colonoscopy: picosulfate, magnesium oxide, and citric acid versus polyethylene glycol. Medicine (Baltimore). 2015;94:e628.