Copyright
©The Author(s) 2021.
World J Gastroenterol. Oct 14, 2021; 27(38): 6399-6414
Published online Oct 14, 2021. doi: 10.3748/wjg.v27.i38.6399
Table 1 Comparison between different types of machine learning approaches used in studies focused on polyp detection and classification
Characteristics | Support vector machine | Random forest | Decision trees | Deep neural networks | Context | Ref. |
High dimensional data | High | High | Moderate | High | Performance | Shen et al[12]; Goodfellow et al[26] |
Overlapped classes | Low | Low | Low | High | ||
Imbalanced datasets | Moderate | High | Low | Moderate | ||
Non-linear data | Moderate | High | Moderate | High | ||
Larger dataset | Moderate1 | High1 | Low | High | ||
Outliers | Moderate | Moderate | Low | High | Robustness | Shen et al[12]; Yu et al[20] |
Over-fitting | Moderate | High | Low | High | ||
Handling of missing values | Poor | Good | Good | Good | ||
Reproducibility | High | High | High | Moderate | Complexity | Yu et al[20] |
Interpretability | Moderate | Moderate | High | Low | ||
Table 2 Most common evaluation metrics found in the state of the art for detection, segmentation and classification tasks
Term | Symbol | Description |
Positive | P | Number of real positive cases in the data |
Negative | N | Number of real negative cases in the data |
True positive | TP | Number of correct positive cases classified/detected |
True negative | TN | Number of correct negative cases classified/detected |
False positive | FP | Instances incorrectly classified/detected as positive |
False negative | FN | Instances incorrectly classified/detected as negative |
Area under curve | AUC | Area under the ROC plot |
Term | Task | Formulation |
Accuracy | C, D, S | (TP + TN)/(TP + TN + FN + FP) |
Precision/PPV | C, D, S | TP/(TP + FP) |
Sensitivity/Recall/TPR | C, D, S | TP/(TP + FN) |
Specificity/TNR | C, D, S | TN/(TN + FP) |
FPR | C, D, S | FP/(TN + FP) |
FNR | C, D, S | FN/(TP + FN) |
f1-score/DICE index | C, D, S | 2 ∙ (precision ∙ recall)/(precision + recall) |
f2-score | C, D, S | 5 ∙ (precision ∙ recall)/(4 ∙ precision + recall) |
IoU/Jaccard index | D, S | (target ∩ prediction)/(target ∪ prediction) |
AAC | D, S | (detected area ∩ real area)/(real area) |
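The formulations in Table 2 can be illustrated with a minimal Python sketch. It is not taken from the review: the names `Confusion`, `classification_metrics`, `f_beta` and `mask_overlap` are hypothetical, ratios with a zero denominator are returned as 0.0 by assumption, and masks are represented as sets of foreground pixel coordinates for simplicity. The f-scores use the general F-beta definition, (1 + β²) ∙ precision ∙ recall/(β² ∙ precision + recall), with β = 1 and β = 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts (TP, TN, FP, FN) as defined in Table 2."""
    tp: int
    tn: int
    fp: int
    fn: int


def _ratio(num, den):
    # Assumed convention: an undefined ratio (zero denominator) is reported as 0.0.
    return num / den if den else 0.0


def f_beta(precision, recall, beta):
    # General F-beta score; beta=1 gives the f1-score, beta=2 the f2-score.
    b2 = beta * beta
    return _ratio((1 + b2) * precision * recall, b2 * precision + recall)


def classification_metrics(c):
    """Count-based metrics from Table 2 for one confusion matrix."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.tn + c.fn + c.fp),
        "precision": precision,                   # PPV
        "recall": recall,                         # sensitivity, TPR
        "specificity": _ratio(c.tn, c.tn + c.fp), # TNR
        "fpr": _ratio(c.fp, c.tn + c.fp),
        "fnr": _ratio(c.fn, c.tp + c.fn),
        "f1": f_beta(precision, recall, 1),
        "f2": f_beta(precision, recall, 2),
    }


def mask_overlap(target, prediction):
    """IoU and AAC for two masks given as sets of (row, col) foreground pixels."""
    intersection = len(target & prediction)
    return {
        "iou": _ratio(intersection, len(target | prediction)),
        "aac": _ratio(intersection, len(target)),
    }


# Hypothetical counts, for illustration only.
example = classification_metrics(Confusion(tp=8, tn=5, fp=2, fn=1))
```

With these made-up counts, `example["accuracy"]` is 13/16 and `example["precision"]` is 0.8. In practice the masks would more likely be image arrays, so the set representation here is only a convenience for the sketch.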
Table 3 Summary of studies focused on artificial intelligence applications for automatic polyp detection, classification, and segmentation
Study | Screening test | Imaging modality | Data type | AI-based algorithm | Contribution | Acc | Sen | Spe |
Wimmer et al[46] | Colonoscopy | WL, NBI | Images | k-nearest neighbours | Polyp classification: non-neoplastic, neoplastic | 80% | - | - |
Tajbakhsh et al[22] | Colonoscopy | WL | Images | Decision trees; Random forest | Automatic polyp detection | - | 88% | - |
Hu et al[21] | CT Colonography | Greyscale | Images | Random forest | Polyp classification: non-neoplastic, neoplastic | - | - | - |
Zhang et al[50] | Colonoscopy | WL, NBI | Images | CNN: Caffenet | Polyp detection and classification: benign from malignant | 86% | 88% | - |
Shin et al[23] | Colonoscopy | WL | Images | Support vector machine | Whole image classification: polyps from non-polyps | 96% | 96% | 96% |
Sánchez-González et al[32] | Colonoscopy | WL | Images | Random forest; CNN: Bayesnet | Polyp segmentation | 97% | 76% | 99% |
Tan et al[52] | CT Colonography | Greyscale | Images | Customized CNN | Polyp classification: adenoma from adenocarcinoma | 87% | 90% | 71% |
Fonolla et al[51] | Colonoscopy | WL, NBI, LCI | Images | CNN: EfficientNet | Polyp classification: benign from pre-malignant | 95% | 96% | 93% |
Hwang et al[46] | Colonoscopy | WL | Images | Customized CNN | Polyp detection and segmentation | - | - | - |
Park et al[53] | Colonoscopy | WL | Images | Customized CNN | Whole image classification: normal, adenoma and adenocarcinoma | 94% | ~94% | - |
Viscaino et al[54] | Colonoscopy | Greyscale | Images | Support vector machine; Decision trees; k-nearest neighbours; Random forest | Whole image classification: polyp and non-polyp | 97% | 98% | 96% |
Table 4 Summary of publicly available colonoscopy datasets
Dataset | Year | Description | Data type | Ground truth |
CVC-ColonDB[29,58] | 2012 | 380 sequential WL images from 15 videos | Images (574 × 500 pixels) | Binary mask to locate the polyp |
CVC-PolypHD[58,59] | 2012 | 56 WL images | Images (1920 × 1080 pixels) | Binary mask to locate the polyp |
ETIS-Larib[55] | 2014 | 196 WL images from 34 video sequences (44 different polyps) | Images (1125 × 966 pixels) | Binary mask to locate the polyp |
CVC-ClinicDB[37] | 2015 | 612 sequential WL images from 31 video sequences (31 different polyps) | Images (388 × 284 pixels) | Binary mask to locate the polyp |
ASU-Mayo[22] | 2016 | 38 short video sequences (NBI, WL) | Video (SD and HD video) | Binary mask for 20 training videos |
Colonoscopic dataset[49] | 2016 | 76 short video sequences (NBI, WL) | Video | Labels: hyperplastic, adenoma and serrated |
Kvasir-SEG[60] | 2017 | 1000 images with polyps | Images | Binary mask to locate the polyp |
CVC-ClinicVideoDB[61] | 2017 | 18 sequences | Video (SD video) | Binary mask to locate the polyp |
CP-CHILD-A, CP-CHILD-B[62] | 2020 | 10000 images | Images (256 × 256 pixels) | Labels: polyp and non-polyp |
- Citation: Viscaino M, Torres Bustos J, Muñoz P, Auat Cheein C, Cheein FA. Artificial intelligence for the early detection of colorectal cancer: A comprehensive review of its advantages and misconceptions. World J Gastroenterol 2021; 27(38): 6399-6414
- URL: https://www.wjgnet.com/1007-9327/full/v27/i38/6399.htm
- DOI: https://dx.doi.org/10.3748/wjg.v27.i38.6399