Copyright © The Author(s) 2021.
Artif Intell Gastroenterol. Dec 28, 2021; 2(6): 141-156
Published online Dec 28, 2021. doi: 10.35712/aig.v2.i6.141
Table 1 Strengths and weaknesses of machine learning methods in development of artificial intelligence models for gastrointestinal pathology
AI model | Advantages | Disadvantages |
Traditional ML (supervised) | Allows users to produce a data output from the previously labeled training set | Labeling big data can be time-consuming and challenging |
Traditional ML (supervised) | Users can incorporate domain knowledge into the features | Accuracy depends heavily on the quality of feature extraction |
Traditional ML (unsupervised) | Users do not need to label data or supervise the model | Input data is unknown and not labeled by users |
Traditional ML (unsupervised) | Can detect patterns automatically | Users cannot get precise information regarding data sorting |
Traditional ML (unsupervised) | Saves time | Results can be challenging to interpret |
CNN | Detects the important information and features without labeling | A large training data set is required |
CNN | High performance in image recognition | Lack of interpretability (black box) |
FCN | Provides computational speed | Requires large amounts of labeled data for training |
FCN | Automatically eliminates the background noise | High labeling cost |
RNN | Can decide which information to remember from its past experience | Harder to train the model |
RNN | A deep learning model for sequential data | High computational cost |
MIL | Does not require detailed annotation | A large amount of training data is required |
MIL | Can be applied to large data sets | High computational cost |
GAN | Generates new realistic data resembling the original data | Harder to train the model |
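To illustrate why MIL does not require detailed annotation, the sketch below aggregates patch-level scores into a single slide-level prediction, so only a slide-level label is needed for training or evaluation. It assumes max pooling as the aggregation rule, which is one common choice and not necessarily the one used in the cited studies; the function names and the patch scores are hypothetical.

```python
import numpy as np

def slide_score_max_pool(patch_scores):
    """Aggregate patch-level tumor probabilities into one slide-level score.

    Under the standard MIL assumption, a slide (bag) is positive if at least
    one of its patches (instances) is positive, so the maximum is used.
    """
    scores = np.asarray(patch_scores, dtype=float)
    if scores.size == 0:
        raise ValueError("a slide must contain at least one patch")
    return float(scores.max())

def slide_label(patch_scores, threshold=0.5):
    """Return 1 (positive slide) or 0 (negative slide)."""
    return int(slide_score_max_pool(patch_scores) >= threshold)

# Hypothetical patch probabilities produced by some patch classifier
benign_slide = [0.02, 0.10, 0.07]
tumor_slide = [0.03, 0.91, 0.12]
```

No patch in `tumor_slide` had to be labeled individually; the slide-level diagnosis alone supervises the model.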
Table 2 Artificial intelligence-based applications in gastric cancer
Ref. | Task | No. of cases/data set | Method | Performance |
Duraipandian et al[89] | Classification | 700 slides | GastricNet | Accuracy (100%) |
Cosatto et al[72] | Classification | > 12000 WSIs | MIL | AUC (0.96) |
Sharma et al[31] | Classification | 454 cases | CNN | Accuracy (69%) |
Qu et al[90] | Classification | 9720 images | DL | AUCs (up to 0.97) |
Yoshida et al[32] | Classification | 3062 gastric biopsies | ML | Overall concordance rate (55.6%) |
León et al[91] | Classification | 40 images | CNN | Accuracy (up to 89.7%) |
Liang et al[92] | Classification | 1900 images | DL | Accuracy (91.1%) |
Sun et al[93] | Classification | 500 images | DL | Accuracy (91.6%) |
Tomita et al[94] | Classification | 502 images¹ | Attention-based DL | Accuracy (83%) |
Wang et al[95] | Classification | 608 images | Recalibrated multi-instance-DL | Accuracy (86.5%) |
Iizuka et al[33] | Classification | 1746 biopsy WSIs | CNN, RNN | Accuracy (95.6%), AUCs (up to 0.98) |
Bollschweiler et al[41] | Prognosis | 135 cases | ANN | Accuracy (93%) |
Hensler et al[42] | Prognosis | 4302 cases | QUEEN technique | Accuracy (72.73%) |
Jagric et al[43] | Prognosis | 213 cases | Learning vector quantization NN | Sensitivity (71%), specificity (96.1%) |
Lu et al[36] | Prognosis | 939 cases | MMHG | Accuracy (69.28%) |
Jiang et al[37] | Prognosis | 786 cases | SVM classifier | AUCs (up to 0.83) |
Liu et al[40] | Prognosis | 432 tissue samples | SVM classifier | Accuracy (up to 94.19%) |
Korhani Kangi and Bahrampour[38] | Prognosis | 339 cases | ANN, BNN | Sensitivity (88.2% for ANN, 90.3% for BNN); specificity (95.4% for ANN, 90.9% for BNN) |
Zhang et al[39] | Prognosis | 669 cases | ML | AUCs (up to 0.831) |
García et al[44] | Tumor infiltrating lymphocytes | 3257 images | CNN | Accuracy (96.9%) |
Kather et al[56] | Genetic alterations | 1147 cases² | Deep residual learning | AUC (0.81 for gastric cancer) |
Kather et al[47] | Genetic alterations | > 1000 cases³ | NN | AUC (up to 0.8) |
Fu et al[57] | Genetic alterations | > 1000 cases⁴ | NN | Variable across tumors/gene alterations. Strongest relations in whole genome duplications |
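The accuracy, sensitivity, and specificity values reported in these tables all derive from a confusion matrix of model predictions against the reference diagnosis. The following is a minimal sketch of those standard definitions; the function name and the counts are hypothetical and do not come from any of the cited studies.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity, and specificity from confusion counts.

    tp/fn: diseased cases predicted positive/negative
    tn/fp: non-diseased cases predicted negative/positive
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one diseased and one non-diseased case")
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # all correct calls / all cases
        "sensitivity": tp / (tp + fn),     # true positive rate
        "specificity": tn / (tn + fp),     # true negative rate
    }

# Hypothetical example: 50 cancer and 100 benign cases
metrics = diagnostic_metrics(tp=45, fp=10, tn=90, fn=5)
```

Because accuracy depends on the ratio of diseased to non-diseased cases in the data set, accuracy figures from studies with different case mixes are not directly comparable.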
Table 3 Artificial intelligence-based applications in colorectal cancer
Ref. | Task | No. of cases/data set | Method | Performance |
Xu et al[96] | Classification | 717 patches (N, ADC subtypes) | AlexNet | Accuracy (97.5%) |
Awan et al[97] | Classification | 454 cases (N, ADC grades LG vs HG) | NN | Accuracy (97% for 2-class; 91% for 3-class) |
Haj-Hassan et al[98] | Classification | 30 multispectral image patches (N, AD, ADC) | CNN | Accuracy (99.2%) |
Kainz et al[99] | Classification | 165 images (benign vs malignant) | CNN (LeNet-5) | Accuracy (95%-98%) |
Korbar et al[34] | Classification | 697 cases (N, AD subtypes) | ResNet | Accuracy (93.0%) |
Yoshida et al[100] | Classification | 1328 colorectal biopsy WSIs | ML | Accuracy (90.1% for adenoma) |
Wei et al[35] | Classification | 326 slides (training), 25 slides (validation), 157 slides (internal set) | ResNet | 157 slides: Accuracy 93.5% vs 91.4% (pathologists); 238 slides: Accuracy 87.0% vs 86.6% (pathologists) |
Ponzio et al[101] | Classification | 27 WSIs (13500 patches) (N, AD, ADC) | VGG16 | Accuracy (96%) |
Kather et al[47] | Classification | 94 WSIs¹ | ResNet18 | AUC (> 0.99) |
Yoon et al[102] | Classification | 57 WSIs (10280 patches) | VGG | Accuracy (93.5%) |
Iizuka et al[33] | Classification | 4036 WSIs (N, AD, ADC) | CNN/RNN | AUCs (0.96, ADC; 0.99, AD) |
Sena et al[103] | Classification | 393 WSIs (12565 patches) (N, HP, AD, ADC) | CNN | Accuracy (80%) |
Bychkov et al[45] | Prognosis | 420 cases | RNN | HR of 2.3, AUC (0.69) |
Kather et al[46] | Prognosis | 1296 WSIs | VGG19 | Accuracy (94%-99%) |
Kather et al[46] | Prognosis | 934 cases | DL (comp. 5 networks) | HR for overall survival of 1.63-1.99 |
Geessink et al[104] | Prognosis | 129 cases | NN | HR of 2.04 for disease free survival |
Skrede et al[105] | Prognosis | 2022 cases | Neural networks with MIL | HR 3.04 |
Kather et al[47] | Genetic alterations | TCGA-DX (93408 patches)¹; TCGA-KR (60894 patches) | ResNet18 | AUC (0.77, TCGA-DX); AUC (0.84, TCGA-KR) |
Echle et al[55] | Genetic alterations | 8836 cases (MSI) | ShuffleNet DL | AUC (0.92-0.96 in two cohorts) |
Kather et al[47] | Tumor microenvironment analysis | 86 WSIs (100000)¹ | VGG19 | Accuracy (94%-99%) |
Shapcott et al[48] | Tumor microenvironment analysis | 853 patches and 142 TCGA images | CNN with a grid-based attention network | Accuracy (65%-84% in two sets) |
Swiderska-Chadaj et al[49] | Tumor microenvironment analysis | 28 WSIs | FCN/LSM/U-Net | Sensitivity (74.0%) |
Alom et al[106] | Tumor microenvironment analysis | 21135 patches | DCRN/R2U-Net | Accuracy (91.9%) |
Sirinukunwattana et al[107] | Molecular subtypes | 1206 cases | NN with domain-adversarial learning | AUC (0.84-0.95 in the two validation sets) |
Weis et al[50] | Tumor budding | 401 cases | CNN | Correlation R (0.86) |
Table 4 Summary of challenges and suggested solutions in development process of artificial intelligence applications
Process | Challenges | Suggested solutions |
Ethical considerations | Lack of patients’ approval for commercial use | Approval for both research and product development |
Design of AI models | Underestimation of end-users’ needs | Collaboration with stakeholders |
Optimization of data-sets | CNN: Large numbers of images needed | Augmentation techniques, transfer learning |
Optimization of data-sets | Rare tumors: Limited number of images | Global data sharing |
Optimization of data-sets | Variations in preanalytical and analytical phases | AI algorithms to standardize staining, color properties, and WSI quality |
Annotation of data-sets | Interobserver variations in diagnosis | MIL algorithms |
Annotation of data-sets | Discrepancies among performances for trained algorithms | |
Validation | Presence of ground truth without objectivity | Multicenter evaluations that include many pathologists and data-sets |
Regulation | Lack of current regulatory guidance specific for AI tools | New guidelines and regulations for safer and more effective AI tools |
Implementation | Changes in work-flow | Selection of AI applications that will speed up the work-flow |
Implementation | IT infrastructure investment | Augmented microscopy directed to the cloud network service |
Implementation | The relative inexperience of pathologists | Training about AI, integration of AI in medical education |
Implementation | AI applications that lack interpretability (black box) | Construction of interpretable models, generation of attention heat maps |
Implementation | Lack of external quality assurance | A scheme for this purpose should be designed |
Implementation | Legal implications | The performance of AI algorithms should be assured for reporting |
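As an illustration of the augmentation techniques suggested for limited image data, the sketch below expands one image patch into its eight flip and rotation variants. It assumes that tissue orientation carries no diagnostic meaning, so the label is unchanged by these transforms; the function name and the toy patch are hypothetical, and real pipelines typically add color and scale perturbations as well.

```python
import numpy as np

def augment_patch(patch):
    """Return the 8 rotation/flip variants of an H x W (x C) image patch."""
    variants = []
    for k in range(4):
        rotated = np.rot90(patch, k)          # rotate by k * 90 degrees in the image plane
        variants.append(rotated)
        variants.append(np.fliplr(rotated))   # mirror each rotation left-right
    return variants

# Hypothetical 2 x 2 RGB patch with distinct pixel values
patch = np.arange(12).reshape(2, 2, 3)
variants = augment_patch(patch)
```

Each training patch then contributes eight samples that share one label, which helps when only a limited number of images is available.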
- Citation: Alpsoy A, Yavuz A, Elpek GO. Artificial intelligence in pathological evaluation of gastrointestinal cancers. Artif Intell Gastroenterol 2021; 2(6): 141-156
- URL: https://www.wjgnet.com/2644-3236/full/v2/i6/141.htm
- DOI: https://dx.doi.org/10.35712/aig.v2.i6.141