Copyright
©The Author(s) 2022.
Artif Intell Gastroenterol. Dec 28, 2022; 3(5): 142-162
Published online Dec 28, 2022. doi: 10.35712/aig.v3.i5.142
AI models | Strengths | Weaknesses |
ML, Traditional, Supervised | Data output can be produced from the previously labeled training set | Labeling big data takes a considerable amount of time and can be challenging |
ML, Traditional, Supervised | Allows users to reflect domain knowledge features | Feature extraction quality significantly affects the accuracy |
ML, Traditional, Unsupervised | Users do not supervise the model or label any data | Input data is unknown and not labeled |
ML, Traditional, Unsupervised | Patterns are detected automatically | Precise information related to data sorting is not provided |
ML, Traditional, Unsupervised | Saves time | Interpretation is challenging |
SVM | Suitable for more efficient regression and classification analysis with high-dimensional data | Not suitable for large data sets; Requires more time for training; Low performance in overlapping classes |
CNN | No labeling is required for important information and features | Lack of interpretability due to its black-box nature |
CNN | The performance capacity in image recognition is high | |
FCN | Provides computational speed | A large amount of labeled data for training is required |
FCN | The background noise is automatically eliminated | The labeling cost is high |
RNN | Able to decide which information to remember from past experiences | The model is hard to train |
RNN | A suitable deep learning model for sequential data | The computational cost is high |
MIL | A detailed annotation is not required | A large amount of training data is required |
MIL | Suitable to be performed on large datasets | The computational cost is high |
GAN | The potential to produce new realistic data that resembles the original data | The model is hard to train |
Ref. | Task | Data sets | Algorithm/Model | Performance | Comments |
Xu et al[55] | NL/ADC/MC/SC/PC/CCTA | 717 patches | AlexNet | Accuracy: 97% | The model provides the classifications of tumor subtypes |
Korbar et al[56] | NL/HP/SSP/TSA/TA/TVA-VA | Training set: 458 WSIs; Test set: 239 WSIs | ResNet | F1 Score: 88.8%; Accuracy: 93%; Precision: 89.7%; Recall: 88.3% | The model may reduce the workload of pathologists in the assessment of colorectal polyps |
Haj-Hassan et al[57] | NL/AD/ADC | 30 patients, Multispectral image patches | CNN | Accuracy: 99.2% | CNN allows the classification of CRC tissue types using pre-segmented regions of interest |
Ponzio et al[58] | NL/AD/ADC | 27 WSIs | VGG16 | Accuracy: 96% | TL considerably outperforms the CNN fully trained on CRC samples on the same test dataset |
Sena et al[59] | NL/HP/AD/ADC | 393 images | CNN | Accuracy: 80% | DL may provide a valuable tool to assist pathologists in the histological classification of CR tumors |
Iizuka et al[60] | NL/AD/ADC | 4036 WSIs + 500 WSIs | CNN/RNN | AUCs: 0.96-0.99 | Integrating DL models into the pathology workflow would be highly beneficial for easing the workload of pathologists |
Wei et al[61] | NL/TA/TVA/VA/HP | 1182 WSIs | ResNet | Accuracy: 93.5% (Internal test set); Accuracy: 87% (External test set) | This model may assist pathologists by improving the accuracy of CRC screening |
Awan et al[62] | NL/Low GR/High GR | 139 images | CNN | Accuracy: 97% (two-class), 91% (three-class) | The model provides the classifications of tumor subtypes based on the shape of glands |
Sirinukunwattana et al[97] | Prediction of MSTs | 510 WSIs (FOCUS), 431 WSIs (TCGA), 265 WSIs (GRAMPIAN cohort) | Inception V3 | AUCs: 0.9 (FOCUS); 0.94 (TCGA); 0.85 (GRAMPIAN cohort) | RNA expression classifiers can be predicted from H-E stained images, opening the door to cheap and reliable biological stratification within routine workflows |
Echle et al[98] | MSI vs MSS | 6406 WSIs (Training); 771 WSIs (External validation) | ShuffleNet | AUC: 0.92 (Training); AUC: 0.96 (External validation) | The model provides a low-cost evaluation of MSI without molecular testing |
Kather et al[80] | MSI vs MSS | 60894 patches (TCGA-CRC-KR); 93408 patches (TCGA-CRC-DX) | ResNet18 | AUC: 0.84 (TCGA-CRC-KR); AUC: 0.77 (TCGA-CRC-DX) | This method may lead to improvements in molecular subtype screening workload in pathology |
Kather et al[77] | Prediction of molecular Als | 426 patients (TCGA-CRC); 379 patients (DACHS) | ShuffleNet | AUROC: 0.76 | The algorithm predicts a wide range of molecular alterations from routine, H-E stained slides |
Kruger et al[99] | Prediction of MSTs | 919 WSIs | ResNet 34 | AUCs: Mean: 0.87; CMS1: 0.85; CMS2: 0.92, CMS3: 0.85; CMS4: 0.86 | The MIL framework can identify morphological features indicative of different molecular subtypes |
Popovici et al[100] | Prediction of MSTs | 300 WSIs | VGG-F | Accuracy: 0.84; Recall: 0.85; Precision: 0.84 | The image-based classifier shows a significant prognostic value similar to the molecular counterparts |
Cao et al[101] | MSI vs MSS | 429 patients (TCGA-COAD); 785 patients (Asian-CRC) | EPLA | AUC: 0.88 (TCGA-COAD); AUC: 0.85 (Asian-CRC) | This pathomics-based model provides MSI estimation directly from images without molecular testing |
Bilal et al[102] | Prediction of molecular Als | 502 slides (TCGA-CRC-DX); 47 slides (PAIP) | ResNet18, ResNet34, HoVerNet | AUROCs: HM (0.81 vs 0.71); MSI (0.86 vs 0.74); CIN (0.83 vs 0.73); BRAFmut (0.79 vs 0.66); TP53mut (0·vs 0.64); KRASmut (0.60); CIMP (0.79) | This algorithm is based on non-annotated images and uses only slide-level labels to predict the status of CRC pathways and mutations |
Kwak et al[110] | LNM prediction | 164 patients | CNN, U-Net | AUROC: 67% | PTS score is a potential prognostic parameter for LNM in CRC |
Pai et al[111] | LNM prediction | 230 patients (training); 136 (testing) | CNN | AUROC: 79% | The model can identify and quantify a broad spectrum of histological features, including LNM in CRC |
Kiehl et al[112] | LNM prediction | 3013 patients | ResNet18 | AUROC: 74.1% | DL-based analysis may help predict the LNM of patients with CRC using routine H-E stained slides |
Weis et al[120] | Tumor Budding (Pan-CK) | 381 patients | CNN | Spatial clusters of tumor buds correlate with N status (P: 0.003) | The model is a feasible and valid assessment tool for tumor budding on WSIs and can predict prognosis |
Kather et al[121] | ADI, DEB, LYM, MUC, SM | 86 slides (Training), 25 slides (Testing); 862 slides (TCGA-COAD) | VGG19 | AUC: 98.7%; HR: 2.29 (OS); 1.92 (RFS); Deep stroma score HR: 1.99 (P: 0.002), Shorter OS | This model can assess the human TME and predict prognosis directly from histopathological images |
Shapcott et al[122] | TME (EC/IC/FC/MC) | 853 patches, 142 images (TCGA-COAD) | CNN | Accuracy: 76% (detection), 65% (classification) | The model provides the assessment of TME in CRC slides |
Sirinukunwattana et al[123] | a-4 tissue classes; b-prediction of DM | 102 cases | Spatially Constrained CNN | a-AUROC: 90.4-99.9%; b-AUROC: 58.6-63.8% | The algorithm provides a digital marker for estimating the risk of DM |
Swiderska-Chadaj et al[124] | TME Detection of ICs | 28 WSIs | FCN/LSM/U-Net | F1 score: 0.80; Sensitivity: 74%; Precision: 86% | DL approaches are reliable for automatically detecting lymphocytes in IHC-stained CRC tissue sections |
Geessink et al[115] | TSR | 129 slides | CNN | HR: 2.48 (DSS); 2.05 (DFS) | CNN defined TSR as an independent prognosticator |
Zhao et al[125] | TSR | 499 patients (Discovery cohort); 315 patients (Validation cohort) | CNN | TSR is an independent prognostic parameter. HRs: 2.48 (Discovery cohort); 2.08 (Validation cohort) | CNN allows objective evaluation of TSR |
Zhao et al[126] | Mucus tumor ratio low vs mucus tumor ratio high | 814 patients | CNN | HRs: 1.88 (Discovery cohort); 2.09 (Validation cohort) | The DL quantified mucus tumor ratio is an independent prognostic factor in CRC |
Bychkov et al[132] | Prognosis LR vs HR | 420 TMA | VGG-16 | HR: 2.3 | The model extracts more prognostic information from the tissue morphology than the experienced human observer |
Skrede et al[133] | Prognosis (CSS) | 1122 patients (Validation cohort) | DoMorev1 | HRs: 1.89 (uncertain vs good); 3.84 (poor vs good) | The digital marker has the potential to identify patients at LR and HR and to support the selection of treatment |
Jiang et al[134] | a-HRR vs LRR; b-Poor vs good prognosis | 101 patients (Training); 67 patients (Validation); 47 (TCGA-COAD) | InceptionResNetV2 | a-HRs: 8.98 (training); 10.69 (other 2 test groups); b-HRs: 10.687 (training); 5.03 (other 2 test groups) | The selected model offers an independent prognostic predictor that allows stratification of stage III CRC into risk groups |
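Several Performance entries in the table above report accuracy, precision, recall, and F1 score. As a reading aid, the following minimal sketch shows how these four metrics relate to the counts of a binary confusion matrix. The function name `classification_metrics` and the example counts are illustrative assumptions and are not taken from any of the cited studies.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from binary confusion counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives. Ratios with an empty
    denominator are reported as 0.0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Note that accuracy depends on the class balance of the test set, which is one reason the studies above also report precision, recall, or AUC.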
Ref. | Task | Data sets | Algorithm/Model | Performance | Comments |
Yasuda et al[66] | NC, GR1, GR2, GR3; PDL-1, ATF7IP/MCAF1 | 66 WSIs | SV, ML, wndchrm | AUCs: 0.98-0.99 | The model allows grading emphasizing a correlation between molecular expression and tissue structures |
Kanavati et al[67] | NC, ADC-D, ADC-O | 1-stage training: 1950 WSIs, 2-stage training: 874 WSIs | CNN and RNN | AUCs: 0.95-0.99 | The tool can aid pathologists by potentially accelerating their diagnostic workflow |
Fu et al[68] | NC, TC, MC, PC | Training 2938 WSIs, Testing 980 WSIs | StoHisNet | Accuracy: 94.69%; F1 score: 94.96%; Recall: 94.95%; Precision: 94.97% | The model has high performance in the multi-classification of gastric images and shows strong generalization ability on other pathological datasets |
Su et al[69] | NC, WD, PD, MSS vs MSI | GR: Training 348 WSIs, Testing 88 WSIs; MSS: Training 212 WSIs, Testing 52 WSIs; MSI: Training 136 WSIs, Testing 36 WSIs | ResNet-18 | PD vs WD, F1 score: 0.8615; PD vs WD vs NC, F1 score: 0.8977; MSI vs MSS, accuracy: 0.7727 | The proposed system integrated the tumor GR and MSI status recognition problems into the same workflow and was suitable for exploring the relationships between pathological features and molecular status |
Muti et al[79] | MSI vs MSS; EBV (+) vs EBV (-) | 2823 patients with known MSI status; 2685 patients with known EBV status | CNN, ShuffleNet | MSI vs MSS, AUROCs: 0.723-0.863; EBV (+) vs EBV (-), AUROCs: 0.672-0.859 | DL-based classifiers have the potential to provide faster decisions for pathologists and to offer therapeutic options tailored to the molecular profile of the individual patient |
Kather et al[80] | MSI vs MSS | Training 81 patients + 216 patients (TCGA-STAD) | ResNet-18 | AUC: 0.84 | This system provides significant improvements in molecular alterations screening workflow |
Kather et al[81] | EBV (+) vs EBV (-) | Training 317 patients (TCGA-STAD) | CNN, VGG19 | AUC: 0.80 | This workflow enables a fast and low-cost method to identify EBV and enables pathologists to check the plausibility of computer-based image classification (the black box of DL) |
Hinata et al[82] | EBV+MSI/dMMR vs EBV- non MSI/dMMR | UTokyo training cohort: 326 patients; TCGA training cohort: 48 patients | CNNs, VGG16, VGG19, ResNet50, EfficientNetB0 | AUCs: 0.901–0.992 (UTokyo cohort); AUCs: 0.809–0.931 (TCGA cohort) | The model detects immunotherapy-sensitive GC subtypes from histological images at a lower cost and in a shorter time than the conventional methods |
Zheng et al[83] | EBV (+) vs EBV (-) | EBV (+) 203 WSIs; EBV (-) 803 WSIs | EBVNet | AUROC: 0.969 (Internal validation); AUROC: 0.941 (External dataset); AUROC: 0.895 (TCGA dataset) | The human-machine fusion significantly improves the diagnostic performance of both the EBVNet and the pathologist, provides an approach for the identification of EBV(+) GC, and may help effectively select patients for immunotherapy |
Flinner et al[87] | EBV, MSI, GS, CIN | Training 84 WSIs (TCGA-STAD); Testing: 133 WSIs (TCGA-STAD) | CNN, DenseNet161 | AUC: 0.76 for four classes | The simplified molecular TCGA and GC subclasses could be predicted by DL directly based on H-E staining |
Jang et al[88] | CDH1, ERBB2, KRAS, PIK3CA, TP53 mutations | 425 FF slides (TCGA-STAD); 320 FT slides (TCGA-STAD) | CNN, Inception-v3 | AUCs (FF-FT): CDH1 (0.667-0.778); ERBB2 (0.63-0.833); KRAS (0.657-0.838); PIK3CA (0.688-0.761); TP53 (0.572-0.775) | When trained with appropriate tissue data, DL could predict genetic mutations in H-E-stained tissue slides |
Huang et al[109] | Metastatic LNs | 983 WSIs | ESCNN | AUC: 0.9936 | ESCNN improves the accuracy of pathologists in identifying metastatic LNs, micrometastases, and isolated tumor cells, allowing for shortening the review time |
Hu et al[107] | Metastatic LNs | 222 patients | RCNN, Xception and DenseNet-121 | Accuracy 97.13%; PPV: 93.53, NPV: 97.99% | The system can be implemented into clinical workflow to assist pathologists in preliminary screening for LN metastases in GC patients |
Matsushima et al[108] | Metastatic LNs | 827 lymph nodes | CNN | AUROC: 0.9994 | This DL-based diagnosis-aid system can assist pathologists in detecting LN metastasis in GC and reduce their workload |
Wang et al[106] | Metastatic LNs, T/LNM | 9366 slides (7736 with metastasis) | ResNet-50 | LNM (+) vs (-): Sensitivity 98.5%, Specificity 96.1%; T/LNM: HR: 2.05 (univariate analysis); 1.39 (multivariate analysis) | This system can assist pathologists in detecting LN metastasis in GC and reduce their workload. In addition, T/LNM is prognostic of OS in GC patients |
Hong et al[116] | dTSR (HE and CK7) | Training 13 WSIs; Testing 358 WSIs | cGAN | Kappa value: 0.623 (dTSR and vTSR); AUROC: 0.907; OS (P: 0.0024) | By diagnosing TSR in GC, this model predicts OS in the advanced stage of GC |
Meier et al[127] | TME + Ki-67 | 248 patients | CNN | HRs: Ki67&CD20: 1.364, CD20&CD68: 1.338; Ki67&CD68: 1.473 | In combination with a panel of IHC markers, this model predicts the prognosis of patients with GC |
Huang et al[128] | OS | Training: 2261 pictures; Internal validation: 960 pictures | GastroMIL | HR: 2.414 (univariate analysis), 1.843 (multivariate analysis) | The risk score computed by MIL-GC proved to be an independent prognostic factor in GC |
Jiang et al[129] | 5-YS, 5-YDFS | 786 patients | ML, SVM | AUCs: 5-YS: 0.834; 5-YDFS: 0.828 | The classifier can accurately distinguish GC patients with different OS and DFS and identifies a subgroup of patients with stage II and III disease who could benefit from adjuvant chemotherapy |
Jiang et al[130] | Low SVM vs High SVM, 5-YS, 5-YDFS | Training: 223 patients; Internal validation: 218 patients; External validation: 227 patients | ML, SVM | AUCs: 5-YS: 0.818; 5-YDFS: 0.827 | The SVM signature distinguishes GC patients with different OS and DFS and identifies a subgroup of patients with stage II and III disease who could benefit from adjuvant chemotherapy |
Wang et al[131] | TME | 172 patients | CGSignature powered by AI | AUROCs: 0.960 ± 0.01 (binary classification), 0.771 ± 0.024 to 0.904 ± 0.012 (ternary classification) | Digital grade cancer staging produced by CGSignature predicts the prognosis of GC and significantly outperforms the AJCC 8th edition Tumor Node Metastasis staging system |
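Many Performance entries in the tables above are given as AUC or AUROC. One common reading of this value is the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case, with ties counted as one half. The sketch below computes AUROC directly from that definition; the function name `auroc` and the example labels and scores are hypothetical and do not come from any cited study.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: iterable of 0/1 ground-truth values (1 = positive).
    scores: iterable of model scores, higher meaning "more positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
value = auroc(labels, scores)
print(f"AUROC: {value:.3f}")
```

This quadratic-time form is meant for clarity; an AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two classes.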
- Citation: Yavuz A, Alpsoy A, Gedik EO, Celik MY, Bassorgun CI, Unal B, Elpek GO. Artificial intelligence applications in predicting the behavior of gastrointestinal cancers in pathology. Artif Intell Gastroenterol 2022; 3(5): 142-162
- URL: https://www.wjgnet.com/2644-3236/full/v3/i5/142.htm
- DOI: https://dx.doi.org/10.35712/aig.v3.i5.142