Review
Copyright ©The Author(s) 2022.
Artif Intell Gastroenterol. Dec 28, 2022; 3(5): 142-162
Published online Dec 28, 2022. doi: 10.35712/aig.v3.i5.142
Table 1 General features of machine learning methods in the development of artificial intelligence models in gastrointestinal pathology
| AI models | Strengths | Weaknesses |
| --- | --- | --- |
| ML, Traditional, Supervised | Data output can be produced from the previously labeled training set; Allows users to reflect domain knowledge features | Labeling big data takes a considerable amount of time and can be challenging; Feature extraction quality significantly affects the accuracy |
| ML, Traditional, Unsupervised | Users do not supervise the model or label any data; Patterns are detected automatically; Saves time | Input data is unknown and not labeled; Precise information related to data sorting is not provided; Interpretation is challenging |
| SVM | Suitable for more efficient regression and classification analysis with high-dimensional data | Not suitable for large data sets; Requires more time for training; Low performance in overlapping classes |
| CNN | No labeling is required for important information and features; The performance capacity in image recognition is high | Lack of interpretability due to black boxes |
| FCN | Provides computational speed; The background noise is automatically eliminated | A large amount of labeled data for training is required; The labeling cost is high |
| RNN | Able to decide which information to remember from past experiences; A suitable deep learning model for sequential data | The model is hard to train; The computational cost is high |
| MIL | A detailed annotation is not required; Suitable to be performed on large datasets | A large amount of training data is required; The computational cost is high |
| GAN | The potential to produce new realistic data that resembles the original data | The model is hard to train |
Table 2 AI-based applications in pathology for the determination of tumor behavior in colorectal carcinomas
| Ref. | Task | Data sets | Algorithm/Model | Performance | Comments |
| --- | --- | --- | --- | --- | --- |
| Xu et al[55] | NL/ADC/MC/SC/PC/CCTA | 717 patches | AlexNet | Accuracy: 97% | The model provides the classifications of tumor subtypes |
| Korbar et al[56] | NL/HP/SSP/TSA/TA/TVA-VA | Training set: 458 WSIs; Test set: 239 WSIs | ResNet | F1 score: 88.8%; Accuracy: 93%; Precision: 89.7%; Recall: 88.3% | The model may reduce the workload of pathologists in the assessment of colorectal polyps |
| Haj-Hassan et al[57] | NL/AD/ADC | 30 patients, multispectral image patches | CNN | Accuracy: 99.2% | CNN allows the classification of CRC tissue types using pre-segmented regions of interest |
| Ponzio et al[58] | NL/AD/ADC | 27 WSIs | VGG16 | Accuracy: 96% | TL considerably outperforms the CNN fully trained on CRC samples on the same test dataset |
| Sena et al[59] | NL/HP/AD/ADC | 393 images | CNN | Accuracy: 80% | DL may provide a valuable tool to assist pathologists in the histological classification of CR tumors |
| Iizuka et al[60] | NL/AD/ADC | 4036 WSIs + 500 WSIs | CNN/RNN | AUCs: 0.96-0.99 | Integrating DL models in pathology workflow would be of high benefit for easing the workload of pathologists |
| Wei et al[61] | NL/TA/TVA/VA/HP | 1182 WSIs | ResNet | Accuracy: 93.5% (Internal test set); Accuracy: 87% (External test set) | This model may assist pathologists by improving the accuracy of CRC screening |
| Awan et al[62] | NL/Low GR/High GR | 139 images | CNN | Accuracy: 97% (two-class), 91% (three-class) | The model provides the classifications of tumor subtypes based on the shape of glands |
| Sirinukunwattana et al[97] | Prediction of MSTs | 510 WSIs (FOCUS), 431 WSIs (TCGA), 265 WSIs (GRAMPIAN cohort) | Inception V3 | AUCs: 0.9 (FOCUS); 0.94 (TCGA); 0.85 (GRAMPIAN cohort) | RNA expression classifiers can be predicted from H-E stained images, opening the door to cheap and reliable biological stratification within routine workflows |
| Echle et al[98] | MSI vs MSS | 6406 WSIs (Training); 771 WSIs (External validation) | ShuffleNet | AUC: 0.92 (Training); AUC: 0.96 (External validation) | The model provides a low-cost evaluation of MSI without molecular testing |
| Kather et al[80] | MSI vs MSS | 60894 patches (TCGA-CRC-KR); 93408 patches (TCGA-CRC-DX) | ResNet18 | AUC: 0.84 (TCGA-CRC-KR); AUC: 0.77 (TCGA-CRC-DX) | This method may lead to improvements in molecular subtype screening workload in pathology |
| Kather et al[77] | Prediction of molecular Als | 426 patients (TCGA-CRC); 379 patients (DACHS) | ShuffleNet | AUROC: 0.76 | The algorithm predicts a wide range of molecular alterations from routine, H-E stained slides |
| Kruger et al[99] | Prediction of MSTs | 919 WSIs | ResNet 34 | AUCs: Mean: 0.87; CMS1: 0.85; CMS2: 0.92; CMS3: 0.85; CMS4: 0.86 | The MIL framework can identify morphological features indicative of different molecular subtypes |
| Popovici et al[100] | Prediction of MSTs | 300 WSIs | VGG-F | Accuracy: 0.84; Recall: 0.85; Precision: 0.84 | The image-based classifier shows a significant prognostic value similar to the molecular counterparts |
| Cao et al[101] | MSI vs MSS | 429 patients (TCGA-COAD); 785 patients (Asian-CRC) | EPLA | AUC: 0.88 (TCGA-COAD); AUC: 0.85 (Asian-CRC) | This pathomics-based model provides MSI estimation directly from images without molecular testing |
| Bilal et al[102] | Prediction of molecular Als | 502 slides (TCGA-CRC-DX); 47 slides (PAIP) | ResNet18, ResNet34, HoVerNet | AUROCs: HM (0.81 vs 0.71); MSI (0.86 vs 0.74); CIN (0.83 vs 0.73); BRAFmut (0.79 vs 0.66); TP53mut (0·vs 0.64); KRASmut (0.60); CIMP (0.79) | This algorithm is based on non-annotated images and uses only slide-level labels to predict the status of CRC pathways and mutations |
| Kwak et al[110] | LNM prediction | 164 patients | CNN, U-Net | AUROC: 67% | PTS score is a potential prognostic parameter for LNM in CRC |
| Pai et al[111] | LNM prediction | 230 patients (training); 136 (testing) | CNN | AUROC: 79% | The model identifies and quantifies a broad spectrum of histological features, including LNM in CRC |
| Kiehl et al[112] | LNM prediction | 3013 patients | ResNet18 | AUROC: 74.1% | DL-based analysis may help predict the LNM of patients with CRC using routine H-E-stained slides |
| Weis et al[120] | Tumor budding (Pan-CK) | 381 patients | CNN | Spatial clusters of tumor buds correlate with N status (P: 0.003) | The model is a feasible and valid assessment tool for tumor budding on WSIs and can predict prognosis |
| Kather et al[121] | ADI, DEB, LYM, MUC, SM | 86 slides (Training), 25 slides (Testing); 862 slides (TCGA-COAD) | VGG19 | AUC: 98.7%; HR: 2.29 (OS), 1.92 (RFS); Deep stroma score HR: 1.99 (P: 0.002), shorter OS | This model can assess the human TME and predict prognosis directly from histopathological images |
| Shapcott et al[122] | TME (EC/IC/FC/MC) | 853 patches, 142 images (TCGA-COAD) | CNN | Accuracy: 76% (detection), 65% (classification) | The model provides the assessment of TME in CRC slides |
| Sirinukunwattana et al[123] | a: 4 tissue classes; b: prediction of DM | 102 cases | Spatially Constrained CNN | a: AUROC 90.4-99.9%; b: AUROC 58.6-63.8% | The algorithm provides a digital marker for estimating the risk of DM |
| Swiderska-Chadaj et al[124] | TME, detection of ICs | 28 WSIs | FCN/LSM/U-Net | F1 score: 0.80; Sensitivity: 74%; Precision: 86% | DL approaches are reliable for automatically detecting lymphocytes in IHC-stained CRC tissue sections |
| Geessink et al[115] | TSR | 129 slides | CNN | HR: 2.48 (DSS); 2.05 (DFS) | CNN defined TSR as an independent prognosticator |
| Zhao et al[125] | TSR | 499 patients (Discovery cohort); 315 patients (Validation cohort) | CNN | TSR is an independent prognostic parameter. HRs: 2.48 (Discovery cohort); 2.08 (Validation cohort) | CNN allows objective evaluation of TSR |
| Zhao et al[126] | Mucus tumor ratio low vs mucus tumor ratio high | 814 patients | CNN | HRs: 1.88 (Discovery cohort); 2.09 (Validation cohort) | The DL-quantified mucus tumor ratio is an independent prognostic factor in CRC |
| Bychkov et al[132] | Prognosis LR vs HR | 420 TMA | VGG-16 | HR: 2.3 | The model extracts more prognostic information from the tissue morphology than the experienced human observer |
| Skrede et al[133] | Prognosis (CSS) | 1122 patients (Validation cohort) | DoMorev1 | HRs: 1.89 (uncertain vs good); 3.84 (poor vs good) | The digital marker has the potential to identify patients at LR and HR and to guide the selection of treatment |
| Jiang et al[134] | a: HRR vs LRR; b: Poor vs good prognosis | 101 patients (Training); 67 patients (Validation); 47 (TCGA-COAD) | InceptionResNetV2 | a: HRs 8.98 (training); 10.69 (other 2 test groups); b: HRs 10.687 (training); 5.03 (other 2 test groups) | The selected model offers an independent prognostic predictor which allows stratification of stage III CRC into risk groups |
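
Several rows of Table 2 report precision, recall (sensitivity), and F1 score side by side. As a reading aid, the minimal sketch below shows how the F1 score follows from the other two as their harmonic mean, using the precision (86%) and sensitivity (74%) listed for Swiderska-Chadaj et al[124]. The function name `f1_score` is chosen here for illustration and is not taken from any of the cited studies. Individual studies may average these metrics across classes, so a reported F1 score need not match this single-pair calculation exactly.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # Avoid division by zero when both metrics are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values listed for Swiderska-Chadaj et al[124] in Table 2
# (sensitivity is another name for recall).
f1 = f1_score(precision=0.86, recall=0.74)
print(f"{f1:.2f}")  # prints 0.80
```

Because the harmonic mean is pulled toward the lower of the two inputs, a model with high precision but modest sensitivity scores closer to its sensitivity than a simple average would suggest.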
Table 3 Artificial intelligence-based applications in pathology for the determination of tumor behavior in gastric cancer
| Ref. | Task | Data sets | Algorithm/Model | Performance | Comments |
| --- | --- | --- | --- | --- | --- |
| Yasuda et al[66] | NC, GR1, GR2, GR3; PDL-1, ATF7IP/MCAF1 | 66 WSIs | SV, ML, wndchrm | AUCs: 0.98-0.99 | The model allows grading, emphasizing a correlation between molecular expression and tissue structures |
| Kanavati et al[67] | NC, ADC-D, ADC-O | 1-stage training: 1950 WSIs; 2-stage training: 874 WSIs | CNN and RNN | AUCs: 0.95-0.99 | The tool can aid pathologists by potentially accelerating their diagnostic workflow |
| Fu et al[68] | NC, TC, MC, PC | Training: 2938 WSIs; Testing: 980 WSIs | StoHisNet | Accuracy: 94.69%; F1 score: 94.96%; Recall: 94.95%; Precision: 94.97% | The model has high performance in the multi-classification on gastric images and shows strong generalization ability on other pathological datasets |
| Su et al[69] | NC, WD, PD, MSS vs MSI | GR: Training 348 WSIs, Testing 88 WSIs; MSS: Training 212 WSIs, Testing 52 WSIs; MSI: Training 136 WSIs, Testing 36 WSIs | ResNet-18 | PD vs WD, F1 score: 0.8615; PD vs WD vs NC, F1 score: 0.8977; MSI vs MSS, accuracy: 0.7727 | The proposed system integrated the tumor GR and MSI status recognition problems into the same workflow and was suitable for exploring the relationships between pathological features and molecular status |
| Muti et al[79] | MSI vs MSS; EBV (+) vs EBV (-) | 2823 patients with known MSI status; 2685 patients with known EBV status | CNN, ShuffleNet | MSI vs MSS, AUROCs: 0.723-0.863; EBV (+) vs EBV (-), AUROCs: 0.672-0.859 | DL-based classifiers have the potential to provide faster decisions for pathologists and to offer therapeutic options tailored to the molecular profile of the individual patient |
| Kather et al[80] | MSI vs MSS | Training: 81 patients + 216 patients (TCGA-STAD) | ResNet-18 | AUC: 0.84 | This system provides significant improvements in molecular alterations screening workflow |
| Kather et al[81] | EBV (+) vs EBV (-) | Training: 317 patients (TCGA-STAD) | CNN, VGG19 | AUC: 0.80 | This workflow enables a fast and low-cost method to identify EBV and enables pathologists to check the plausibility of computer-based image classification (the black box of DL) |
| Hinata et al[82] | EBV+MSI/dMMR vs EBV- non MSI/dMMR | UTokyo training cohort: 326 patients; TCGA training cohort: 48 patients | CNNs, VGG16, VGG19, ResNet50, EfficientNetB0 | AUCs: 0.901-0.992 (UTokyo cohort); AUCs: 0.809-0.931 (TCGA cohort) | The model detects immunotherapy-sensitive GC subtypes from histological images at a lower cost and in a shorter time than the conventional methods |
| Zheng et al[83] | EBV (+) vs EBV (-) | EBV (+): 203 WSIs; EBV (-): 803 WSIs | EBVNet | AUROC: 0.969 (Internal validation); AUROC: 0.941 (External dataset); AUROC: 0.895 (TCGA dataset) | The human-machine fusion significantly improves the diagnostic performance of both the EBVNet and the pathologist, provides an approach for the identification of EBV(+) GC, and may help effectively select patients for immunotherapy |
| Flinner et al[87] | EBV, MSI, GS, CIN | Training: 84 WSIs (TCGA-STAD); Testing: 133 WSIs (TCGA-STAD) | CNN, DenseNet161 | AUC: 0.76 for four classes | The simplified molecular TCGA and GC subclasses could be predicted by DL directly based on H-E staining |
| Jang et al[88] | CDH1, ERBB2, KRAS, PIK3CA, TP53 mutations | 425 FF slides (TCGA-STAD); 320 FT slides (TCGA-STAD) | CNN, Inception-v3 | AUCs (FF-FT): CDH1 (0.667-0.778); ERBB2 (0.63-0.833); KRAS (0.657-0.838); PIK3CA (0.688-0.761); TP53 (0.572-0.775) | When trained with appropriate tissue data, DL could predict genetic mutations in H-E-stained tissue slides |
| Huang et al[109] | Metastatic LNs | 983 WSIs | ESCNN | AUC: 0.9936 | ESCNN improves the accuracy of pathologists in identifying metastatic LNs, micrometastases, and isolated tumor cells, allowing for shortening the review time |
| Hu et al[107] | Metastatic LNs | 222 patients | RCNN, Xception and DenseNet-121 | Accuracy: 97.13%; PPV: 93.53%; NPV: 97.99% | The system can be implemented into clinical workflow to assist pathologists in preliminary screening for LN metastases in GC patients |
| Matsushima et al[108] | Metastatic LNs | 827 lymph nodes | CNN | AUROC: 0.9994 | This DL-based diagnosis-aid system can assist pathologists in detecting LN metastasis in GC and reduce their workload |
| Wang et al[106] | Metastatic LNs, T/LNM | 9366 slides (7736 with metastasis) | ResNet-50 | LNM (+) vs (-): Sensitivity 98.5%, Specificity 96.1%; T/LNM: HR: 2.05 (univariate analysis); 1.39 (multivariate analysis) | This system can assist pathologists in detecting LN metastasis in GC and reduce their workload. In addition, T/LNM is prognostic of OS in GC patients |
| Hong et al[116] | dTSR (HE and CK7) | Training: 13 WSIs; Testing: 358 WSIs | cGAN | Kappa value: 0.623 (dTSR and vTSR); AUROC: 0.907; OS (P: 0.0024) | By diagnosing TSR in GC, this model predicts OS in the advanced stage of GC |
| Meier et al[127] | TME + Ki-67 | 248 patients | CNN | HRs: Ki67&CD20: 1.364; CD20&CD68: 1.338; Ki67&CD68: 1.473 | In combination with a panel of IHC markers, this model predicts the prognosis of patients with GC |
| Huang et al[128] | OS | Training: 2261 pictures; Internal validation: 960 pictures | GastroMIL | HR: 2.414 (univariate analysis); 1.843 (multivariate analysis) | The risk score computed by MIL-GC proved to be an independent prognostic factor in GC |
| Jiang et al[129] | 5-YS, 5-YDFS | 786 patients | ML, SVM | AUCs: 5-YS: 0.834; 5-YDFS: 0.828 | The classifier accurately distinguishes GC patients with different OS and DFS and identifies a subgroup of patients with stage II and III disease who could benefit from adjuvant chemotherapy |
| Jiang et al[130] | Low SVM vs High SVM, 5-YS, 5-YDFS | Training: 223 patients; Internal validation: 218 patients; External validation: 227 patients | ML, SVM | AUCs: 5-YS: 0.818; 5-YDFS: 0.827 | The SVM signature distinguishes GC patients with different OS and DFS and identifies a subgroup of patients with stage II and III disease who could benefit from adjuvant chemotherapy |
| Wang et al[131] | TME | 172 patients | CGSignature powered by AI | AUROCs: 0.960 ± 0.01 (binary classification); 0.771 ± 0.024 to 0.904 ± 0.012 (ternary classification) | Digital grade cancer staging produced by CGSignature predicts the prognosis of GC and significantly outperforms the AJCC 8th edition Tumor Node Metastasis staging system |