Artificial intelligence applications in inflammatory bowel disease: Emerging technologies and future directions

doi:10.3748/wjg.v27.i17.1920

Journal Information of This Article

Publication Name: World Journal of Gastroenterology
ISSN: 1007-9327
Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Minireviews

World J Gastroenterol. May 7, 2021; 27(17): 1920-1935
Published online May 7, 2021. doi: 10.3748/wjg.v27.i17.1920

Table 1 Artificial intelligence in diagnosis and risk prediction of inflammatory bowel disease

Ref.	AI classifier vs comparator	IBD type	Study design and sample size	Modality	Outcome	Study results/validation cohort
Mossotto et al[18], 2017	Support vector machines (SVM) vs linear discriminant	Peds CD/UC	Prospective cohort, 287 IBD patients	Endoscopic and histologic inflammation	Diagnosis of IBD	Diagnostic accuracy of 82.7% with an AUC of 0.87 in diagnosing Crohn's disease or ulcerative colitis. Validation cohort included
Wei et al[19], 2013	SVM with gradient boosted trees (GBT) vs simple log odds method	CD/UC	Cross-sectional, 30000 IBD patients, 22000 healthy controls	Genetics, ImmunoChip	Risk of IBD	The SVM demonstrated comparable performance (AUC 0.862 and 0.826 for CD and UC, respectively), whereas GBT showed inferior performance (AUC 0.802 and 0.782 for CD and UC, respectively). Validation cohort included
Romagnoni et al[20], 2019	Artificial neural networks (ANNs) vs penalized logistic regression (LR), and GBT	CD	Cross-sectional, 18227 CD patients, 34050 healthy controls	Genetics, ImmunoChip	Risk of IBD	Using single nucleotide polymorphisms (SNPs), final predictive model achieved AUC of 0.80. Validation cohort included
Isakov et al[21], 2017	Random forest (RF), SVM (with svmPoly), extreme gradient boosting vs elastic net regularized generalized linear model (glmnet)	CD/UC	Cross-sectional, 180 CD patients, 149 UC patients, 90 healthy controls	Expression data (microarray and RNA-seq)	Risk of IBD	The method was used to classify a list of 16390 genes. Each gene received a score that was used to prioritize it according to its predicted association with IBD. The combined model demonstrated AUC, sensitivity, specificity, and accuracy values of 0.829, 0.577, 0.88, and 0.808, respectively. Validation cohort included
Yuan et al[22], 2017	Sequential minimal optimization vs DisGeNET (Version 4.0)	CD/UC	Cross-sectional, 59 CD patients, 26 UC patients, 42 healthy controls	Gene Expression datasets	Risk of IBD	By analyzing the gene expression profiles using minimum redundancy maximum relevance and incremental feature selection, 21 genes were obtained that could effectively distinguish IBD samples from non-IBD samples. Highest total prediction accuracy was 97.64% using the 1170th feature set. Validation cohort included
Hübenthal et al[23], 2015	SVM vs RF	CD/UC	Cross-sectional, 40 CD patients, 36 UC patients, 38 healthy controls	MicroRNAs	Diagnosis of IBD	Measured by the AUC, the corresponding median holdout-validated accuracy was estimated as ranging from 0.75 to 1.00 and 0.89 to 0.98, respectively. In combination, the corresponding models provide tools for the distinction of CD and UC as well as CD, UC and healthy control with expected classification error rates of 3.1% and 3.3%, respectively. Validation cohort included
Tong et al[24], 2020	RF vs convolutional neural network (CNN)	CD/UC	Retrospective cohort, 875 CD patients, 5128 UC patients	Colonoscopy Endoscopic Images	Diagnosis of IBD	RF sensitivities/specificities of UC/CD were 0.89/0.84, 0.83/0.82, and 0.72/0.77, respectively, while the values for the CNN for CD were 0.90/0.77. The precisions/recalls of UC-CD when employing RF were 0.97/0.97, 0.65/0.53, respectively, and when employing the CNN were 0.99/0.97 and 0.87/0.83, respectively. Validation cohort included
Smolander et al[25], 2019	Deep belief networks (DBNs) vs SVM	CD/UC	Cross-sectional, 59 CD patients, 26 UC patients, 42 healthy controls	Gene Expression datasets	Diagnosis of IBD	Using DBN only, accuracy for diagnosis of UC was 97.06% and CD was 97.07%. Using both DBN and SVM, accuracy for diagnosis of UC was 97.06% and CD was 97.03%. Validation cohort included
Abbas et al[26], 2019	RF vs network-based biomarker discovery	Peds CD/UC	Cross-sectional, 657 IBD patients, 316 healthy controls	Large dataset of new-onset pediatric IBD metagenomics biopsy samples	Diagnosis of IBD	For the diagnosis of IBD, highest AUC attained by top Random Forest classifiers was 0.77. No validation cohort included
Khorasani et al[27], 2020	SVM vs recently developed feature selection algorithm (robustness-performance tradeoff, RPT)	UC	Cross-sectional, 146 UC patients, 60 healthy controls	Gene Expression dataset	Diagnosis of IBD	The model detected all active cases and had an average precision of 0.62 in the inactive cases. Validation cohort included
Rubin et al[28], 2019	CITRUS supervised machine learning algorithm. No comparator	CD/UC	Cross-sectional, 68 IBD patients	Peripheral blood mononuclear cells and intestinal biopsies mass cytometry	Diagnosis of IBD	An 8-parameter immune signature distinguished Crohn's disease from ulcerative colitis with an AUC = 0.845 (95%CI: 0.742-0.948). No validation cohort included
Pal et al[29], 2017	Naïve Bayes with a consensus machine learning method vs Critical Assessment of Genome Interpretation (CAGI) 4 method	CD	Cross-sectional, 64 CD patients, 47 healthy controls	Genotypes from Exome Sequencing Data	Risk of IBD	The AUC for predicting risk of Crohn's disease using the SNP model was 0.72. No validation cohort included
Aoki et al[30], 2019	Deep CNN. No comparator	CD	Retrospective Cohort, 115 IBD patients	Wireless capsule endoscopy images	Diagnosis of IBD	The AUC for the detection of erosions and ulcerations was 0.958 (95%CI: 0.947-0.968). The sensitivity, specificity, and accuracy of the CNN were 88.2% (95%CI: 84.8-91.0), 90.9% (95%CI: 90.3-91.4), and 90.8% (95%CI: 90.2-91.3), respectively. Validation cohort included
Bielecki et al[31], 2012	SVM vs human reader (pathologist)	CD/UC	Cross-sectional, 14 CD patients, 13 UC patients, 11 healthy controls	Raman spectroscopic imaging of epithelium cells	Diagnosis of IBD	Raman maps of human colon tissue sections were analyzed using chemometric approaches. Using SVM, it was possible to distinguish between healthy control patients, patients with Crohn's disease, and patients with ulcerative colitis with an accuracy of 98.90%. No validation cohort included
Cui et al[32], 2013	Recursive SVM vs unsupervised learning strategy	CD/UC	Cross-sectional, 124 IBD patients, 99 healthy controls	16S rRNA gene analysis	Diagnosis of IBD	Selection level of 200 features results in the best leave-one-out cross-validation result (accuracy = 88%, sensitivity = 92%, specificity = 84%). Validation cohort included
Duttagupta et al[33], 2012	SVM. No comparator	UC	Cross-sectional, 20 UC patients, 20 healthy controls	MicroRNAs	Diagnosis of IBD	SVM classifier measurements revealed a predictive score of 92.8% accuracy, 96.2% specificity and 89.5% sensitivity in distinguishing ulcerative colitis patients from normal individuals. Validation cohort included
Daneshjou et al[34], 2017	Naïve Bayes, neural networks, random forests vs CAGI methods	CD	Cross-sectional, 64 CD patients, 47 healthy controls	Exome Sequencing	Diagnosis of IBD	In CAGI4, 111 exomes were derived from a mix of 64 Crohn’s disease patients and 47 healthy controls. Top performing methods had an AUC of 0.87. Validation cohort included
Geurts et al[35], 2005	RF vs SVM	CD/UC	Prospective cohort, 30 CD patients, 30 CD patients	Proteomic Mass Spectrometry	Diagnosis of IBD	The random forest model to diagnose IBD had a sensitivity of 81.67% and a specificity of 81.17%. The support vector machine model to diagnose IBD had a sensitivity of 87.92% and a specificity of 87.87%. Validation cohort included
Li et al[36], 2020	RF vs ANN	UC	Cross-sectional, 193 UC patients, 21 healthy controls	Gene Expression Profiles	Diagnosis of IBD	The random forest algorithm was used to identify 1 downregulated and 29 upregulated differentially expressed genes contributing most to ulcerative colitis occurrence. An ANN was developed to calculate the weights of the differentially expressed genes for ulcerative colitis. Prediction results agreed with those of an independent data set (AUC = 0.9506/PR-AUC = 0.9747). Validation cohort included
Wingfield et al[37], 2019	RF vs SVM	CD	Cross-sectional, 668 CD patients	Metagenomic Data	Diagnosis of IBD	Highest RPT measure for Crohn’s disease was random forest 0.60 and SVM 0.58. For ulcerative colitis, RPT was random forest 0.70 and SVM 0.48. Validation cohort included
Han et al[38], 2018	RF vs LR, CORG	CD/UC	Cross-sectional, 24 CD patients, 59 UC patients, 76 healthy controls	Gene Expression Profiles	Diagnosis of IBD	The gene-based feature sets had median AUC on the validation sets ranging from 0.6 to 0.76. Validation cohort included
Wang et al[39], 2019	AVADx (Analysis of Variation for Association with Disease) vs two GWAS-based CD evaluation methods	CD	Cross-sectional, 64 CD patients, 47 healthy controls	Whole Exome or Genome Sequencing Data	Diagnosis of IBD	AVADx highlighted known CD genes including NOD2 and new potential CD genes. AVADx identified 16% (at strict cutoff) of CD patients at 99% precision and 58% of the patients (at default cutoff) with 82% precision in over 3000 individuals from separately sequenced panels. Validation cohort included

AI: Artificial intelligence; IBD: Inflammatory bowel disease; CD: Crohn’s disease; UC: Ulcerative colitis; AUC: Area under the curve.
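
Most rows in Table 1 summarize performance as an AUC. As a minimal sketch of what that number measures, the block below computes the AUC as the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control (the Mann-Whitney formulation). The helper name `auc_from_scores` and the scores and labels are hypothetical illustrations, not data or code from any study cited in the table.

```python
def auc_from_scores(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: classifier outputs (higher = more likely disease)
    labels: 1 for cases (e.g. IBD), 0 for controls
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0   # case ranked above control
            elif c == h:
                wins += 0.5   # ties count as half
    return wins / (len(cases) * len(controls))


# Illustrative (made-up) scores: three cases, three controls.
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.2]
labels = [1, 1, 1, 0, 0, 0]
example_auc = auc_from_scores(scores, labels)
print(round(example_auc, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of cases from controls, which is the scale on which the values in the table should be read.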

Table 2 Artificial Intelligence in assessment of disease severity in inflammatory bowel disease

Ref.	AI classifier vs comparator	IBD type	Study design and sample size	Modality	Outcomes	Study results/validation cohort
Kumar et al[40], 2012	Support vector machines (SVM) vs human observers	CD	Cross-sectional, 50000 images (number of patients not given)	Small bowel capsule endoscopy	Endoscopic Inflammation	Database of 47 studies including 50000 capsule endoscopy images evaluating severity of small bowel lesions. Method had good precision (> 90% for lesion detection) and recall (> 90%) for lesions of varying severity. Validation cohort included
Biasci et al[41], 2019	Logistic regression with an adaptive Elastic-Net penalty. No comparator	CD/UC	Prospective cohort, 118 IBD patients	Transcriptomics from purified CD8 T cells and/or whole blood	Disease severity, medication escalation	A 17-gene qPCR-based classifier stratified patients into two distinct subgroups. IBDhi patients experienced significantly more aggressive disease than IBDlo patients (analogous to IBD2), with earlier need for treatment escalation [HR 2.65 (CD), 3.12 (UC)] and more escalations over time [for multiple escalations within 18 months: sensitivity=72.7% (CD), 100% (UC); negative predictive value = 90.9% (CD), 100% (UC)]. Validation cohort included
Waljee et al[42], 2019	RF. No comparator	CD	Post-hoc analysis of prospective clinical trials, 401 CD patients	Clinical and laboratory data from publicly available clinical trials (UNITI-1, UNITI-2, and IM-UNITI)	Crohn's disease remission, C-reactive protein < 5 mg/L	A prediction model using the week-6 albumin to C-reactive protein ratio had an AUC of 0.76 [95% confidence interval (CI): 0.71-0.82]. Validation cohort included
Mahapatra et al[43], 2016	RF. No comparator	CD	Cross-sectional, 35 CD patients	Abdominal magnetic resonance imaging	Segmentation of diseased colon (intestinal inflammation)	Model segmentation accuracy ranged from 82.7% to 92.2%. Validation cohort included
Reddy et al[44], 2019	Gradient boosting machines vs logistic regression	CD	Retrospective, 3335 CD patients	Electronic medical record	Severity of intestinal inflammation (by C-reactive protein)	Machine-learning-based analytic methods such as gradient boosting machines can predict the inflammation severity with a very high accuracy (AUC) = 92.82%. Validation cohort included
Douglas et al[45], 2018	RF. No comparator	Peds CD	Cross-sectional, 20 CD patients, 20 healthy controls	Shotgun metagenomics (MGS), 16S rRNA gene sequencing	Disease State (Relapse/Remission)	MGS modules significantly classified samples by disease state (accuracy = 68.4%, P = 0.043 and accuracy = 65.8%, P = 0.03, respectively), 16S datasets had a maximum accuracy of 68.4% and P = 0.016 based on strain level for disease state. Validation cohort included
Maeda et al[46], 2019	SVM vs human reader	UC	Retrospective cohort, 187 UC patients	Endocytoscopy	Histologic inflammation	Computer aided diagnosis (CAD) of histologic inflammation provided diagnostic sensitivity, specificity, and accuracy as follows: 74% (95%CI: 65-81), 97% (95%CI: 95-99), and 91% (95%CI: 83-95), respectively. Its reproducibility was perfect (κ = 1). Validation cohort included
Charisis et al[47], 2016	SVM vs human reader	CD	Retrospective cohort, 13 CD patients	Wireless capsule endoscopy (WCE) images	Endoscopic Inflammation	Compared with related methods, the hybrid adaptive filtering with differential lacunarity analysis (HAF-DLac) via SVM approach gave better results for automated lesion detection in WCE images, with up to 93.8% accuracy, 95.2% sensitivity, 92.4% specificity, and 92.6% precision. Validation cohort included
Klang et al[48], 2020	Convolutional neural network (CNN) vs human reader	CD	Retrospective cohort, 49 CD patients	WCE images	Endoscopic Inflammation	Dataset included 17640 CE images from 49 patients: 7391 images with mucosal ulcers and 10249 images of normal mucosa. For randomly split images results, AUC was 0.99 with accuracies ranging from 95.4% to 96.7%. For individual patient-level experiments, the AUCs were 0.94-0.99. Validation cohort included
Ungaro et al[49], 2021	Random survival forest. No comparator	Peds CD	Retrospective case-control, 265 peds CD patients	Protein biomarkers using a proximity extension assay (Olink Proteomics)	Penetrating and stricturing complications	A model with 5 protein markers predicted penetrating complications with an AUC of 0.79 (95%CI: 0.76-0.82) compared to 0.69 (95%CI: 0.66-0.72) for serologies and 0.74 (95%CI: 0.71-0.77) for clinical variables. A model with 4 protein markers predicted stricturing complications with an AUC of 0.68 (95%CI: 0.65-0.71) compared to 0.62 (95%CI: 0.59-0.65) for serologies and 0.52 (95%CI: 0.50-0.55) for clinical variables. Validation cohort included
Barash et al[50], 2021	Ordinal CNN. No comparator	CD	Retrospective cohort, 49 CD patients	WCE images	Ulcer Severity Grading	The classification accuracy of the algorithm was 0.91 (95%CI: 0.867-0.954) for grade 1 vs grade 3 ulcers, 0.78 (95%CI: 0.716-0.844) for grade 2 vs grade 3, and 0.624 (95%CI: 0.547-0.701) for grade 1 vs grade 2. Validation cohort included
Lamash et al[51], 2019	CNN vs semi-supervised and active learning models	CD	Retrospective cohort, 23 CD patients	Magnetic resonance imaging	Active Crohn’s Disease	CNN exhibited Dice similarity coefficient of 75% ± 18%, 81% ± 8%, and 97% ± 2% for the lumen, wall, and background, respectively. The extracted markers of wall thickness at the location of min radius (P = 0.0013) and the median value of relative contrast enhancement (P = 0.0033) could differentiate active and nonactive disease segments. Other extracted markers could differentiate between segments with strictures and segments without strictures (P < 0.05). Validation cohort included
Takenaka et al[52], 2020	Deep neural networks vs human reader (endoscopist)	UC	Prospective cohort, 2012 UC patients	Colonoscopy images	Endoscopic inflammation	Deep neural network identified patients with endoscopic remission with 90.1% accuracy (95%CI: 89.2-90.9) and a kappa coefficient of 0.798 (95%CI: 0.780-0.814), using findings reported by endoscopists as the reference standard. Validation cohort included
Bossuyt et al[53], 2020	Computer algorithm based on red density (RD) vs blinded central readers	UC	Prospective cohort, 29 UC patients, 6 healthy controls	Colonoscopy Images	Endoscopic and histologic inflammation	In the construction cohort, RD correlated with RHI (r = 0.74, P < 0.0001), Mayo endoscopic subscores (r = 0.76, P < 0.0001) and Endoscopic index of severity scores (r = 0.74, P < 0.0001). The RD sensitivity to change had a standardized effect size of 1.16. In the validation set, RD correlated with RHI (r = 0.65, P = 0.00002). Validation cohort included
Bhambhvani et al[54], 2021	CNN vs human reader (endoscopist)	UC	Retrospective cohort, 777 UC patients	Colonoscopy images	Mayo Endoscopic Scores (MES)	The final model classified MES 3 disease with an AUC of 0.96, MES 2 disease with an AUC of 0.86, and MES 1 disease with an AUC 0.89. Overall accuracy was 77.2%. Across MES 1, 2, and 3, average specificity was 85.7%, average sensitivity was 72.4%, average PPV was 77.7%, and the average NPV was 87.0%. Validation cohort included
Ozawa et al[55], 2019	CNN vs human reader (endoscopist)	UC	Retrospective cohort, 841 UC patients	Colonoscopy images	MES	The CNN-based CAD system showed a high level of performance with AUC of 0.86 and 0.98 to identify Mayo 0 and 0-1, respectively. The performance of the CNN was better for the rectum than for the right side and left side of the colon when identifying Mayo 0 (AUC = 0.92, 0.83, and 0.83, respectively). Validation cohort included
Bossuyt et al[56], 2021	Automated CAD Algorithm vs human reader	UC	Prospective cohort, 48 UC patients	Colonoscopy images with confocal laser endomicroscopy	Histologic Remission	The current automated CAD algorithm detects histologic remission with a high performance (sensitivity of 0.79 and specificity of 0.90) compared with the UCEIS (sensitivity of 0.95 and specificity of 0.69) and MES (sensitivity of 0.98 and specificity of 0.61). No validation cohort included
Stidham et al[57], 2019	CNN vs human reader	UC	Retrospective cohort, 3082 UC patients	Colonoscopy images	Endoscopy severity	The CNN was excellent for distinguishing endoscopic remission from moderate-to-severe disease with an AUC of 0.966 (95%CI: 0.967-0.972); a PPV of 0.87 (95%CI: 0.85-0.88) with a sensitivity of 83.0% (95%CI: 80.8-85.4) and specificity of 96.0% (95%CI: 95.1-97.1); and NPV of 0.94 (95%CI: 0.93-0.95). No validation cohort included
Gottlieb et al[58], 2021	Neural network vs human central reader	UC	Prospective cohort, 249 UC patients	Colonoscopy images	Endoscopy severity	The model's agreement metric was excellent, with a quadratic weighted kappa of 0.844 (95%CI: 0.787-0.901) for endoscopic Mayo Score and 0.855 (95%CI: 0.80-0.91) for UCEIS. No validation cohort included

AI: Artificial intelligence; IBD: Inflammatory bowel disease; CD: Crohn’s disease; UC: Ulcerative colitis; AUC: Area under the curve; NPV: Negative predictive value; PPV: Positive predictive value; qPCR: Quantitative real-time polymerase chain reaction; HR: Hazard ratio.
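
Several rows in Table 2 report sensitivity, specificity, PPV, and NPV together. As a rough sketch of how these relate to one another, the block below derives all of them from a single 2x2 confusion matrix. The function name `diagnostic_metrics` and the counts are hypothetical and chosen only for illustration; they do not come from any study in the table.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic accuracy measures from confusion-matrix counts.

    tp: diseased, test positive     fn: diseased, test negative
    tn: non-diseased, test negative fp: non-diseased, test positive
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("a row or column of the confusion matrix is empty")
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),  # share of diseased correctly detected
        "specificity": tn / (tn + fp),  # share of non-diseased correctly cleared
        "ppv": tp / (tp + fp),          # probability of disease given a positive
        "npv": tn / (tn + fn),          # probability of no disease given a negative
        "accuracy": (tp + tn) / total,
    }


# Illustrative (made-up) counts: 100 diseased and 100 non-diseased subjects.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are properties of the test itself, whereas PPV and NPV also depend on how common the disease is in the cohort, so predictive values from studies with different case mixes are not directly comparable.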

Table 3 Artificial intelligence in prediction of therapy response and clinical outcomes in inflammatory bowel disease

Ref.	AI classifier vs comparator	IBD type	Study design and sample size	Modality	Outcomes	Study results/validation cohort
Waljee et al[59], 2018	Random forest (RF). No comparator	CD/UC	Post-hoc analysis of prospective clinical trial, 594 CD patients	Veteran’s Health Administration Electronic Health Record (EHR)	Outpatient corticosteroids prescribed for IBD and inpatient hospitalizations associated with a diagnosis of IBD	AUC for the RF longitudinal model was 0.85 [95% confidence interval (CI): 0.84–0.85]. AUC for the RF longitudinal model using previous hospitalization or steroid use was 0.87 (95%CI: 0.87-0.88). Validation cohort included
Uttam et al[60], 2019	Support vector machines (SVM) vs nanoscale nuclear architecture mapping (NanoNAM)	CD/UC	Prospective cohort, 103 IBD patients	3-dimensional NanoNAM of normal-appearing rectal biopsies	Colonic neoplasia	NanoNAM detects colonic neoplasia with an AUC of 0.87 ± 0.04, sensitivity of 0.81 ± 0.09, and specificity of 0.82 ± 0.07 in the independent validation set. Validation cohort included
Waljee et al[61], 2017	RF. No comparator	CD/UC	Retrospective cohort, 1080 IBD patients	EHR, lab values	Remission and clinical outcomes with thiopurines	AUC for algorithm-predicted remission in the validation set was 0.79 vs 0.49 for 6-TGN. The mean number of clinical events per year in patients with sustained algorithm-predicted remission (APR) was 1.08 vs 3.95 in those that did not have sustained APR (P < 1 × 10^-5). Validation cohort included
Popa et al[62], 2020	Neural network model. No comparator	UC	Prospective cohort, 55 UC patients	Clinical and biological parameters and the endoscopic Mayo score	Disease activity after one year of anti-TNF treatment	The classifier achieved an excellent performance predicting the disease activity at one year with an accuracy of 90% and AUC 0.92 on the test set and an accuracy of 100% and an AUC of 1 on the validation set. Validation cohort included
Douglas et al[45], 2018	RF. No comparator	Peds CD	Cross-sectional, 20 CD patients, 20 healthy controls	Shotgun metagenomics (MGS), 16S rRNA gene sequencing	Response to induction therapy	16S genera were again the top dataset (accuracy = 77.8%; P = 0.008) for predicting response to therapy. MGS strain (P = 0.029), genus (P = 0.013), and KEGG pathway (P = 0.018) datasets could also classify patients according to therapy response with accuracy = 72.2% for all three. Validation cohort included
Waljee et al[63], 2010	RF vs boosted trees, RuleFit	CD/UC	Cross-sectional, 774 IBD patients	EHR, lab values (thiopurine metabolites)	Response to thiopurine therapy	A RF algorithm using laboratory values and patient age differentiated clinical response from nonresponse in the model validation data set with an AUC of 0.856 (95%CI: 0.793-0.919). Validation cohort included
Menti et al[64], 2016	Naïve Bayes vs Bayesian additive regression trees vs Bayesian networks	CD/UC	Retrospective cohort, 152 CD patients	Genomic DNA, genetic polymorphism	Presence of extra-intestinal manifestations in IBD patients	Bayesian networks achieved accuracy of 82% when considering only clinical factors and 89% when also considering genetic information, outperforming the other techniques. Validation cohort included
Waljee et al[65], 2017	RF vs baseline regression model	CD/UC	Retrospective cohort, 20368 IBD patients	EHR, lab values	Corticosteroid-free biologic remission with vedolizumab	The AUC for corticosteroid-free biologic remission at week 52 using baseline data was only 0.65 (95%CI: 0.53-0.77), but was 0.75 (95%CI: 0.64-0.86) with data through week 6 of vedolizumab. Validation cohort included
Morilla et al[66], 2019	Deep neural networks. No comparator	UC	Retrospective cohort, 47 UC patients	Colonic microRNA profiles	Responses to therapy	A deep neural network-based classifier identified 9 microRNAs plus 5 clinical factors, routinely recorded at time of hospital admission, that were associated with responses of patients to treatment. This panel discriminated responders to steroids from non-responders with 93% accuracy (AUC, 0.91). Three algorithms, based on microRNA levels, identified responders to infliximab vs non-responders (84% accuracy, AUC 0.82) and responders to cyclosporine vs non-responders (80% accuracy, AUC 0.79). Validation cohort included
Wang et al[67], 2020	Back-propagation neural network (BPNN), SVM vs logistic regression	CD	Cross-sectional, 446 CD patients	EHR	Medication nonadherence to maintenance therapy	The average classification accuracy and AUC were 85.9% and 0.912 for BPNN, and 87.7% and 0.930 for SVM, respectively. Validation cohort included
Bottigliengo et al[68], 2019	Bayesian machine learning techniques (BMLTs) vs logistic regression	CD/UC	Retrospective cohort, 142 IBD patients	EHR, genetic polymorphisms	Presence of extra-intestinal manifestations in IBD patients	BMLTs had an AUC of 0.50 for classifying the presence of extra-intestinal manifestations. Validation cohort included
Ghoshal et al[69], 2020	Nonlinear artificial neural network (ANN) vs multivariate linear PCA	UC	Prospective cohort, 263 UC patients	EHR	Responses to therapy	The multilayer perceptron neural network was trained by back-propagation algorithm (10 networks retained out of 16 tested). The classification accuracy rate was 73% in correctly classifying response to medical treatment in UC patients. No validation cohort included
Sofo et al[70], 2020	SVM leave-one-out cross-validation. No comparator	UC	Retrospective cohort, 32 UC patients	EHR	Post-surgical complications after colectomy	Evaluating only preoperative features, machine learning algorithms were able to predict minor postoperative complications with a high strike rate (84.3%), high sensitivity (87.5%) and high specificity (83.3%) during the testing phase. Validation cohort included
Kang et al[71], 2017	ANN vs logistic regression	UC	Cross-sectional, 24 UC patients	Gene expression profiles	Response to anti-TNF	Balanced accuracy in the cross-validation test for predicting response to anti-TNF therapy in ulcerative colitis patients was 82%. Validation cohort included
Babic et al[72], 1997	CART vs back propagation neural network (BPNN)	CD/UC	Cross-sectional, 200 IBD patients	EHR	Quality of life	The best classification accuracy reached did not exceed 80% in any case. Other classifiers, namely K-nearest-neighbor, learning vector quantization and BPNN, confirmed that outcome. Validation cohort included
Dong et al[73], 2019	RF, SVM, ANN vs logistic regression	CD	Retrospective cohort, 239 CD patients	EHR, laboratory tests	Crohn's related surgery	The results revealed that RF predictive model performed better than LR model in terms of accuracy (93.11% vs 91.15%), precision (53.42% vs 44.81%), F1 score (0.6016 vs 0.5763), TN rate (95.08% vs 92.00%), and the AUC (0.8926 vs 0.8809). The AUCs were excellent at 0.9864 in RF, 0.9538 in LR, 0.8809 in DT, 0.9497 in SVM, and 0.9059 in ANN, respectively. Validation cohort included
Lerrigo et al[74], 2019	Latent Dirichlet allocation, unsupervised machine learning algorithm. No comparator	CD/UC	Retrospective cohort, 28623 IBD patients	Online posts from the Crohn’s and Colitis Foundation community forum	Impact of online community forums on well-being and their emotional content	10702 (20.8%) posts were identified expressing: gratitude (40%), anxiety/fear (20.8%), empathy (18.2%), anger/frustration (13.4%), hope (13.2%), happiness (10.0%), sadness/depression (5.8%), shame/guilt (2.5%), and/or loneliness (2.5%). A common subtheme was the importance of fostering social support. No validation cohort included

AI: Artificial intelligence; IBD: Inflammatory bowel disease; CD: Crohn’s disease; UC: Ulcerative colitis; AUC: Area under the curve; TNF: Tumor necrosis factor.

Citation: Gubatan J, Levitte S, Patel A, Balabanis T, Wei MT, Sinha SR. Artificial intelligence applications in inflammatory bowel disease: Emerging technologies and future directions. World J Gastroenterol 2021; 27(17): 1920-1935
URL: https://www.wjgnet.com/1007-9327/full/v27/i17/1920.htm
DOI: https://dx.doi.org/10.3748/wjg.v27.i17.1920