Retrospective Study
Copyright ©The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastroenterol. May 21, 2025; 31(19): 105283
Published online May 21, 2025. doi: 10.3748/wjg.v31.i19.105283
Serum calcium-based interpretable machine learning model for predicting anastomotic leakage after rectal cancer resection: A multi-center study
Bo-Yu Kang, Yi-Huan Qiao, Jun Zhu, Bao-Liang Hu, Ze-Cheng Zhang, Ji-Peng Li, Yan-Jiang Pei
Bo-Yu Kang, Yi-Huan Qiao, Jun Zhu, Ze-Cheng Zhang, Ji-Peng Li, Department of Digestive Surgery, Xijing Hospital of Digestive Diseases, Xi’an 710032, Shaanxi Province, China
Jun Zhu, Department of General Surgery, The Southern Theater Air Force Hospital, Guangzhou 510000, Guangdong Province, China
Bao-Liang Hu, Yan'an Medical College, Yan'an University, Yan’an 716000, Shaanxi Province, China
Ji-Peng Li, Department of Experiment Surgery, Xijing Hospital, Xi’an 710032, Shaanxi Province, China
Yan-Jiang Pei, Department of Digestive Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi’an 710032, Shaanxi Province, China
Co-first authors: Bo-Yu Kang and Yi-Huan Qiao.
Co-corresponding authors: Ji-Peng Li and Yan-Jiang Pei.
Author contributions: Kang BY and Qiao YH contributed equally to this study as co-first authors; Li JP and Pei YJ contributed equally to this study as co-corresponding authors; Kang BY was responsible for study conceptualization and design, data acquisition, analysis, and interpretation, and manuscript drafting, review, and editing; Qiao YH was responsible for study conceptualization and design, data acquisition, analysis, and interpretation, and manuscript review and editing; Zhu J was responsible for data acquisition, analysis, and interpretation, statistical analysis, and manuscript review and editing; Hu BL was responsible for data analysis and interpretation, and statistical analysis; and Li JP and Pei YJ were responsible for manuscript review and editing.
Supported by National Natural Science Foundation of China, No. 82172781; and Shaanxi Health Scientific Research Innovation Team Project, No. 2024TD-06.
Institutional review board statement: The previous study was approved by the ethics committee of the First Affiliated Hospital of Air Force Military Medical University (approval No. KY20212211-N-1).
Informed consent statement: This study was a retrospective, multi-cohort, observational secondary analysis of de-identified data. Therefore, consent and research ethics committee approval were not required.
Conflict-of-interest statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Data sharing statement: The data that support the findings of this study are available from the corresponding author upon reasonable request.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Yan-Jiang Pei, MD, PhD, Professor, Department of Digestive Surgery, Honghui Hospital, Xi'an Jiaotong University, No. 555 Youyi East Road, Beilin District, Xi’an 710032, Shaanxi Province, China. 15829329200@126.com
Received: January 17, 2025
Revised: March 27, 2025
Accepted: April 27, 2025
Published online: May 21, 2025
Processing time: 124 Days and 11.4 Hours
Abstract
BACKGROUND

Despite the promising prospects of utilizing artificial intelligence and machine learning (ML) for comprehensive disease analysis, few models constructed have been applied in clinical practice due to their complexity and the lack of reasonable explanations. In contrast to previous studies with small sample sizes and limited model interpretability, we developed a transparent eXtreme Gradient Boosting (XGBoost)-based model supported by multi-center data, using patients' basic information and clinical indicators to forecast the occurrence of anastomotic leakage (AL) after rectal cancer resection surgery. The model demonstrated robust predictive performance and identified clinically relevant thresholds, which may assist physicians in optimizing perioperative management.

AIM

To develop an interpretable ML model for accurately predicting the probability of AL after rectal cancer resection and to define clinical alert values for serum calcium ions.

METHODS

Data from patients who underwent anterior resection of the rectum for rectal carcinoma between January 2011 and December 2021 at the Department of Digestive Surgery, Xijing Hospital of Digestive Diseases, Air Force Medical University, and at Shaanxi Provincial People's Hospital were retrospectively collected. Ten ML models were integrated to analyze the data and develop the predictive models. Receiver operating characteristic (ROC) curves, calibration curves, decision curve analysis, accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were used to evaluate model performance. We employed the SHapley Additive exPlanations (SHAP) algorithm to explain the feature importance of the optimal model.
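
The threshold-based metrics listed above all derive from the four cells of a confusion matrix. The following is a minimal sketch in Python, not the study's code: the names `binary_metrics` and `BinaryMetrics` and the 0/1 label encoding (1 = AL, 0 = no AL) are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    sensitivity: float
    specificity: float
    ppv: float
    npv: float
    f1: float


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true, y_pred) -> BinaryMetrics:
    """Compute metrics from 0/1 labels (assumed encoding: 1 = AL)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        sensitivity=_ratio(tp, tp + fn),   # true positive rate
        specificity=_ratio(tn, tn + fp),   # true negative rate
        ppv=_ratio(tp, tp + fp),           # positive predictive value
        npv=_ratio(tn, tn + fn),           # negative predictive value
        f1=_ratio(2 * tp, 2 * tp + fp + fn),
    )
```

Because AL is typically much rarer than uneventful healing, accuracy alone can look favorable for a model that rarely predicts AL, which is one reason to report sensitivity, PPV, and F1 alongside it.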

RESULTS

A total of ten features were integrated to construct the predictive model and identify the optimal model. XGBoost was considered the best-performing model, with an area under the ROC curve (AUC) of 0.984 (95% confidence interval: 0.972-0.996) in the test set (accuracy: 0.925; sensitivity: 0.92; specificity: 0.927). Furthermore, the model achieved an AUC of 0.703 in external validation. The interpretable SHAP algorithm revealed that the serum calcium ion level was the crucial factor influencing the predictions of the model.
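
For readers less familiar with the AUC values reported above: the AUC equals the probability that a randomly chosen patient with AL receives a higher predicted score than a randomly chosen patient without AL, with ties counted as one half. The sketch below illustrates that definition in plain Python. The function name `roc_auc` and the label encoding are assumptions for illustration, and this is not the procedure used in the study.

```python
def roc_auc(y_true, scores) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Assumed encoding: 1 = AL, 0 = no AL. Ties count as half a correct pair.
    Quadratic in the number of patients, so intended only for illustration.
    """
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two groups.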

CONCLUSION

Using clinical data, we built a predictive model based on XGBoost, the best-performing of the ten algorithms evaluated. By predicting the occurrence of AL in patients after rectal cancer resection, the model identified the significant role of serum calcium ion levels, providing guidance for clinical practice. The integration of SHAP provides a clear interpretation of the model's predictions.

Keywords: Machine learning; Rectal cancer; Anastomotic leakage; SHapley Additive exPlanations algorithms

Core Tip: Ten machine learning models were established using ten factors and interpreted using the SHapley Additive exPlanations model. Through model evaluation and comparison, we selected the best prediction model and performed external validation in multiple centers. We found for the first time that the perioperative serum calcium ion level plays an important role in the occurrence of anastomotic leakage (AL) after anterior resection of rectal cancer, and we propose a preoperative serum calcium level lower than 2.1 and a postoperative calcium level lower than 2.2 as clinical warning values for the occurrence of AL.
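
The warning values proposed above amount to a simple threshold check. The sketch below is hypothetical: the function and constant names are illustrative, and the inputs are assumed to be in the same unit as the reported cutoffs, which this section does not state. It flags low calcium only and does not reproduce the XGBoost model or estimate AL probability.

```python
# Cutoffs as reported in the Core Tip; the unit is not given in this section.
PREOP_CALCIUM_ALERT = 2.1
POSTOP_CALCIUM_ALERT = 2.2


def calcium_alerts(preop_calcium: float, postop_calcium: float) -> dict:
    """Flag values strictly lower than the proposed warning thresholds."""
    return {
        "preoperative": preop_calcium < PREOP_CALCIUM_ALERT,
        "postoperative": postop_calcium < POSTOP_CALCIUM_ALERT,
    }
```

For example, `calcium_alerts(2.05, 2.3)` flags the preoperative value only.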