Retrospective Study
Copyright ©The Author(s) 2023. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastroenterol. Jun 28, 2023; 29(24): 3855-3870
Published online Jun 28, 2023. doi: 10.3748/wjg.v29.i24.3855
Comparison and development of machine learning for thalidomide-induced peripheral neuropathy prediction of refractory Crohn’s disease in Chinese population
Jing Mao, Kang Chao, Fu-Lin Jiang, Xiao-Ping Ye, Ting Yang, Pan Li, Xia Zhu, Pin-Jin Hu, Bai-Jun Zhou, Min Huang, Xiang Gao, Xue-Ding Wang
Jing Mao, Fu-Lin Jiang, Ting Yang, Pan Li, Bai-Jun Zhou, Min Huang, Xue-Ding Wang, Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
Jing Mao, Ting Yang, Pan Li, Min Huang, Xue-Ding Wang, Guangdong Provincial Key Laboratory of New Drug Design and Evaluation, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
Kang Chao, Xia Zhu, Pin-Jin Hu, Xiang Gao, Department of Gastroenterology, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou 510006, Guangdong Province, China
Xiao-Ping Ye, Department of Pharmacy, Guangdong Women and Children Hospital, Guangzhou 510000, Guangdong Province, China
Author contributions: Mao J, Chao K, Zhu X, and Wang XD designed the study; Mao J and Zhou BJ performed the experiments; Chao K, Gao X, Yang T and Li P enrolled the patients and collected the clinical data; Mao J and Ye XP performed the machine learning analyses; Huang M and Hu PJ supervised the study; Mao J and Wang XD wrote the manuscript.
Supported by National Natural Science Foundation of China, No. 81973398, No. 81730103, No. 81573507 and No. 82020108031; The National Key Research and Development Program, No. 2017YFC0909300 and No. 2016YFC0905001; Guangdong Provincial Key Laboratory of Construction Foundation, No. 2017B030314030 and No. 2020B1212060034; Science and Technology Program of Guangzhou, No. 201607020031; National Engineering and Technology Research Center for New Drug Druggability Evaluation (Seed Program of Guangdong Province), No. 2017B090903004; The 111 Project, No. B16047; China Postdoctoral Science Foundation, No. 2019M66324, No. 2020M683140 and No. 2020M683139; and Natural Science Foundation of Guangdong Province, No. 2022A1515012549 and No. 2023A1515012667.
Institutional review board statement: This study was reviewed and approved by the Ethics Committee of the Sixth Affiliated Hospital (No. E2016022), Sun Yat-Sen University, Guangzhou, China.
Informed consent statement: All study participants or their legal guardian provided informed written consent about personal and medical data collection prior to study enrolment.
Conflict-of-interest statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Data sharing statement: Technical appendix, statistical code, and dataset are available from the corresponding author at wangxd@mail.sysu.edu.cn. Consent was not obtained, but the presented data are anonymized and the risk of identification is low.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Xue-Ding Wang, PharmD, Professor, Teacher, Institute of Clinical Pharmacology, School of Pharmaceutical Sciences, Sun Yat-sen University, No. 132 Waihuan Dong Road, Guangzhou 510006, Guangdong Province, China. wangxd@mail.sysu.edu.cn
Received: April 5, 2023
Peer-review started: April 5, 2023
First decision: April 21, 2023
Revised: May 7, 2023
Accepted: May 23, 2023
Article in press: May 23, 2023
Published online: June 28, 2023
Processing time: 84 Days and 1.2 Hours
Abstract
BACKGROUND

Thalidomide is an effective treatment for refractory Crohn’s disease (CD). However, thalidomide-induced peripheral neuropathy (TiPN), which varies widely between individuals, is a major cause of treatment failure. TiPN is rarely predicted or recognized, especially in CD. A risk model is therefore needed to predict TiPN occurrence.

AIM

To develop and compare machine learning models that predict TiPN from comprehensive clinical and genetic variables.

METHODS

A retrospective cohort of 164 CD patients from January 2016 to June 2022 was used to establish the model. The National Cancer Institute Common Toxicity Criteria Sensory Scale (version 4.0) was used to assess TiPN. With 18 clinical features and 150 genetic variables, five predictive models were established and evaluated by the confusion matrix, area under the receiver operating characteristic curve (AUROC), area under the precision-recall curve (AUPRC), specificity, sensitivity (recall), precision, accuracy, and F1 score.
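As an illustration of how the evaluation measures listed above relate to the confusion matrix, the following is a minimal sketch in plain Python. The labels, scores, and the 0.5 decision threshold are hypothetical toy values, not study data, and the function names are assumptions introduced here; the authors' actual analysis code is not reproduced.

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, FP, TN, FN for binary labels (1 = TiPN, 0 = no TiPN)."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        else:  # truth == 1 and pred == 0
            counts["fn"] += 1
    return counts


def summary_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Derive the threshold-dependent metrics from the four counts.

    Assumes both classes are present and at least one positive prediction
    exists, so that no denominator is zero.
    """
    sensitivity = tp / (tp + fn)          # recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 4 patients with TiPN and 4 without.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

counts = confusion_counts(y_true, y_pred)
metrics = summary_metrics(**counts)
area = auroc(y_true, scores)
print(counts, metrics, area)
```

Specificity, sensitivity, precision, accuracy, and F1 score depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is one reason the two kinds of measure are reported together.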

RESULTS

The five top-ranking risk variables associated with TiPN were interleukin-12 rs1353248 [P = 0.0004, odds ratio (OR): 8.983, 95% confidence interval (CI): 2.497-30.90], dose (mg/d, P = 0.002), brain-derived neurotrophic factor (BDNF) rs2030324 (P = 0.001, OR: 3.164, 95%CI: 1.561-6.434), BDNF rs6265 (P = 0.001, OR: 3.150, 95%CI: 1.546-6.073) and BDNF rs11030104 (P = 0.001, OR: 3.091, 95%CI: 1.525-5.960). In the training set, gradient boosting decision tree (GBDT), extremely randomized trees (ET), random forest (RF), logistic regression and extreme gradient boosting (XGBoost) obtained AUROC values > 0.90 and AUPRC > 0.87. Among these models, XGBoost and GBDT obtained the two highest values for AUROC (0.90 and 1), AUPRC (0.98 and 1), accuracy (0.96 and 0.98), precision (0.90 and 0.95), F1 score (0.95 and 0.98), specificity (0.94 and 0.97), and sensitivity (1). In the validation set, the XGBoost algorithm exhibited the best predictive performance, with the highest specificity (0.857), accuracy (0.818), AUPRC (0.86) and AUROC (0.89). ET and GBDT obtained the highest sensitivity (1) and F1 score (0.8). Overall, compared with other state-of-the-art classifiers such as ET, GBDT and RF, the XGBoost algorithm not only showed more stable performance but also yielded higher AUROC and AUPRC scores, demonstrating its high accuracy in predicting TiPN occurrence.
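For readers unfamiliar with how an odds ratio and its 95%CI are read, the sketch below computes both from a 2×2 table of variant carriers versus non-carriers using the Wald (log odds ratio) interval. This is an assumption made for illustration: the abstract does not state which method produced the reported intervals, and the counts used here are hypothetical, not study data.

```python
import math
from typing import Tuple


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int,
                       z: float = 1.959964) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: carriers with the outcome      b: carriers without the outcome
    c: non-carriers with the outcome  d: non-carriers without the outcome
    z: normal quantile (1.959964 gives a 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: 20 of 30 carriers and 10 of 30 non-carriers affected.
or_value, ci_low, ci_high = odds_ratio_wald_ci(a=20, b=10, c=10, d=20)
print(f"OR = {or_value:.3f}, 95% CI: {ci_low:.3f}-{ci_high:.3f}")
```

An interval that excludes 1, as in the variables reported above, indicates an association at the corresponding significance level. The interval is symmetric around the OR on the log scale, not on the raw scale, which is why the upper bounds lie further from the point estimate than the lower bounds.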

CONCLUSION

The powerful XGBoost algorithm accurately predicts TiPN using 18 clinical features and 14 genetic variables. With the ability to identify high-risk patients using single nucleotide polymorphisms, it offers a feasible option for improving thalidomide efficacy in CD patients.

Keywords: Thalidomide-induced peripheral neuropathy; Refractory Crohn’s disease; Neurotoxicity prediction models; Machine learning; Gene polymorphisms

Core Tip: Thalidomide-induced peripheral neuropathy (TiPN) is a life-threatening condition in Crohn’s disease and has a high incidence in Asia. However, there are no effective medical interventions for TiPN. Here, we established a predictive model using machine learning and identified genes closely related to TiPN occurrence. We found that the extreme gradient boosting algorithm can sensitively identify patients who are prone to TiPN, which can help doctors adjust thalidomide therapy.