Published online Jul 6, 2019. doi: 10.12998/wjcc.v7.i13.1611
Peer-review started: March 28, 2019
First decision: May 15, 2019
Revised: May 15, 2019
Accepted: May 16, 2019
Article in press: May 17, 2019
Published online: July 6, 2019
Processing time: 101 Days and 0.2 Hours
The incidence of pancreatic neuroendocrine tumors (PNETs) is now increasing rapidly. The tumor grade of PNETs significantly affects the treatment strategy and prognosis. However, there is still no effective way to non-invasively classify PNET grades. Machine learning (ML) algorithms have shown potential in improving the prediction accuracy using comprehensive data.
To provide an ML approach to predict PNET tumor grade using clinical data.
The clinical data of histologically confirmed PNET cases between 2012 and 2018 were collected. A minimum P method for the Chi-square test was used to convert the continuous variables into binary variables: each continuous variable was split at the cutoff value that gave the minimum P value. Four classical supervised ML models, including logistic regression, support vector machine (SVM), linear discriminant analysis (LDA) and multi-layer perceptron (MLP), were trained on the clinical data, with the pathological tumor grade of each PNET patient used as the label. The performance of each model, including the weight of the different parameters, was evaluated.
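The minimum P cutoff search described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' implementation: the function names `chi2_statistic` and `min_p_cutoff` are hypothetical, candidate cutoffs are taken as midpoints between consecutive distinct values, and exactly three grades are assumed, so the test has 2 degrees of freedom and the P value has the closed form exp(-χ²/2).

```python
import math
from collections import Counter


def chi2_statistic(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def min_p_cutoff(values, grades):
    """Return (cutoff, chi2, p) for the split `value > cutoff` with the smallest P.

    Assumption: exactly three grade labels, so the 2 x 3 table has df = 2
    and P = exp(-chi2 / 2). Returns None if all values are identical.
    """
    labels = sorted(set(grades))
    if len(labels) != 3:
        raise ValueError("this sketch assumes exactly three grades")
    distinct = sorted(set(values))
    best = None
    # Assumption: candidate cutoffs are midpoints between distinct values,
    # which guarantees that both sides of the split are non-empty.
    for lo, hi in zip(distinct, distinct[1:]):
        cutoff = (lo + hi) / 2
        low = Counter(g for v, g in zip(values, grades) if v <= cutoff)
        high = Counter(g for v, g in zip(values, grades) if v > cutoff)
        table = [[low[g] for g in labels], [high[g] for g in labels]]
        stat = chi2_statistic(table)
        p = math.exp(-stat / 2)  # chi-square survival function for df = 2
        if best is None or p < best[2]:
            best = (cutoff, stat, p)
    return best


# Toy values invented for illustration only; not study data.
toy_values = [1, 2, 3, 10, 11, 12, 20, 21, 22]
toy_grades = ["G1"] * 3 + ["G2"] * 3 + ["G3"] * 3
print(min_p_cutoff(toy_values, toy_grades))
```

Because every candidate split produces a table of the same shape, minimizing P is equivalent to maximizing the Chi-square statistic; the P value is computed here only to mirror the description in the text.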
In total, 91 PNET cases were included in this study, of which 32 were G1, 48 were G2 and 11 were G3. The clinical parameters differed significantly among patients with different grades. Patients with higher grades tended to have higher values of total bilirubin, alpha fetoprotein, carcinoembryonic antigen, carbohydrate antigen 19-9 and carbohydrate antigen 72-4. Among the models we used, LDA performed best in predicting the PNET tumor grade, while MLP had the highest recall rate for G3 cases. All of the models except SVM stabilized once the sample size exceeded 70% of the total. The parameters differed in how strongly they affected the outcomes of the models. Overall, alanine transaminase, total bilirubin, carcinoembryonic antigen, carbohydrate antigen 19-9 and carbohydrate antigen 72-4 affected the outcome more strongly than the other parameters.
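For readers unfamiliar with the recall rate reported for G3 cases, per-grade recall is the fraction of patients of a given true grade that a model assigns to that grade. A minimal sketch, assuming a hypothetical helper `per_grade_recall` and invented toy labels rather than study data:

```python
from collections import Counter


def per_grade_recall(true_grades, predicted_grades):
    """Recall per grade: correctly predicted cases / all cases of that true grade."""
    totals = Counter(true_grades)
    hits = Counter(t for t, p in zip(true_grades, predicted_grades) if t == p)
    return {grade: hits[grade] / totals[grade] for grade in totals}


# Toy labels invented for illustration only; not study data.
toy_true = ["G1", "G1", "G2", "G2", "G2", "G3", "G3"]
toy_pred = ["G1", "G2", "G2", "G2", "G3", "G3", "G1"]
print(per_grade_recall(toy_true, toy_pred))
```

With only 11 G3 cases in the cohort, each misclassified G3 patient shifts G3 recall by roughly nine percentage points, which is worth keeping in mind when comparing models on this metric.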
ML could be a simple and effective method for non-invasively predicting PNET grades using routine data from biochemical and tumor marker tests.
Core tip: In this study, we provide a machine learning approach to predict the grade of pancreatic neuroendocrine tumors (PNETs) using combined clinical data. We designed a minimum P method for the Chi-square test to maximize differences between groups, which benefited the model’s construction. We then built four classical supervised machine learning models using biochemical and tumor markers. After tuning, training and testing the models, we confirmed that the trained models gave stable results. Overall, the results of our study provide a non-invasive way to judge the condition of PNETs and offer a reference for treatment.