Copyright
©The Author(s) 2021.
Artif Intell Gastrointest Endosc. Aug 28, 2021; 2(4): 127-135
Published online Aug 28, 2021. doi: 10.37126/aige.v2.i4.127
Category | Description |
LR-1 | Definitely benign |
LR-2 | Probably benign |
LR-3 | Intermediate probability of HCC |
LR-4 | High probability of HCC, not 100% |
LR-5 | Definitely HCC |
LR-5V | Definite venous invasion regardless of other imaging features |
LR treated | LR-5 lesion status post-locoregional treatment |
LR-M | Non-HCC malignancies that may occur in cirrhosis: metastases, lymphoma, cholangiocarcinoma, PTLD |
Ref. | Country | Deep learning method | Accuracy | Sensitivity | Specificity | AUROC | DLS performance compared | Multicenter validation | Conclusion |
Hamm et al[8], 2019 | United States | Proof-of-concept validation CNN | 92% | 92% | 98% | 0.992 | Better than radiologists | Not done | DLS was feasible for classifying lesions with typical imaging features from six common hepatic lesion types |
Yamashita et al[14], 2020 | United States | CNN architectures: custom-made network and transfer learning-based network | 60.4% | NA | NA | LR-1/2: 0.85. LR-3: 0.90. LR-4: 0.63. LR-5: 0.82 | Transfer learning model was better | Performed | CNN is feasible for assigning LI-RADS categories from a relatively small dataset, but the study highlights the challenges of model development and validation |
Shi et al[23], 2020 | China | Three CDNs | Model-A: 83.3%, B: 81.1%, C: 85.6% | NA | NA | Model-A: 0.925; B: 0.862; C: 0.920 | Three models compared; A and C had better results | Not done | Three-phase CT protocol without precontrast showed similar diagnostic accuracy to the four-phase protocol in differentiating HCC. It can reduce the radiation dose |
Yasaka et al[25], 2018 | Japan | CNN | 84% | Category1: A: 71%; B: 33%; C: 94%; D: 90%; E: 100% | NA | 0.92 | Not applicable | Not done | Deep learning with CNN showed high diagnostic performance in differentiation of liver masses at dynamic CT |
Trivizakis et al[28], 2019 | Greece | 3D and 2D CNN | 83% | 93% | 67% | 0.80 | Superior compared with 2D CNN model | Not done | 3D CNN architecture can bring significant benefit in DW-MRI liver discrimination and potentially in numerous other tissue classification problems based on tomographic data, especially in size-limited, disease specific clinical datasets |
Wang et al[41], 2019 | United States | Proof-of-concept “interpretable” CNN | 88% | 82.9% | NA | NA | Not applicable | Not done | This interpretable deep learning system demonstrates proof of principle for illuminating portions of a pre-trained deep neural network’s decision-making, by analyzing inner layers and automatically describing features contributing to predictions |
Frid-Adar et al[45], 2018 | Israel | GANs | Classic data: 78.6%. Synthetic data: 85.7% | Classic data: 78.6%. Synthetic data: 85.7% | Classic data: 88.4%. Synthetic data: 92.4% | NA | Synthetic data augmentation is better than classic data augmentation | Not done | This approach to synthetic data augmentation can generalize to other medical classification applications and thus support radiologists’ efforts to improve diagnosis |
Wang et al[47], 2019 | Japan | CNN with clinical data | NA | NA | NA | Clinical model: 0.723. Model: A: 0.788; B: 0.805; C: 0.825 | Combined model C showed better results | Not done | The AUC of the combined model is about 0.825, which is much better than the models using clinical data only or CT image only |
Sato et al[48], 2019 | Japan | Fully connected neural network with 4 layers of neurons using only biomarkers, gradient boosting (non-linear model) and others | DLS: 83.54%. Gradient boosting: 87.34% | Gradient boosting: 93.27% | Gradient boosting: 75.93% | DLS: 0.884. Gradient boosting: 0.940 | Deep learning was not the optimal classifier in the current study | Not done | The gradient boosting model reduced the misclassification rate by about half compared with a single tumor marker. The model can be applied to various kinds of data and thus could potentially become a translational mechanism between academic research and clinical practice |
Naeem et al[49], 2020 | Pakistan | MLP, SVM, RF, and J48 using ten-fold cross-validation | MLP: 99% | NA | NA | MLP: 0.983. SVM: 0.966. RF: 0.964. J48: 0.959 | MLP model showed better results | Radiopaedia dataset | Our proposed system has the capability to verify the results on different MRI and CT scan databases, which could help radiologists to diagnose liver tumors |
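The accuracy, sensitivity, specificity, and AUROC columns above follow standard definitions. The Python sketch below shows how each is computed for a binary HCC versus non-HCC classifier. The counts and scores are hypothetical illustration values, not data from any of the cited studies, and the names `ConfusionCounts` and `auroc` are assumptions introduced for this example only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary classifier (HCC = positive class)."""
    tp: int  # HCC lesions correctly called HCC
    fp: int  # non-HCC lesions wrongly called HCC
    tn: int  # non-HCC lesions correctly called non-HCC
    fn: int  # HCC lesions wrongly called non-HCC


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all lesions classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true HCC lesions that the model detects."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of non-HCC lesions that the model correctly rejects."""
    return c.tn / (c.tn + c.fp)


def auroc(scores_pos: list[float], scores_neg: list[float]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical counts: 50 HCC and 50 non-HCC lesions.
example = ConfusionCounts(tp=45, fp=10, tn=40, fn=5)
print(f"accuracy    = {accuracy(example):.2f}")     # 0.85
print(f"sensitivity = {sensitivity(example):.2f}")  # 0.90
print(f"specificity = {specificity(example):.2f}")  # 0.80

# Hypothetical model scores for three HCC and three non-HCC lesions.
print(f"AUROC       = {auroc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]):.2f}")  # 0.89
```

Because accuracy depends on class balance while AUROC does not, a study can report a modest accuracy alongside a high AUROC, which is one reason the table lists the metrics separately.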
Algorithm | Pros | Cons |
Naïve Bayes Classifier | Simple, easy and fast. Not sensitive to irrelevant features. Works well in practice. Needs less training data. For both multi-class and binary classification. Works with continuous and discrete data | Assumes that every feature is independent, which is not always true |
Decision Trees | Easy to understand. Easy to generate rules. There are almost no hyperparameters to be tuned. Complex decision tree models can be significantly simplified by visualization | Might suffer from overfitting. Does not easily work with nonnumerical data. Low prediction accuracy for a dataset in comparison with other algorithms. When there are many class labels, calculations can be complex |
Support Vector Machines | Fast algorithm. Effective in high-dimensional spaces. Great accuracy. Power and flexibility from kernels. Works very well with a clear margin of separation. Many applications | Does not perform well with large data sets. Not so simple to program. Does not perform well when the data are noisy, i.e., when target classes overlap |
Random Forest Classifier | Less prone to overfitting than a single decision tree. Can be used for feature engineering, i.e., for identifying the most important features among all available features in the training dataset. Runs very well on large databases. Extremely flexible and highly accurate. No need for preparation of the input data | Complexity. Requires a lot of computational resources. Time-consuming. Need to choose the number of trees |
KNN Algorithm | Simple to understand and easy to implement. Zero to little training time. Works easily with multi-class data sets. Has good predictive power. Does well in practice | Computationally expensive testing phase. Sensitive to skewed class distributions. Accuracy can decrease with high-dimensional data. Needs a value to be chosen for the parameter k |
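The KNN row can be made concrete with a minimal Python sketch. It illustrates why training time is near zero (the training data are simply stored) while prediction is expensive (every query is compared against every stored point), and why k must be chosen by the user. The function name `knn_predict`, the two-feature points, and the labels are hypothetical and do not come from any cited study.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    There is no training step: all work happens at prediction time, where
    the query is compared against every stored point (O(n) distances).
    """
    # Distance from the query to each stored training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature data (e.g. two normalized image descriptors).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
labels = ["benign", "benign", "benign",
          "malignant", "malignant", "malignant"]

print(knn_predict(points, labels, (1.1, 1.0), k=3))  # benign
print(knn_predict(points, labels, (4.0, 4.0), k=3))  # malignant
```

With a skewed class distribution, the majority class tends to dominate the vote as k grows, which is the weakness noted in the Cons column.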
- Citation: Ballotin VR, Bigarella LG, Soldera J, Soldera J. Deep learning applied to the imaging diagnosis of hepatocellular carcinoma. Artif Intell Gastrointest Endosc 2021; 2(4): 127-135
- URL: https://www.wjgnet.com/2689-7164/full/v2/i4/127.htm
- DOI: https://dx.doi.org/10.37126/aige.v2.i4.127