Published online Jul 7, 2020. doi: 10.3748/wjg.v26.i25.3660
Peer-review started: March 16, 2020
First decision: April 25, 2020
Revised: May 8, 2020
Accepted: June 4, 2020
Article in press: June 4, 2020
Published online: July 7, 2020
Processing time: 112 Days and 1.3 Hours
The accurate classification of focal liver lesions (FLLs) is essential to properly guide treatment options and predict prognosis. Dynamic contrast-enhanced computed tomography (DCE-CT) is commonly used for the noninvasive detection and exact classification of FLLs due to its high scanning speed and high-density resolution. Since their recent development, convolutional neural network (CNN)-based deep learning techniques have been recognized to have high potential for image recognition tasks.
Since the different types of FLLs have different outcomes and require different clinical interventions, the current challenge in determining an accurate diagnosis involves not only effectively differentiating between benign and malignant FLLs according to the medical image but also accurately recognizing the different types of FLLs. Our purpose was to develop and evaluate a deep learning-based CNN to classify FLLs on multiphase CT. Our CNN model is expected to become an efficient tool to assist radiologists in accurately identifying the different types of FLLs.
The appearance of FLLs on CT imaging, especially their dynamic enhancement patterns, is essential for categorizing lesions. We employed four-channel input data to preserve the dynamic enhancement properties. Combining the lesion's dynamic enhancement pattern with a CNN can imitate the image-based diagnosis of radiologists and is expected to improve diagnostic accuracy.
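As an illustration of the four-channel input, the following minimal sketch stacks one co-registered slice from each contrast phase into a single channels-first array. The phase keys, the `stack_phases` helper, and the nested-list image representation are assumptions made for this example and are not taken from the study's implementation.

```python
from typing import Dict, List

Image = List[List[float]]  # one 2D slice: rows of pixel values

# Fixed channel order (hypothetical key names), so that channel index
# always maps to the same contrast phase.
PHASES = ("precontrast", "arterial", "portal_venous", "delayed")


def stack_phases(phase_images: Dict[str, Image]) -> List[Image]:
    """Return a channels-first [4][H][W] nested list in fixed phase order."""
    missing = [p for p in PHASES if p not in phase_images]
    if missing:
        raise ValueError(f"missing phases: {missing}")

    channels = [phase_images[p] for p in PHASES]
    height = len(channels[0])
    width = len(channels[0][0]) if height else 0

    # All phases must share the same spatial size to be stacked.
    for name, img in zip(PHASES, channels):
        if len(img) != height or any(len(row) != width for row in img):
            raise ValueError(f"phase '{name}' does not match {height}x{width}")

    # Copy rows so the stacked input does not alias the source images.
    return [[list(row) for row in img] for img in channels]


# Toy 2x2 patches (made-up values, not study data).
example = {
    "precontrast": [[10.0, 12.0], [11.0, 13.0]],
    "arterial": [[80.0, 85.0], [82.0, 90.0]],
    "portal_venous": [[60.0, 62.0], [61.0, 65.0]],
    "delayed": [[40.0, 41.0], [42.0, 43.0]],
}
stacked = stack_phases(example)
```

Keeping the phases as separate channels of one input, rather than classifying each phase independently, is what lets a model see how the lesion's attenuation changes from phase to phase.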
A total of 517 FLLs scanned on a 320-detector CT scanner using a four-phase DCE-CT imaging protocol (precontrast, arterial, portal venous, and delayed phases) from 2012 to 2017 were retrospectively enrolled. FLLs were classified into four categories: category A, hepatocellular carcinoma (HCC); category B, liver metastases; category C, benign non-inflammatory FLLs including hemangiomas, focal nodular hyperplasias and adenomas; and category D, hepatic abscesses. Each category was split into a training set and a test set in an approximately 8:2 ratio. A CNN model with a sequential input of the four-phase CT images was developed to automatically classify FLLs. The classification performance of the CNN model was evaluated on the test set: the accuracy, specificity and sensitivity were calculated from the confusion matrix, and the area under the receiver operating characteristic curve (AUC) was calculated from the softmax probabilities output by the last layer of the CNN model.
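The one-vs-rest accuracy, specificity and sensitivity described above can be derived from a multiclass confusion matrix as in the sketch below. The function name `one_vs_rest_metrics` and the toy three-class matrix are hypothetical and serve only to illustrate the arithmetic; they are not the study's code or data.

```python
from typing import Dict, List


def one_vs_rest_metrics(cm: List[List[int]], k: int) -> Dict[str, float]:
    """Metrics for class k versus all other classes.

    cm[i][j] is the number of lesions whose true class is i and whose
    predicted class is j.
    """
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]                          # class k predicted as k
    fn = sum(cm[k]) - tp                   # class k predicted as something else
    fp = sum(row[k] for row in cm) - tp    # other classes predicted as k
    tn = total - tp - fn - fp              # everything else

    if total == 0 or tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")

    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Toy confusion matrix (made-up counts, not study data).
toy_cm = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 1, 9],
]
metrics_class0 = one_vs_rest_metrics(toy_cm, 0)
```

For class 0 of the toy matrix, 8 of 10 positives are recovered (sensitivity 0.8) and 18 of 20 negatives are rejected (specificity 0.9), giving a one-vs-rest accuracy of 26/30.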
A total of 410 FLLs were used for training and 107 FLLs were used for testing. The accuracy/specificity/sensitivity of differentiating each category from the others were 0.916/0.964/0.739, 0.925/0.905/1.0, 0.860/0.918/0.735 and 0.925/0.963/0.815 for HCC, metastases, benign non-inflammatory FLLs, and abscesses on the test set, respectively. The AUC (95% confidence interval) for differentiating each category from the others was 0.92 (0.837-0.992), 0.99 (0.967-1.00), 0.88 (0.795-0.955) and 0.96 (0.914-0.996) for HCC, metastases, benign non-inflammatory FLLs, and abscesses on the test set, respectively. However, we trained and evaluated the CNN model only in a single-center setting using a single CT scanner, which may introduce data bias and, in turn, model bias. Further evaluation of this model in a multicenter setting is needed to establish its clinical utility.
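A one-vs-rest AUC of the kind reported above can be computed from per-lesion softmax probabilities using the rank-based (Mann-Whitney) formulation, sketched below. The function name `one_vs_rest_auc` and the toy probabilities are hypothetical. The sketch does not reproduce the confidence intervals, whose estimation method is not stated in this section.

```python
from typing import List, Sequence


def one_vs_rest_auc(probs: Sequence[Sequence[float]],
                    labels: Sequence[int], k: int) -> float:
    """AUC for class k versus the rest, using the class-k probability as score.

    Equals the fraction of (positive, negative) pairs in which the positive
    case receives the higher score; ties count as half.
    """
    pos: List[float] = [p[k] for p, y in zip(probs, labels) if y == k]
    neg: List[float] = [p[k] for p, y in zip(probs, labels) if y != k]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy softmax outputs for four lesions and three classes (made-up values).
toy_probs = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.4, 0.5, 0.1],
    [0.1, 0.2, 0.7],
]
toy_labels = [0, 0, 1, 2]
auc_class0 = one_vs_rest_auc(toy_probs, toy_labels, 0)
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a test set of about a hundred lesions; a sort-based ranking would be preferable for much larger sets.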
Overall, our CNN model showed high differential diagnostic performance in classifying FLLs as HCC, metastases, benign non-inflammatory FLLs or hepatic abscesses on four-phase CT images, and could become an efficient tool to assist radiologists in accurately identifying the different types of FLLs.
Further multicenter studies are necessary to evaluate the clinical utility of our CNN model. In addition, it is worth evaluating whether clinical information can further improve the performance of the CNN model.