Integrating radiomics and machine learning for the diagnosis and prognosis of hepatocellular carcinoma

doi:10.4251/wjgo.v17.i7.106610

Journal Information

Publication Name: World Journal of Gastrointestinal Oncology
ISSN: 1948-5204
Publisher: Baishideng Publishing Group Inc, 7041 Koll Center Parkway, Suite 160, Pleasanton, CA 94566, USA

Editorial Open Access

World J Gastrointest Oncol. Jul 15, 2025; 17(7): 106610
Published online Jul 15, 2025. doi: 10.4251/wjgo.v17.i7.106610

Na Feng, Kun Wang, Yan Jiao

Na Feng, Kun Wang, Department of Sports Medicine, Orthopedics' Clinic, The First Hospital of Jilin University, Changchun 130021, Jilin Province, China

Yan Jiao, Department of Hepatobiliary and Pancreatic Surgery, General Surgery Center, The First Hospital of Jilin University, Changchun 130021, Jilin Province, China

ORCID number: Yan Jiao (0000-0001-6914-7949).

Co-corresponding authors: Kun Wang and Yan Jiao.

Author contributions: Feng N, Wang K, and Jiao Y collectively conceptualized and designed the research; Feng N contributed extensively to the manuscript writing, editing, and preparation of tables, as well as conducting the literature search and compiling relevant data; Wang K played a pivotal role in designing the overall conceptual framework, outlining the manuscript, critically interpreting data, and ensuring methodological rigor throughout the research process; Jiao Y contributed significantly to the intellectual content, participating actively in discussions regarding the manuscript structure, clinical relevance, and implications of the findings. Both Wang K and Jiao Y have served as co-corresponding authors, each providing indispensable contributions to the project. Wang K was instrumental in the foundational concept development, overall research design, and strategic oversight of manuscript preparation, while also supervising critical methodological aspects. Jiao Y led the integration of clinical perspectives, particularly focusing on the clinical implications, interpretability, and translational aspects of the manuscript. Furthermore, Jiao Y was responsible for coordinating the submission process and managing correspondence with the journal. The collaboration between Wang K and Jiao Y was essential for the successful completion of this manuscript, demonstrating complementary roles that significantly enhanced the scientific rigor, clarity, and practical relevance of the research. All authors have reviewed, approved the final manuscript, and endorse its publication.

Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.

Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/

Corresponding author: Yan Jiao, Department of Hepatobiliary and Pancreatic Surgery, General Surgery Center, The First Hospital of Jilin University, No. 1 Xinmin Street, Changchun 130021, Jilin Province, China. lagelangri1@126.com

Received: March 3, 2025
Revised: April 18, 2025
Accepted: May 23, 2025
Published online: July 15, 2025
Processing time: 134 Days and 3.9 Hours

Abstract

Hepatocellular carcinoma (HCC) is a prevalent and aggressive liver cancer that poses significant challenges in diagnosis and prognosis. Recent advancements in radiomics and machine learning (ML) offer promising solutions to enhance the accuracy of HCC diagnosis, treatment response prediction, and survival prognosis. Radiomics, which extracts quantitative features from medical images, captures the complex tumor heterogeneity that is often undetectable with traditional imaging methods. When combined with ML algorithms, these features can be used to differentiate between various stages of HCC, predict treatment outcomes, and assess long-term survival. This review explores key radiomic features, including texture, shape, and intensity, and their integration with ML techniques like binary classification models, XGBoost, LightGBM, and deep learning architectures. We also discuss the challenges faced in model interpretation, data heterogeneity, and the integration of multi-modal data. Despite the promising potential of these technologies, the clinical adoption of radiomics and ML models in HCC management will require overcoming these obstacles through standardization and improved interpretability.

Key Words: Hepatocellular carcinoma; Radiomics; Machine learning; Prognosis; Diagnosis

Core Tip: The integration of radiomics with machine learning (ML) algorithms holds significant promise in improving the diagnosis and prognosis of hepatocellular carcinoma. Key radiomic features, such as texture, shape, and intensity, when combined with advanced ML techniques, can enhance tumor characterization, predict treatment responses, and provide better prognostic insights. However, challenges related to data heterogeneity, model interpretability, and multi-modal data integration must be addressed for these technologies to be widely adopted in clinical practice.

Citation: Feng N, Wang K, Jiao Y. Integrating radiomics and machine learning for the diagnosis and prognosis of hepatocellular carcinoma. World J Gastrointest Oncol 2025; 17(7): 106610
URL: https://www.wjgnet.com/1948-5204/full/v17/i7/106610.htm
DOI: https://dx.doi.org/10.4251/wjgo.v17.i7.106610

INTRODUCTION

Hepatocellular carcinoma (HCC) is one of the most common and aggressive malignancies worldwide, often associated with poor prognosis due to late-stage diagnosis and complex tumor biology. While conventional imaging modalities such as computed tomography (CT) and magnetic resonance imaging (MRI) remain indispensable for diagnosis and staging, their limitations in capturing tumor heterogeneity and predicting individualized outcomes have prompted the exploration of advanced analytical tools. Recent advancements in radiomics, the high-throughput extraction of quantitative features from medical images, combined with machine learning (ML) algorithms have demonstrated significant potential in improving diagnostic accuracy, predicting treatment response, and estimating long-term prognosis in HCC[1]. Compared with earlier reviews, this Editorial provides an updated synthesis of recent developments from 2023 to 2025, with a particular emphasis on cutting-edge ML techniques [e.g., XGBoost, residual convolutional networks (ResNet)], integration of multi-modal data sources, and the growing importance of explainable artificial intelligence [e.g., SHapley Additive exPlanations (SHAP)] to enhance clinical interpretability (Table 1). In this context, we explore the evolving role of radiomics and ML in HCC, highlighting both their synergistic integration and the persistent challenges that must be addressed for clinical translation.

Table 1 Summary of representative studies applying radiomics and machine learning in hepatocellular carcinoma.

Ref.	Imaging modality	ML model/algorithm	Integrated features	Clinical application
Qi et al[2], 2024	CT	Logistic Regression with Radiomics	Texture features	Predict response to immunotherapy
Molostova et al[3], 2024	MRI	Radiomics + binary classification	Texture + Intensity	Differentiate early HCC from regenerative/dysplastic nodules
Wang et al[4], 2025	Multi-modal clinical data	Ensemble ML models	Clinical + Radiomics + Genomics	HCC diagnosis
Zhang et al[5], 2024	CT + clinical	XGBoost	Radiomics + clinical	Prognosis post-TACE
Şahin et al[6], 2025	CT	Deep learning (CNN)	Imaging only	Detect HCC from CT
Yin et al[7], 2025	CT	ResNet-based Deep learning	Imaging + clinical	Predict prognosis after combination therapy
Shen et al[9], 2024	Clinical + imaging	SHAP-integrated ML models	Multi-modal	Predict prognosis for advanced HCC
Cai et al[10], 2024	Radiomics + RNA-Seq	Survival analysis + ML	Radiomics + transcriptomics	Predict survival
Lou et al[11], 2024	Clinical + imaging	ML-based nomogram	Accessible clinical indicators	Predict prognosis

CT: Computed tomography; MRI: Magnetic resonance imaging; ML: Machine learning; HCC: Hepatocellular carcinoma; CNN: Convolutional neural network; ResNet: Residual convolutional networks; SHAP: SHapley Additive exPlanations.

RADIOMIC FEATURES IN HCC DIAGNOSIS

Radiomics involves the extraction of high-dimensional quantitative data from medical images. These features reflect the underlying heterogeneity of tumors, which can be pivotal in assessing tumor aggressiveness and predicting patient outcomes. Key radiomic features include texture, shape, and intensity characteristics.

Texture features

Texture analysis captures the spatial arrangement of pixel intensities within an image. Features derived from the gray-level co-occurrence matrix (GLCM) and the gray-level run-length matrix (GLRLM) have been shown to correlate with prognosis and treatment response in HCC patients. For instance, texture features have been linked to the efficacy of immunotherapy in HCC, demonstrating their potential to predict short-term treatment responses[2].
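As a rough illustration of how a co-occurrence matrix turns pixel arrangement into numbers, the sketch below counts horizontally adjacent gray-level pairs in a small, already quantized region of interest and derives three common summary statistics. The function names and the 4 x 4 toy region are assumptions made for this example; they are not taken from any of the cited studies, and dedicated radiomics software applies additional steps (resampling, discretization, multiple offsets).

```python
from collections import Counter


def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalized co-occurrence probabilities for pixel pairs at offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:
                    # Count the pair in both directions so the matrix is symmetric.
                    counts[(j, i)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}


def glcm_contrast(p):
    """Large when neighbouring pixels differ strongly in gray level."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())


def glcm_homogeneity(p):
    """Inverse difference moment: large when neighbouring pixels are similar."""
    return sum(prob / (1 + (i - j) ** 2) for (i, j), prob in p.items())


def glcm_energy(p):
    """Sum of squared probabilities: large for uniform or highly regular textures."""
    return sum(prob ** 2 for prob in p.values())


# Hypothetical region of interest, quantized to four gray levels (0 to 3).
roi = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(roi)
print(round(glcm_contrast(p), 3))  # 14/24, i.e. 0.583 for this toy region
print(round(glcm_homogeneity(p), 3), round(glcm_energy(p), 3))
```

A more heterogeneous region would yield higher contrast and lower homogeneity, which is the kind of signal texture-based models rely on.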

Shape features

Tumor shape features, such as size, volume, and surface area, are crucial in distinguishing between different tumor types and assessing tumor progression. Shape features help identify the growth patterns of HCC, which can be critical for deciding on appropriate treatment strategies[2].
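To make the shape descriptors concrete, the following sketch derives volume, surface area, and sphericity from a binary segmentation mask. It is a simplified illustration under stated assumptions: the mask layout (`mask[z][y][x]`), the `spacing` parameter, and the function name are hypothetical, and surface area is estimated by counting exposed voxel faces, which overestimates the area of smooth, curved lesions compared with mesh-based methods.

```python
import math


def shape_features(mask, spacing=(1.0, 1.0, 1.0)):
    """Volume, face-count surface area, and sphericity of a binary 3D mask.

    mask[z][y][x] is 0 or 1; spacing is the voxel size (sz, sy, sx) in mm.
    """
    sz, sy, sx = spacing
    nz, ny, nx = len(mask), len(mask[0]), len(mask[0][0])
    # Area of a voxel face perpendicular to the z, y, and x axes respectively.
    face_area = (sy * sx, sz * sx, sz * sy)
    neighbours = [(-1, 0, 0, 0), (1, 0, 0, 0),
                  (0, -1, 0, 1), (0, 1, 0, 1),
                  (0, 0, -1, 2), (0, 0, 1, 2)]
    n_voxels = 0
    area = 0.0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if not mask[z][y][x]:
                    continue
                n_voxels += 1
                for dz, dy, dx, axis in neighbours:
                    z2, y2, x2 = z + dz, y + dy, x + dx
                    inside = 0 <= z2 < nz and 0 <= y2 < ny and 0 <= x2 < nx
                    # A face is exposed if the neighbour is background or off-grid.
                    if not inside or not mask[z2][y2][x2]:
                        area += face_area[axis]
    volume = n_voxels * sz * sy * sx
    if area > 0:
        # Sphericity equals 1 for a perfect sphere and decreases for irregular shapes.
        sphericity = math.pi ** (1 / 3) * (6 * volume) ** (2 / 3) / area
    else:
        sphericity = float("nan")
    return {"volume_mm3": volume, "surface_area_mm2": area, "sphericity": sphericity}


# Hypothetical lesion: a 2 x 2 x 2 voxel cube with 1 mm isotropic spacing.
cube = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
print(shape_features(cube))
```

Lower sphericity indicates a more irregular or infiltrative outline, which is one way growth pattern can be expressed as a number.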

Intensity features

Intensity features measure the distribution of pixel intensities within a tumor. These features offer insights into the tumor's vascularity and heterogeneity, which are often not apparent through visual inspection of images. Intensity features are commonly combined with texture features to enhance diagnostic accuracy[2].
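Intensity (first-order) features can be illustrated with a few summary statistics of the voxel values inside a lesion. The sketch below is a minimal example under assumptions: the function name, the fixed `bin_width` used for the histogram entropy, and the sample values are all hypothetical and not drawn from the cited studies.

```python
import math
from collections import Counter


def first_order_features(values, bin_width=25):
    """Basic histogram statistics of the intensities inside a region of interest."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    std = math.sqrt(variance)
    if std > 0:
        skewness = sum((v - mean) ** 3 for v in values) / n / std ** 3
        kurtosis = sum((v - mean) ** 4 for v in values) / n / std ** 4
    else:
        # A perfectly uniform region has no asymmetry or tail weight to report.
        skewness = kurtosis = 0.0
    # Discretize intensities into fixed-width bins before computing entropy.
    bins = Counter(math.floor(v / bin_width) for v in values)
    entropy = -sum((c / n) * math.log2(c / n) for c in bins.values())
    return {
        "mean": mean,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "entropy_bits": entropy,
    }


# Hypothetical attenuation values sampled from a lesion.
print(first_order_features([40, 55, 60, 62, 90, 120, 125, 130]))
```

Entropy rises as intensities spread across more bins, so it acts as a simple proxy for heterogeneity that is hard to judge by eye.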

ML MODELS FOR HCC DIAGNOSIS AND PROGNOSIS

ML techniques have become an essential tool in integrating radiomic features with clinical data for improved HCC diagnosis and prognosis. Several advanced algorithms have been applied to process complex datasets, from radiomics to genomic data, to enhance prediction accuracy.

Binary classification models

ML models, particularly binary classification models, have been used to differentiate early-stage HCC from atypical lesions. Contrast-enhanced MRI combined with radiomic features has achieved area under the curve (AUC) values ranging from 0.89 to 0.95 in distinguishing early HCC from regenerative and dysplastic nodules[3].
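For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The sketch below computes it directly from that definition; the function name, labels, and scores are illustrative assumptions, not data from the cited work.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = early HCC, 0 = benign nodule) and classifier scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(roc_auc(labels, scores))  # 8 of 9 positive-negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which places the reported range of 0.89 to 0.95 in context.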

Gradient boosting algorithms

XGBoost and LightGBM are popular gradient boosting algorithms that have performed well in integrating radiomic features with clinical data to predict tumor-node-metastasis (TNM) staging and prognosis. XGBoost, for example, has been reported to achieve high prediction accuracy for HCC prognosis, outperforming traditional models[4,5].
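The core idea behind gradient boosting can be sketched without any library: start from a constant prediction and repeatedly add a small decision stump fitted to the current errors. The code below is a deliberately simplified teaching example, not XGBoost or LightGBM, which add regularization, second-order updates, and efficient split finding. All names, the learning rate, and the two-feature toy dataset (a radiomic score and a clinical flag) are assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals):
    """Pick the single feature/threshold split with the lowest squared error."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[f] <= thr]
            right = [r for row, r in zip(X, residuals) if row[f] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, f, thr, lm, rm)
    if best is None:
        # Every feature is constant: fall back to a stump that never splits.
        m = sum(residuals) / len(residuals)
        return 0, float("inf"), m, m
    return best[1:]


def fit_boosting(X, y, n_rounds=50, learning_rate=0.3):
    """Boosted stumps for binary labels; assumes both classes are present."""
    prior = sum(y) / len(y)
    base = math.log(prior / (1 - prior))
    scores = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the logistic loss: observed label minus predicted probability.
        residuals = [yi - sigmoid(s) for yi, s in zip(y, scores)]
        f, thr, lv, rv = fit_stump(X, residuals)
        stumps.append((f, thr, lv, rv))
        scores = [s + learning_rate * (lv if row[f] <= thr else rv)
                  for row, s in zip(X, scores)]
    return base, learning_rate, stumps


def predict_proba(model, row):
    base, lr, stumps = model
    score = base + sum(lr * (lv if row[f] <= thr else rv) for f, thr, lv, rv in stumps)
    return sigmoid(score)


# Hypothetical training data: [radiomic score, clinical flag]; label 1 = poor prognosis.
X = [[0.2, 1.0], [0.4, 0.0], [0.3, 1.0], [0.8, 0.0], [0.9, 1.0], [0.7, 1.0]]
y = [0, 0, 0, 1, 1, 1]
model = fit_boosting(X, y)
print(predict_proba(model, [0.85, 0.0]), predict_proba(model, [0.25, 1.0]))
```

Because each stump can split on either a radiomic or a clinical column, tree ensembles of this kind combine heterogeneous inputs without requiring them to share a scale.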

Deep learning models

Deep learning models, such as convolutional neural networks (CNNs) and the You Only Look Once (YOLO) architecture, have substantially advanced image analysis, particularly for HCC detection. These models are especially effective on CT scans, where they can detect HCC with high diagnostic accuracy[6]. ResNet architectures have also been used to extract complex image features, achieving high AUC values[7].
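The defining element of a residual network is the shortcut connection, in which a block's input is added back to its output so that the layers only need to learn a correction. The sketch below shows that idea on a plain vector with two small weight matrices. It is an assumption-laden simplification: real ResNet blocks use convolutions and normalization layers, and the names and weights here are hypothetical.

```python
def relu(vector):
    return [max(0.0, a) for a in vector]


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * a for w, a in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(x + F(x)), where F is a two-layer transformation."""
    hidden = relu(matvec(w1, x))
    correction = matvec(w2, hidden)
    # Shortcut connection: the input bypasses the layers and is added back.
    return relu([a + b for a, b in zip(x, correction)])


# With all-zero weights the block simply passes its (rectified) input through.
x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
print(residual_block(x, zeros, zeros))  # [1.0, 0.0, 3.0]
```

Because an untrained block starts close to the identity mapping, many such blocks can be stacked without the signal degrading, which is what allows very deep feature extractors.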

Prognostic models and validation

Integrated nomograms, combining clinical and radiomic features, have been developed to predict recurrence-free survival and overall survival in HCC patients. These models have shown strong predictive power, with C-index values indicating high accuracy in various cohorts[8]. Furthermore, external validation of ML models across diverse populations is critical to ensure generalizability and robustness[7].
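The C-index mentioned above measures how often a model ranks patients correctly by risk: among pairs in which it is known who had the event first, it is the fraction where the earlier event has the higher predicted risk. The sketch below follows Harrell's definition for right-censored data; the function name and the four-patient example are illustrative assumptions.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times: follow-up time per patient; events: 1 if the event was observed,
    0 if censored; risks: predicted risk score (higher means worse prognosis).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable only if patient i had an observed event before time j.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical cohort: months of follow-up, event indicator, predicted risk.
times = [5, 10, 15, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.4, 0.1, 0.5]
print(concordance_index(times, events, risks))  # 3 of 4 comparable pairs agree: 0.75
```

As with the AUC, 0.5 indicates random ranking and 1.0 perfect ranking; censored patients contribute only as the later member of a pair.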

CHALLENGES AND FUTURE DIRECTIONS

Challenges

Recent studies have demonstrated the evolving role of radiomics and ML integration in HCC, with varying emphases on diagnostic accuracy, prognostic power, and clinical interpretability. For instance, Yin et al[7] focused on long-term survival prediction using deep learning-based CT radiomics, while Şahin et al[6] and Molostova et al[3] prioritized diagnostic performance across different imaging modalities. Compared with these studies, our review underscores the complementary strengths of diverse ML architectures, from tree-based models like XGBoost[4,5] to CNNs, and highlights their potential for multi-modal data integration and performance optimization. Moreover, while previous reviews have often overlooked the issue of interpretability, we emphasize the value of SHAP-based explainability approaches[9], which may enhance clinical trust and decision transparency. Despite this promise, radiomics-ML tools remain primarily investigational. Although recent models have shown potential utility in predicting early recurrence post-TACE or stratifying patients for immunotherapy with high AUC values (e.g., Yin et al[7]; Qi et al[2]), widespread clinical adoption is limited by the need for prospective validation, standardization of imaging protocols, and the development of regulatory frameworks. These challenges must be addressed to translate algorithmic performance into meaningful improvements in clinical decision-making.

Data heterogeneity: The heterogeneity of HCC, combined with variability in imaging protocols and clinical data, poses significant challenges for model generalization. Standardization of imaging protocols and harmonization of data across different institutions are essential for improving the reproducibility and reliability of radiomic models.
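One very simple form of harmonization is to standardize each radiomic feature within its acquisition site, so that scanner-related offsets and scale differences are removed before pooling. The sketch below shows this per-site z-scoring. It is a basic stand-in chosen for clarity and is not presented as the approach of any cited study; the function name and values are assumptions, and this method also removes genuine between-site biological differences, which more elaborate harmonization techniques try to preserve.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscore_by_site(values, sites):
    """Standardize one feature separately within each acquisition site."""
    grouped = defaultdict(list)
    for v, s in zip(values, sites):
        grouped[s].append(v)
    stats = {s: (mean(vs), pstdev(vs)) for s, vs in grouped.items()}
    harmonized = []
    for v, s in zip(values, sites):
        m, sd = stats[s]
        # A site with no spread carries no usable contrast for this feature.
        harmonized.append((v - m) / sd if sd > 0 else 0.0)
    return harmonized


# Hypothetical feature measured at two centres whose scanners differ by an offset.
values = [10, 20, 30, 110, 120, 130]
sites = ["A", "A", "A", "B", "B", "B"]
print(zscore_by_site(values, sites))
```

After standardization the two centres share the same scale, so a model trained on pooled data is less likely to learn the site rather than the disease.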

Model interpretability: Complex ML models, such as deep learning, often lack interpretability, making it difficult for clinicians to understand how predictions are made. Techniques like SHAP are being explored to enhance model transparency, which is crucial for clinical adoption[9].
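SHAP builds on Shapley values, which split a prediction among the input features by averaging each feature's marginal contribution over all possible feature subsets. The sketch below computes exact Shapley values for a tiny, hand-written risk function by brute-force enumeration. It does not use the SHAP library, which relies on faster approximations; the model, the feature names (texture, volume, AFP), and the all-zero baseline are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attribution of model(x) - model(baseline) to each feature."""
    n = len(x)

    def value(subset):
        # Features outside the subset are replaced by their baseline value.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        contributions.append(total)
    return contributions


def risk_model(z):
    """Hypothetical risk score with one interaction term."""
    texture, volume, afp = z
    return 0.5 * texture + 0.2 * volume + 0.3 * texture * afp


x = [2.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
print(shapley_values(risk_model, x, baseline))  # approximately [1.3, 0.2, 0.3]
```

The attributions sum to the difference between the patient's score and the baseline score, and the interaction term is shared equally between the two features involved, which is the property that makes such explanations easy to present to clinicians.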

Integration of multi-modal data: The integration of multi-modal data, including clinical, radiomic, genomic, and histopathological information, is a promising avenue for improving the accuracy of HCC prediction models. For example, combining RNA sequencing data with radiomic features has been shown to enhance prognostic prediction[10,11]. However, handling multi-modal data requires sophisticated algorithms capable of integrating different data types seamlessly.

Addressing imbalanced data: Data imbalance, especially in training datasets, is another significant challenge in ML. Techniques like data augmentation, balanced sampling, and synthetic data generation are being explored to address this issue and improve model performance.
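Balanced sampling can be illustrated with random oversampling, in which minority-class cases are duplicated until every class is equally represented. The sketch below is a minimal version under assumptions: the function name, the fixed seed, and the toy data are hypothetical. Resampling of this kind should be applied only to the training split, since duplicating cases before the train-test split would leak information into the evaluation.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are the same size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Hypothetical training set with six negative cases and two positive cases.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Oversampling changes only the class proportions seen during training, so metrics should still be reported on data with the original, unmodified class balance.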

Limitations

This editorial is narrative in nature and does not follow a systematic review protocol, which may introduce subjectivity in study selection and interpretation. While we aimed to include the most recent and representative studies in radiomics and ML applications for HCC, publication bias cannot be ruled out. Additionally, heterogeneity in imaging modalities, segmentation protocols, and ML architectures among the cited studies limits direct comparability. The clinical relevance of many models is also constrained by the lack of prospective validation and standardized evaluation metrics.

Future directions

To facilitate clinical adoption, future studies should prioritize multi-center collaborations with harmonized imaging protocols and standardized radiomic feature extraction pipelines. Prospective trials are essential to validate the prognostic and diagnostic performance of ML models in real-world settings. Another key direction is the integration of multi-modal data, including genomics, histopathology, and clinical biomarkers, using interpretable AI frameworks. Additionally, regulatory and ethical frameworks must be developed to address data privacy and algorithm transparency, both of which are essential for gaining clinician and patient trust.

CONCLUSION

This editorial adds to the current literature by highlighting the performance of novel ML models and the practical considerations of integrating radiomics with clinical and molecular data in HCC. It emphasizes the importance of explainability and standardization as critical steps toward real-world clinical application. Recent studies support the growing potential of radiomics and ML in enhancing diagnostic accuracy, refining prognostic assessments, and enabling more individualized risk stratification. While promising, these technologies remain investigational, and their incorporation into routine clinical practice will require further validation in large-scale, prospective cohorts and the development of standardized, interpretable frameworks.

Footnotes

Provenance and peer review: Invited article; Externally peer reviewed.

Peer-review model: Single blind

Specialty type: Oncology

Country of origin: China

Peer-review report’s classification

Scientific Quality: Grade C

Novelty: Grade C

Creativity or Innovation: Grade C

Scientific Significance: Grade C

P-Reviewer: Roever L; S-Editor: Li L; L-Editor: A; P-Editor: Zhang L

References

1.	Zhu Z, Wu J, Guo Y, Ren Q, Li D, Li Z, Han L. Prediction of Ki-67 expression in hepatocellular carcinoma with machine learning models based on intratumoral and peritumoral radiomic features. World J Gastrointest Oncol. 2025;17:104172.

2.	Qi L, Zhu Y, Li J, Zhou M, Liu B, Chen J, Shen J. CT radiomics-based biomarkers can predict response to immunotherapy in hepatocellular carcinoma. Sci Rep. 2024;14:20027.

3.	Molostova IV, Medvedeva BM, Kondratyev EV, Ustalov AA, Novruzbekov MS, Olisov OD, Tarnoposky VM. The Capabilities of Machine Learning Radiomics Based Models in the MRI Diagnosis of Early HCC. J Oncol Diagn Radiol Radiother. 2024;7:68-73.

4.	Wang BW, Breitinger L, Tollens F, Itzel T, Grimm D, Sirazitdinov A, Frölich M, Schönberg S, Teufel A, Hesser J, Zhao WZ. A baseline for machine-learning-based hepatocellular carcinoma diagnosis using multi-modal clinical data. 2025 Preprint. Available from: arxiv:2501.11535.

5.	Zhang M, Kuang B, Zhang J, Peng J, Xia H, Feng X, Peng L. Enhancing prognostic prediction in hepatocellular carcinoma post-TACE: a machine learning approach integrating radiomics and clinical features. Front Med (Lausanne). 2024;11:1419058.

6.	Şahin E, Tatar OC, Ulutaş ME, Güler SA, Şimşek T, Turgay NZ, Cantürk NZ. Diagnostic Performance of Deep Learning Applications in Hepatocellular Carcinoma Detection Using Computed Tomography Imaging. Turk J Gastroenterol. 2024;36:124-130.

7.	Yin L, Liu R, Li W, Li S, Hou X. Deep learning-based CT radiomics predicts prognosis of unresectable hepatocellular carcinoma treated with TACE-HAIC combined with PD-1 inhibitors and tyrosine kinase inhibitors. BMC Gastroenterol. 2025;25:24.

8.	Zhou J, Yang D, Tang H. Magnetic resonance imaging radiomics based on artificial intelligence is helpful to evaluate the prognosis of single hepatocellular carcinoma. Heliyon. 2025;11:e41735.

9.	Shen J, Zhou Y, Pei J, Yang D, Zhao K, Ding Y. Development of prognostic models for advanced multiple hepatocellular carcinoma based on Cox regression, deep learning and machine learning algorithms. Front Med (Lausanne). 2024;11:1452188.

10.	Cai K, Fu W, Wang Z, Yang X, Liu H, Ji Z. Optimizing Prognostic Predictions in Liver Cancer with Machine Learning and Survival Analysis. Entropy (Basel). 2024;26:767.

11.	Lou X, Ma S, Ma M, Wu Y, Xuan C, Sun Y, Liang Y, Wang Z, Gao H. The prognostic role of an optimal machine learning model based on clinical available indicators in HCC patients. Front Med (Lausanne). 2024;11:1431578.