Retrospective Study
Copyright © The Author(s) 2025. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Gastroenterol. May 21, 2025; 31(19): 104897
Published online May 21, 2025. doi: 10.3748/wjg.v31.i19.104897
Application of deep learning models in the pathological classification and staging of esophageal cancer: A focus on Wave-Vision Transformer
Wei Wei, Xiao-Lei Zhang, Hong-Zhen Wang, Lin-Lin Wang, Jing-Li Wen, Xin Han, Qian Liu
Wei Wei, Xiao-Lei Zhang, Hong-Zhen Wang, Jing-Li Wen, Xin Han, Qian Liu, Department of Oncology, Dongying People’s Hospital, Dongying 257091, Shandong Province, China
Lin-Lin Wang, Department of Pathology, Dongying People’s Hospital, Dongying 257091, Shandong Province, China
Author contributions: Zhang XL and Wang HZ contributed to the experimental conception and design; Wang LL, Wen JL, and Han X conducted the experiments; Wei W and Liu Q collected and assembled the experimental data; Wang LL, Wen JL, and Han X contributed to data analysis and interpretation; Zhang XL and Wang HZ wrote the article. All authors approved the final manuscript. Wang LL, Wen JL, and Han X contributed equally to this work.
Institutional review board statement: This study did not involve human subjects or living animals.
Informed consent statement: Not applicable.
Conflict-of-interest statement: All the authors report no relevant conflicts of interest for this article.
Data sharing statement: All data can be provided as needed.
Open Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: https://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Wei Wei, PhD, Department of Oncology, Dongying People’s Hospital, No. 317 Dongcheng South Road, Dongying District, Dongying 257091, Shandong Province, China. ww19810122@163.com
Received: January 5, 2025
Revised: March 12, 2025
Accepted: April 27, 2025
Published online: May 21, 2025
Processing time: 135 Days and 18.8 Hours
Abstract
BACKGROUND

Esophageal cancer is the sixth most common cancer worldwide, with a high mortality rate. Early detection of esophageal abnormalities can improve patient survival rates. The progression of esophageal cancer follows a sequence from esophagitis to non-dysplastic Barrett’s esophagus, dysplastic Barrett’s esophagus, and eventually esophageal adenocarcinoma (EAC). This study explored the application of deep learning technology to the pathological classification and staging of EAC, with the goal of enhancing diagnostic accuracy and efficiency.

AIM

To explore the application of deep learning models, particularly Wave-Vision Transformer (Wave-ViT), in the pathological classification and staging of esophageal cancer to enhance diagnostic accuracy and efficiency.

METHODS

We applied several deep learning models, including multi-layer perceptron, residual network, transformer, and Wave-ViT, to a dataset of clinically validated esophageal pathology images. The models were trained to identify pathological features and to assist in the classification and staging of esophageal cancer. The models were compared based on accuracy, computational complexity, and efficiency.
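As a minimal sketch of two of the comparison criteria named above, the Python snippet below computes classification accuracy from predicted labels and counts the trainable parameters of a fully connected network as a rough proxy for model size. The label names, layer sizes, and function names are hypothetical and chosen only for illustration; they are not taken from the study's data or code.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the reference label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mlp_param_count(layer_sizes):
    """Trainable parameters (weights plus biases) of a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# Hypothetical toy labels, not data from the study.
y_true = ["esophagitis", "barrett", "dysplasia", "eac", "eac"]
y_pred = ["esophagitis", "barrett", "barrett", "eac", "eac"]

acc = accuracy(y_true, y_pred)         # 4 of 5 correct
params = mlp_param_count([64, 32, 4])  # hypothetical 64 -> 32 -> 4 network
print(f"accuracy = {acc:.2%}, parameters = {params}")
```

Parameter count is only one possible measure of computational complexity; inference time or floating-point operations could be reported in the same way.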

RESULTS

The Wave-ViT model demonstrated the highest accuracy at 88.97%, surpassing the transformer (87.65%), residual network (85.44%), and multi-layer perceptron (81.17%). Additionally, Wave-ViT exhibited low computational complexity with significantly reduced parameter size, making it highly efficient for real-time clinical applications.
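The margins implied by these figures can be checked with a few lines of Python. This is only an arithmetic sketch over the four accuracies reported above; the variable names are illustrative.

```python
# Accuracies (%) as reported in the RESULTS paragraph.
reported_accuracy = {
    "Wave-ViT": 88.97,
    "Transformer": 87.65,
    "Residual network": 85.44,
    "Multi-layer perceptron": 81.17,
}

# Rank models from most to least accurate.
ranked = sorted(reported_accuracy.items(), key=lambda kv: kv[1], reverse=True)
best_name, best_acc = ranked[0]

# Margin of the best model over each other model, in percentage points.
margins = {name: round(best_acc - acc, 2) for name, acc in ranked[1:]}

for name, margin in margins.items():
    print(f"{best_name} exceeds {name} by {margin:.2f} percentage points")
```

On the reported numbers, Wave-ViT leads the transformer by 1.32 percentage points, the residual network by 3.53, and the multi-layer perceptron by 7.80.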

CONCLUSION

Deep learning technology, particularly the frequency-domain transformer model Wave-ViT, shows promise in improving the precision of pathological classification and staging of EAC. Applying Wave-ViT enhances the automation of the diagnostic process and may support early detection and treatment of EAC. Future research may further explore the potential of this model in broader medical image analysis applications, particularly in the field of precision medicine.

Keywords: Esophageal cancer; Deep learning; Wave-Vision Transformer; Pathological classification; Staging; Early detection

Core Tip: This study demonstrates the application of deep learning models, particularly Wave-Vision Transformer, for the pathological classification and staging of esophageal cancer. Wave-Vision Transformer outperformed other models such as transformer, residual network, and multi-layer perceptron, achieving the highest accuracy of 88.97% with low computational complexity. This innovative approach shows promise for improving early detection and personalized treatment strategies for esophageal cancer, potentially enhancing clinical outcomes in real-time applications.