Published online May 21, 2025. doi: 10.3748/wjg.v31.i19.104897
Revised: March 12, 2025
Accepted: April 27, 2025
Published online: May 21, 2025
Processing time: 135 Days and 18.8 Hours
Esophageal cancer is the sixth most common cancer worldwide and carries a high mortality rate. Early detection of esophageal abnormalities can improve patient survival. The disease typically progresses from esophagitis to non-dysplastic Barrett’s esophagus, then to dysplastic Barrett’s esophagus, and eventually to esophageal adenocarcinoma (EAC). This study explored the application of deep learning to the pathological classification and staging of EAC, with the aim of improving diagnostic accuracy and efficiency.
To explore the application of deep learning models, particularly Wave-Vision Transformer (Wave-ViT), in the pathological classification and staging of esophageal cancer.
We applied several deep learning models, including multi-layer perceptron, residual network, transformer, and Wave-ViT, to a dataset of clinically validated esophageal pathology images. The models were trained to identify pathological features and to assist in the classification and staging of esophageal cancer.
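Wave-ViT takes its name from the wavelet transform. As a rough illustration of that idea only, and not a description of the models trained in this study, the sketch below implements one level of a 2D Haar wavelet decomposition in plain Python and checks that it can be inverted exactly, which is the property that makes wavelet-based down-sampling lossless. The function names `haar_dwt2` and `haar_idwt2` and the toy 4 × 4 input are assumptions introduced for this example.

```python
def haar_dwt2(image):
    """One level of a 2D Haar transform on a list of rows (even height and width)."""
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(image), 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]          # top-left, top-right
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            ll_row.append((a + b + d + e) / 2)  # scaled local sum (low-pass band)
            lh_row.append((a - b + d - e) / 2)  # difference across columns
            hl_row.append((a + b - d - e) / 2)  # difference across rows
            hh_row.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2: rebuild each 2 x 2 block from its four coefficients."""
    image = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, x, y, z = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [(s + x + y + z) / 2, (s - x + y - z) / 2]
            bottom += [(s + x - y - z) / 2, (s - x - y + z) / 2]
        image += [top, bottom]
    return image


# Hypothetical 4 x 4 "image patch" used only to exercise the functions.
patch = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
bands = haar_dwt2(patch)        # four 2 x 2 sub-bands, half the resolution each
restored = haar_idwt2(*bands)   # exact reconstruction of the original patch
```

Each sub-band has half the spatial resolution of the input, yet together the four bands retain all of the information, so no detail is discarded when the resolution is reduced.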
The Wave-ViT model demonstrated the highest accuracy at 88.97%, surpassing the transformer (87.65%), residual network (85.44%), and multi-layer perceptron (81.17%). Additionally, Wave-ViT exhibited low computational complexity with significantly reduced parameter size, making it highly efficient for real-time clinical applications.
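To make the size of these differences explicit, the short sketch below ranks the four models by their reported accuracy and computes the margin of Wave-ViT over each baseline in percentage points. The accuracy values are those stated above; the variable and function names are assumptions introduced for this example.

```python
# Accuracies (%) as reported in the abstract.
reported_accuracy = {
    "Wave-ViT": 88.97,
    "Transformer": 87.65,
    "Residual network": 85.44,
    "Multi-layer perceptron": 81.17,
}


def margins_over_baselines(scores, reference):
    """Percentage-point gap between the reference model and every other model."""
    ref = scores[reference]
    return {
        name: round(ref - acc, 2)
        for name, acc in scores.items()
        if name != reference
    }


# Models ordered from highest to lowest reported accuracy.
ranking = sorted(reported_accuracy, key=reported_accuracy.get, reverse=True)
margins = margins_over_baselines(reported_accuracy, "Wave-ViT")

for name in ranking[1:]:
    print(f"Wave-ViT exceeds {name} by {margins[name]:.2f} percentage points")
```

On these figures, the margin over the transformer is 1.32 percentage points, compared with 3.53 over the residual network and 7.80 over the multi-layer perceptron.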
Deep learning technology, particularly the Frequency-Domain Transformer model (Wave-ViT), shows promise in improving the precision of pathological classification and staging of EAC. Applying this model enhances the automation of the diagnostic process and may support early detection and treatment of EAC. Future research may further explore the potential of this model in broader medical image analysis applications, particularly in the field of precision medicine.
Core Tip: This study demonstrates the application of deep learning models, particularly Wave-Vision Transformer, for the pathological classification and staging of esophageal cancer. Wave-Vision Transformer outperformed other models such as transformer, residual network, and multi-layer perceptron, achieving the highest accuracy of 88.97% with low computational complexity. This innovative approach shows promise for improving early detection and personalized treatment strategies for esophageal cancer, potentially enhancing clinical outcomes in real-time applications.