Retrospective Study
Copyright ©The Author(s) 2020. Published by Baishideng Publishing Group Inc. All rights reserved.
World J Clin Oncol. Nov 24, 2020; 11(11): 918-934
Published online Nov 24, 2020. doi: 10.5306/wjco.v11.i11.918
Artificial intelligence in dentistry: Harnessing big data to predict oral cancer survival
Man Hung, Jungweon Park, Eric S Hon, Jerry Bounsanga, Sara Moazzami, Bianca Ruiz-Negrón, Dawei Wang
Man Hung, Jungweon Park, Sara Moazzami, College of Dental Medicine, Roseman University of Health Sciences, South Jordan, UT 84095, United States
Man Hung, Department of Orthopaedic Surgery Operations, University of Utah, Salt Lake City, UT 84108, United States
Man Hung, College of Social Work, University of Utah, Salt Lake City, UT 84112, United States
Man Hung, Division of Public Health, University of Utah, Salt Lake City, UT 84108, United States
Man Hung, Department of Educational Psychology, University of Utah, Salt Lake City, UT 84109, United States
Eric S Hon, Department of Economics, University of Chicago, Chicago, IL 60637, United States
Jerry Bounsanga, Research Section, Utah Medical Education Council, Salt Lake City, UT 84102, United States
Bianca Ruiz-Negrón, College of Social and Behavioral Sciences, University of Utah, Salt Lake City, UT 84112, United States
Dawei Wang, Data Analytics Unit, Walmart Inc., Bentonville, AR 72716, United States
Author contributions: Hung M and Hon ES contributed to study conception; Hung M provided study supervision; Hung M and Wang D contributed to research design, data analysis, visualization and results interpretation; Hung M, Hon ES and Bounsanga J contributed to data acquisition; Hung M, Park J, Moazzami S, Ruiz-Negrón B and Wang D contributed to manuscript drafting; Hung M, Park J, Hon ES, Bounsanga J and Wang D contributed to manuscript revision; all authors approved the final version of the manuscript.
Institutional review board statement: This is not a human subject research study. Per United States federal regulations (45 CFR 46, category 4), this study is deemed exempt and does not require Institutional Review Board review because the data were deidentified and publicly available.
Informed consent statement: This is not a human subject research study. This study used secondary data that had already been collected and were publicly available online. Therefore, a signed informed consent form is not relevant.
Conflict-of-interest statement: The authors declare that there is no conflict of interest regarding this work.
Data sharing statement: The data supporting the findings of this study can be accessed at: https://seer.cancer.gov/.
Open-Access: This article is an open-access article that was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution NonCommercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/Licenses/by-nc/4.0/
Corresponding author: Man Hung, PhD, Professor, Research Dean, College of Dental Medicine, Roseman University of Health Sciences, 10894 S River Front Parkway, South Jordan, UT 84095, United States. mhung@roseman.edu
Received: June 24, 2020
Peer-review started: June 24, 2020
First decision: September 18, 2020
Revised: October 6, 2020
Accepted: October 20, 2020
Article in press: October 20, 2020
Published online: November 24, 2020
Processing time: 147 Days and 10.3 Hours
Abstract
BACKGROUND

Oral cancer is the sixth most prevalent cancer worldwide. Public knowledge of oral cancer risk factors and survival is limited.

AIM

To develop machine learning (ML) algorithms that predict the length of survival for individuals diagnosed with oral cancer, and to identify the most important factors that shorten or lengthen oral cancer survival.

METHODS

We used the Surveillance, Epidemiology, and End Results (SEER) database for the years 1975 to 2016, which comprised a total of 257880 cases and 94 variables. Four ML techniques were applied for model training and validation. Model accuracy was evaluated using mean absolute error (MAE), mean squared error (MSE), root mean squared error (RMSE), R² and adjusted R².
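The five accuracy measures named above can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function name `regression_metrics`, the toy survival values and the assumed number of predictors are all hypothetical and serve only to show how each measure is computed from observed and predicted survival times.

```python
import math

def regression_metrics(y_true, y_pred, n_predictors):
    """Return MAE, MSE, RMSE, R^2 and adjusted R^2 for paired observations.

    Hypothetical helper for illustration; n_predictors is the number of
    explanatory variables used by the model that produced y_pred.
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal in length")
    if n - n_predictors - 1 <= 0:
        raise ValueError("adjusted R^2 needs more observations than predictors + 1")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n      # mean absolute error
    ss_res = sum(e * e for e in errors)        # residual sum of squares
    mse = ss_res / n                           # mean squared error
    rmse = math.sqrt(mse)                      # root mean squared error

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1 - ss_res / ss_tot
    # Adjusted R^2 penalizes R^2 for the number of predictors in the model.
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2, "adj_r2": adj_r2}

# Toy values, made up for illustration only (not SEER data).
observed = [10, 20, 30, 40, 50]
predicted = [12, 18, 33, 37, 52]
metrics = regression_metrics(observed, predicted, n_predictors=2)
print(metrics)
```

MAE and RMSE are expressed in the same unit as the survival time itself, whereas MSE is in squared units; R² and adjusted R² are unitless.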

RESULTS

The most important factors predictive of oral cancer survival time were age at diagnosis, primary cancer site, tumor size and year of diagnosis. Year of diagnosis refers to the year in which the tumor was first diagnosed; its importance suggests that individuals whose tumors were diagnosed in the modern era tend to have longer survival than those diagnosed in the past. The extreme gradient boosting ML algorithm showed the best performance, with an MAE of 13.55, an MSE of 486.55 and an RMSE of 22.06.
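As a brief consistency check on the reported error measures, RMSE is by definition the square root of MSE, and the two reported values agree to rounding:

```latex
\mathrm{RMSE} = \sqrt{\mathrm{MSE}} = \sqrt{486.55} \approx 22.06
```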

CONCLUSION

Using artificial intelligence, we developed a tool that can be used for oral cancer survival prediction and for medical decision-making. The finding relating to the year of diagnosis represents an important new discovery in the literature. The results of this study have implications for cancer prevention and public education.

Keywords: Oral cancer survival; Machine learning; Artificial intelligence; Dental medicine; Public health; Surveillance, Epidemiology, and End Results; Quality of life

Core Tip: Oral cancer is the sixth most prevalent cancer worldwide. The goal of this study was to develop machine learning algorithms that predict the length of oral cancer survival and to identify the most important factors that influence it. Age at diagnosis, primary cancer site, tumor size and year of diagnosis were found to be the most important factors predictive of oral cancer survival. Year of diagnosis represents an important new discovery in the literature. Using artificial intelligence, we developed a tool that can be used for oral cancer survival prediction and for medical decision-making.