Retrospective Study
Copyright ©The Author(s) 2020.
World J Clin Oncol. Nov 24, 2020; 11(11): 918-934
Published online Nov 24, 2020. doi: 10.5306/wjco.v11.i11.918
Table 1 Number of oral cancer cases from various anatomical sites
| ICD-O-3 codes | Sites | Number of cases |
| --- | --- | --- |
| C000 | External upper lip | 413 |
| C001 | External lower lip | 2444 |
| C002 | External lip, NOS | 92 |
| C003 | Mucosa of upper lip | 104 |
| C004 | Mucosa of lower lip | 567 |
| C005 | Mucosa of lip, NOS | 29 |
| C006 | Commissure of lip | 85 |
| C008 | Overlapping lesion of lip | 46 |
| C009 | Lip, NOS (excludes skin of lip C44.0) | 153 |
| C019 | Base of tongue, NOS | 10840 |
| C020 | Dorsal surface of tongue, NOS | 652 |
| C021 | Border of tongue | 2632 |
| C022 | Ventral surface of tongue, NOS | 1688 |
| C023 | Anterior 2/3 of tongue, NOS | 2807 |
| C024 | Lingual tonsil | 170 |
| C028 | Overlapping lesion of tongue | 581 |
| C029 | Tongue, NOS | 3050 |
| C030 | Upper gum | 821 |
| C031 | Lower gum | 1680 |
| C039 | Gum, NOS | 210 |
| C040 | Anterior floor of mouth | 1362 |
| C041 | Lateral floor of mouth | 352 |
| C048 | Overlapping lesion of floor of mouth | 136 |
| C049 | Floor of mouth, NOS | 2284 |
| C050 | Hard palate | 1155 |
| C051 | Soft palate, NOS (excludes nasopharyngeal surface of soft palate C11.3) | 1301 |
| C052 | Uvula | 180 |
| C058 | Overlapping lesion of palate | 206 |
| C059 | Palate, NOS | 154 |
| C060 | Cheek mucosa | 1787 |
| C061 | Vestibule of mouth | 134 |
| C062 | Retromolar area | 1413 |
| C068 | Overlapping lesion of other and unspecified parts of mouth | 142 |
| C069 | Mouth, NOS | 487 |
| C079 | Parotid gland | 7111 |
| C080 | Submandibular gland | 1149 |
| C081 | Sublingual gland | 94 |
| C088 | Overlapping lesion of major salivary glands | 6 |
| C089 | Major salivary gland, NOS (excludes minor salivary gland, NOS C06.9) | 287 |
| C090 | Tonsillar fossa | 1735 |
| C091 | Tonsillar pillar | 888 |
| C098 | Overlapping lesion of tonsil | 109 |
| C099 | Tonsil, NOS (excludes lingual tonsil C02.4 and pharyngeal tonsil C11.1) | 9521 |
| C100 | Vallecula | 282 |
| C101 | Anterior surface of epiglottis | 88 |
| C102 | Lateral wall of oropharynx | 184 |
| C103 | Posterior wall of oropharynx | 246 |
| C104 | Branchial cleft (site of neoplasm) | 37 |
| C108 | Overlapping lesion of oropharynx | 277 |
| C109 | Oropharynx, NOS | 940 |
| C129 | Pyriform sinus | 1707 |
| C130 | Postcricoid region | 78 |
| C131 | Hypopharyngeal aspect of aryepiglottic fold, NOS (excludes laryngeal aspect of aryepiglottic fold C32.1) | 214 |
| C132 | Posterior wall of hypopharynx | 250 |
| C138 | Overlapping lesion of hypopharynx | 113 |
| C139 | Hypopharynx, NOS | 816 |
| C739 | Thyroid gland | 111425 |
Table 2 List of all 10 variables included in the final machine learning model building and validation
| Variables | Variable description |
| --- | --- |
| Age at diagnosis | This data item represents the age of the patient at diagnosis for this cancer. The code is three digits and represents the patient’s actual age in years |
| Year of diagnosis | The year of diagnosis is the year the tumor was first diagnosed by a recognized medical practitioner, whether clinically or microscopically confirmed |
| Month of diagnosis | The month of diagnosis is the month the tumor was first diagnosed by a recognized medical practitioner, whether clinically or microscopically confirmed |
| Primary site | This data item identifies the site in which the primary tumor originated. See the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3)[18] for topography codes. The decimal point is eliminated |
| CS tumor size | Information on tumor size. Available for 2004-2015 diagnosis years. Earlier cases may be converted and new codes added which weren't available for use prior to the current version of CS. For more information, see http://seer.cancer.gov/seerstat/variables/seer/ajcc-stage[19] |
| CS extension | Information on extension of the tumor. Available for 2004-2015 diagnosis years. Earlier cases may be converted and new codes added which weren't available for use prior to the current version of CS. For more information, see http://seer.cancer.gov/seerstat/variables/seer/ajcc-stage[19] |
| CS lymph nodes eval | Available for 2004-2015, but not required for the entire timeframe. Will be blank in cases not collected. For more information, see http://seer.cancer.gov/seerstat/variables/seer/ajcc-stage[19] |
| Derived AJCC stage group | This is the AJCC “Stage Group” component that is derived from CS detailed site-specific codes, using the CS algorithm, effective with 2004-2015 diagnosis years. See the CS site-specific schema for details (http://seer.cancer.gov/seerstat/variables/seer/ajcc-stage)[19] |
| RX Summ-surg prim site | Surgery of primary site describes a surgical procedure that removes and/or destroys tissue of the primary site performed as part of the initial work-up or first course of therapy |
| Site recode ICD-O-3/WHO 2008 | A recode based on primary site and ICD-O-3 Histology in order to make analyses of site/histology groups easier. For example, the lymphomas are excluded from stomach and Kaposi and mesothelioma are separate categories based on histology. For more information, see http://seer.cancer.gov/siterecode/icdo3_dwhoheme/index.html[20] |
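The "Primary site" entry notes that the decimal point of the ICD-O-3 topography code is eliminated, which is why Table 1 lists C000 rather than C00.0. A minimal Python sketch of that conversion follows; the function name `to_site_code` and its validation rules are illustrative assumptions, not part of the SEER tooling or the authors' pipeline.

```python
def to_site_code(topography: str) -> str:
    """Convert an ICD-O-3 topography code such as 'C02.4' to the
    decimal-free form used in Table 1 ('C024').

    Hypothetical helper for illustration only.
    """
    code = topography.strip().upper().replace(".", "")
    # After removing the decimal point we expect 'C' followed by three digits.
    if len(code) != 4 or code[0] != "C" or not code[1:].isdigit():
        raise ValueError(f"not an ICD-O-3 topography code: {topography!r}")
    return code


if __name__ == "__main__":
    for raw in ("C00.0", "C02.4", "C73.9"):
        print(raw, "->", to_site_code(raw))
```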
Table 3 Demographic characteristics of the sample (n = 177714)
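| Variable | Mean | SD | Median | n | % |
| --- | --- | --- | --- | --- | --- |
| Survival months/mo | 60.35 | 40.98 | 54.00 | | |
| Age at diagnosis/yr | 54.62 | 16.10 | 55.00 | | |
| Tumor size/(ID, cm) | 22.56 | 21.74 | 19.00 | | |
| Marital status | | | | | |
| Single | | | | 35688 | 20.08 |
| Married | | | | 110480 | 62.17 |
| Separated | | | | 1746 | 0.98 |
| Divorced | | | | 16401 | 9.23 |
| Widowed | | | | 13055 | 7.35 |
| Unmarried or domestic partner | | | | 344 | 0.19 |
| Sex | | | | | |
| Male | | | | 72179 | 40.62 |
| Female | | | | 105535 | 59.38 |
| Race | | | | | |
| White | | | | 148556 | 83.60 |
| Black | | | | 16051 | 9.03 |
| Other | | | | 13107 | 7.38 |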
Table 4 Machine learning model performance
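| Performance indicators | Linear regression | Decision tree | Random forest | XGBoost |
| --- | --- | --- | --- | --- |
| MSE | 647.49 | 538.30 | 489.58 | 486.55 |
| RMSE | 25.45 | 23.20 | 22.13 | 22.06 |
| MAE | 18.21 | 14.45 | 13.63 | 13.55 |
| R² score | 0.620 | 0.681 | 0.709 | 0.711 |
| Adjusted R² score | 0.620 | 0.681 | 0.709 | 0.711 |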
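The indicators in Table 4 are related by standard definitions: RMSE is the square root of MSE, and adjusted R² rescales R² by the number of observations and predictors. The Python sketch below shows one conventional way to compute them. It is not the authors' code; the function names are hypothetical, and the values n = 177714 (the full sample from Table 3) and p = 10 (the variable count from Table 2) are assumptions, since the size of the evaluation set is not stated here.

```python
import math


def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def regression_metrics(y_true, y_pred, n_features):
    """Return MSE, RMSE, MAE, R^2 and adjusted R^2 for paired values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)           # residual sum of squares
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    r2 = 1.0 - ss_res / ss_tot
    return {
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": sum(abs(r) for r in residuals) / n,
        "r2": r2,
        "adj_r2": adjusted_r2(r2, n, n_features),
    }


if __name__ == "__main__":
    # Toy data, not taken from the study.
    print(regression_metrics([10, 20, 30, 40, 50], [12, 18, 33, 37, 52], 1))
    # RMSE follows from MSE: the linear regression column of Table 4.
    print(round(math.sqrt(647.49), 2))
    # Assumed n and p (see lead-in): the adjustment is negligible.
    print(round(adjusted_r2(0.620, 177714, 10), 3))
```

Under these assumptions the correction factor (n - 1)/(n - p - 1) is about 1.00006, which is consistent with the R² and adjusted R² rows agreeing to three decimal places.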