Hajizadeh N, Pourhoseingholi MA, Baghestani AR, Abadi A, Zali MR. Bayesian adjustment of gastric cancer mortality rate in the presence of misclassification. World J Gastrointest Oncol 2017; 9(4): 160-165 [PMID: 28451063 DOI: 10.4251/wjgo.v9.i4.160]
Corresponding Author of This Article
Mohamad Amin Pourhoseingholi, PhD, Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Arabi Ave, Daneshjoo Blvd, Velenjak, Tehran 1985717413, Iran. amin_phg@yahoo.com
Research Domain of This Article
Gastroenterology & Hepatology
Article-Type of This Article
Observational Study
Open-Access Policy of This Article
This article is an open-access article which was selected by an in-house editor and fully peer-reviewed by external reviewers. It is distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/
Nastaran Hajizadeh, Ahmad Reza Baghestani, Alireza Abadi, Department of Biostatistics, Shahid Beheshti University of Medical Sciences, Tehran 1971653313, Iran
Mohamad Amin Pourhoseingholi, Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
Mohammad Reza Zali, Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran 1985717413, Iran
Author contributions: Pourhoseingholi MA was the principal investigator and contributed to writing the manuscript; Hajizadeh N contributed to study conception and data analysis; Baghestani AR and Abadi A contributed to study conception and design; Zali MR contributed to interpretation of the results; all authors contributed to editing, reviewing and final approval of the article.
Institutional review board statement: The study was reviewed and approved by the research committee of the Research Institute for Gastroenterology and Liver Diseases (Tehran).
Informed consent statement: Hereby it is attested that this manuscript which is submitted for publication in World Journal of Gastrointestinal Oncology has been read and approved by all authors, has not been published, totally or partly, in any other journal.
Conflict-of-interest statement: There are no conflicts of interest to report.
Data sharing statement: No additional data are available.
Correspondence to: Mohamad Amin Pourhoseingholi, PhD, Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Arabi Ave, Daneshjoo Blvd, Velenjak, Tehran 1985717413, Iran. amin_phg@yahoo.com
Telephone: +98-21-22432515 Fax: +98-21-22432517
Received: August 25, 2016 Peer-review started: August 26, 2016 First decision: September 27, 2016 Revised: December 24, 2016 Accepted: January 11, 2017 Article in press: January 12, 2017 Published online: April 15, 2017 Processing time: 227 Days and 23.3 Hours
Abstract
AIM
To correct for misclassification error in the registration of causes of death in the Iranian death registry using a Bayesian method.
METHODS
National death statistics for gastric cancer from 2006 to 2010, reported annually by the Ministry of Health and Medical Education, were included in this study. To correct the gastric cancer mortality rate by reassigning gastric cancer deaths that had been registered as cancer without detail, a Bayesian method was implemented with Poisson count regression and a beta prior for the misclassification parameter, assuming 20% misclassification in registering causes of death in Iran.
RESULTS
Registered mortality due to gastric cancer from 2006 to 2010 was considered in this study. According to the Bayesian re-estimate, about 3%-7% of deaths due to gastric cancer were registered as cancer without mention of details. This produces an undercount of gastric cancer mortality in the Iranian population. The number and age standardized rate of gastric cancer deaths were estimated to be 5805 (10.17 per 100000 population), 5862 (10.51 per 100000 population), 5731 (10.23 per 100000 population), 5946 (10.44 per 100000 population), and 6002 (10.35 per 100000 population) for the years 2006 to 2010, respectively.
CONCLUSION
Gastric cancer mortality is undercounted in Iranian registered data, and researchers and authorities should take this into account in subsequent estimations and policy making.
Core tip: Some deaths are registered under causes that cannot or should not be considered the underlying cause of death, such as cancer without mention of the type. These cases are not included in estimates of cause specific mortality rates, which leads to under-estimation of health risks and burden of disease. The aim of this study is to correct the misclassification of gastric cancer deaths into the cancer without label group using a Bayesian method.
INTRODUCTION
Cancer is one of the major health problems in the world and is the third leading cause of death in Iran (after cardiovascular disease and injuries)[1]. Gastric cancer is a disease in which the cells of the inner lining of the stomach start to divide abnormally and uncontrollably, forming a mass called a tumor[2]. Gastric cancer is the seventh leading cause of all deaths in Iran, the leading cause of cancer death in Iranian men and the second leading cause of cancer death (after breast cancer) in Iranian women[3]. Mortality from gastric cancer is high because the disease shows no symptoms in its early stages and is diagnosed only in its final stages[4].
Burden of disease is used to evaluate the health status of a country and to prioritize risk factors when setting up cancer control programs. Cancer registry data are important for estimating the burden of disease and for monitoring the effects of screening programs, early diagnosis and other prognostic factors, and they can guide policy makers toward appropriate cancer prevention programs. Among medical indices, mortality is a familiar measure for assessing the burden of disease. Achieving this aim, however, requires a reliable death registry system that reports death statistics accurately and completely[5-7].
In Iran, among the four vital events (births, marriages, divorces and deaths) registered by the National Organization for Civil Registration (NOCR), mortality had the worst data quality. There was some progress in registering deaths, but problems such as delayed registration and inaccurate recording of causes of death remained until 2002, when the Deputy of Research and Technology of the Ministry of Health and Medical Education started a system to record causes of death. This system did not allow delayed registration of deaths, but the causes of death remained susceptible to information bias due to misclassification[8]. Most high-income and many middle-income countries have a complete vital registration system in which the majority of deaths receive a death certificate completed by a physician[9]. Even so, completing death certificates and coding the underlying cause of death according to standardized international rules remains challenging for a number of causes of death[10-13]. In some cases, especially in developing countries, the cause of death is recorded with error[14,15]. For example, if a death due to gastric cancer is labeled as a death due to any other cause, a misclassification error in the outcome occurs.
Misclassification error makes the registered data inaccurate and often leads to major problems like biased estimates of burden and health risks in epidemiological analysis[16,17].
According to the Iranian death registry, about 15% to 20% of death statistics are recorded in misclassified categories such as cardiopulmonary arrest, old age without dementia, septicemia, unknown, cancer without mention of details, and other ill-defined conditions. Murray and Lopez in 1996, for the first time, introduced the term “garbage coding” for assigning deaths to causes that are not useful for public health analysis of cause-of-death data[18-21].
In developing countries like Iran, where registration is not completely accurate, statistical methods can help to overcome this problem. Two statistical approaches are recommended for dealing with misclassification. The first uses a small validation sample and extends the results to the population[22]. The second is Bayesian analysis, a flexible method that combines prior information about a subset of the parameters with the observed data to obtain a posterior distribution, which is the basis for inference and for correcting the statistics. Bayesian models can also easily accommodate unobserved variables, such as an individual’s true status in the presence of misclassification error[23]. The aim of this study is to use a Bayesian method to estimate the rate of misclassification that occurs when deaths due to gastric cancer are registered as cancer with no label in Iran’s cancer registry system.
MATERIALS AND METHODS
Mortality rates for gastric cancer and for cancer without label from 2006 to 2010 were extracted from the Iranian annual death statistics, reported by Iran’s Ministry of Health and Medical Education, for two sex groups (male and female) and four age groups (under 15 years, 15 to 49 years, 50 to 69 years, and 70 years and over).
Reassigning deaths from garbage codes to valid causes can be divided into three steps. The first is identifying the garbage codes. The second is identifying the target causes to which deaths assigned to a garbage code should in principle be reassigned; for example, a cause of death registered as cancer without mention of the type is a garbage code that should be reassigned to a specific cancer. The third is choosing the fraction of deaths assigned to the garbage code that should be reallocated to the target cause[13]. In this study, cancer without label is treated as the garbage code, because it is the category most likely to be registered as the cause of death in place of a specific cancer such as gastric cancer.
The data were entered into the Bayesian model as two vectors, $y_1 = [y_{11}, y_{21}, \ldots, y_{r1}]$ for gastric cancer and $y_2 = [y_{12}, y_{22}, \ldots, y_{r2}]$ for cancer without label. Both $y_1$ and $y_2$ are count data and follow the Poisson distribution. The subscript $r$ is the number of covariate patterns formed by the age and sex group combinations. $\theta$ is the probability that a death from gastric cancer is incorrectly registered in the cancer without label group. For Bayesian inference, an informative beta prior distribution was assumed for the misclassification parameter, i.e., $\theta \sim \text{Beta}(a, b)$. The parameters of the beta prior were set to $a = 20$ and $b = 80$ (prior mean 0.20), based on Iranian annual cancer registration reports.
Since $\theta$ is an unknown parameter, a latent variable approach was employed to simplify the full conditional distributions. Let $U_i$ be the number of counts from the first group that are incorrectly labeled as being in the misclassified group, with $U_i \mid \theta, y_1, y_2 \sim \text{Binomial}(y_{i2}, P_i)$ and $P_i = \lambda_{i1}\theta / (\lambda_{i1}\theta + \lambda_{i2})$, where $\lambda_{i1}$ and $\lambda_{i2}$ are the Poisson rates of the two groups in covariate pattern $i$. The conditional posterior distribution of $\theta$ then takes the form $\theta \mid U, y_1, y_2 \sim \text{Beta}(\sum_i U_i + a, \sum_i y_{i1} + b)$.
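The beta form of the conditional posterior for θ can be checked with a short derivation. The sketch below rests on an assumption the text does not spell out: true gastric cancer deaths in covariate pattern i have Poisson rate λ_i1, and each one is mislabeled independently with probability θ, so the correctly labeled and mislabeled counts are independent Poisson variables (Poisson thinning).

$$
\begin{aligned}
y_{i1} \mid \lambda_{i1}, \theta &\sim \text{Poisson}\big(\lambda_{i1}(1-\theta)\big), \qquad U_i \mid \lambda_{i1}, \theta \sim \text{Poisson}\big(\lambda_{i1}\theta\big), \\
p(\theta \mid U, y_1, \lambda) &\propto \theta^{a-1}(1-\theta)^{b-1} \prod_{i=1}^{r} \theta^{U_i}(1-\theta)^{y_{i1}} \\
&= \theta^{\sum_i U_i + a - 1}\,(1-\theta)^{\sum_i y_{i1} + b - 1}.
\end{aligned}
$$

This is the kernel of a Beta(Σ_i U_i + a, Σ_i y_i1 + b) distribution. Under this assumption, the sum in the second parameter runs over the observed gastric cancer counts y_i1; the exponential terms of the two Poisson densities multiply to exp(-λ_i1) and do not involve θ.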
The misclassification parameter was estimated using a Gibbs sampling algorithm, taking the average of the sampled values as the estimate. Analyses were performed using R software version 3.2.0.
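A Gibbs sampler of this kind can be sketched in a few lines. The Python code below is an illustration only and is not the authors' R implementation. It makes several assumptions that go beyond the text: the Poisson rates have no covariates and are given independent Gamma(c, d) priors (the study used Poisson regression on age and sex), the beta update uses the observed gastric cancer counts, and the counts `y1` and `y2` are made-up numbers for eight hypothetical age and sex patterns. The function name `gibbs_theta` and all parameter names are ours.

```python
import random


def gibbs_theta(y1, y2, a=20.0, b=80.0, c=0.5, d=0.001,
                n_iter=2000, burn_in=500, seed=1):
    """Gibbs sampler for the misclassification probability theta.

    y1: observed counts for the specific cause (e.g. gastric cancer)
    y2: observed counts for the garbage code (e.g. cancer without label)
    a, b: beta prior parameters for theta
    c, d: ASSUMED gamma prior (shape, rate) for the Poisson rates
    """
    rng = random.Random(seed)
    r = len(y1)
    theta = a / (a + b)                      # start at the prior mean
    lam1 = [float(max(y, 1)) for y in y1]    # crude starting rates
    lam2 = [float(max(y, 1)) for y in y2]
    draws = []
    for it in range(n_iter):
        # 1. Latent counts: U_i ~ Binomial(y_i2, P_i)
        u = []
        for i in range(r):
            p = lam1[i] * theta / (lam1[i] * theta + lam2[i])
            u.append(sum(rng.random() < p for _ in range(y2[i])))
        # 2. theta ~ Beta(sum U_i + a, sum y_i1 + b)
        theta = rng.betavariate(a + sum(u), b + sum(y1))
        # 3. Rates from their gamma full conditionals
        #    (random.gammavariate takes shape and SCALE, so scale = 1 / rate)
        for i in range(r):
            lam1[i] = rng.gammavariate(y1[i] + u[i] + c, 1.0 / (1.0 + d))
            lam2[i] = rng.gammavariate(y2[i] - u[i] + c, 1.0 / (1.0 + d))
        if it >= burn_in:
            draws.append(theta)
    return sum(draws) / len(draws), draws


# Hypothetical counts for 2 sexes x 4 age groups (NOT the registry data).
y1 = [2, 40, 150, 260, 1, 25, 80, 140]
y2 = [3, 20, 45, 60, 2, 15, 30, 40]

theta_hat, theta_draws = gibbs_theta(y1, y2)
print(f"posterior mean of theta: {theta_hat:.3f}")
```

The posterior mean returned here plays the role of the averaged estimate described above. With only the two observed count vectors, θ is weakly identified, so the result depends heavily on the beta prior.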
RESULTS
Mortality data consisting of all deaths due to gastric cancer from 2006 to 2010 were considered in this study. The age standardized rate (ASR) of gastric cancer mortality was 9.69 per 100000 population in 2006, 10.2 in 2007, 9.93 in 2008, 9.76 in 2009 and 9.67 in 2010. According to the Bayesian estimation, in 2006 between 3% and 7% of deaths whose underlying cause was gastric cancer were registered as cancer without mention of details. The estimated percentage of misclassification from the Bayesian method for the years 2006 to 2010 is shown in Table 1. This percentage was subtracted from the deaths registered as cancer without mention of details and added to the number of deaths due to gastric cancer. After Bayesian correction, the age standardized rate of gastric cancer per 100000 population was estimated to be 10.17 in 2006, 10.51 in 2007, 10.23 in 2008, 10.44 in 2009 and 10.35 in 2010. The age standardized rate of gastric cancer before and after Bayesian correction for 2006 to 2010 is shown in Figure 1. The number of gastric cancer deaths before and after Bayesian correction of misclassification for the years 2006 to 2010 is shown in Table 1, and its trend is shown in Figure 2.
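The size of the correction can be checked directly from the ASR values reported in this section. The short Python sketch below computes the relative increase in ASR for each year. The numbers are those given in the text; the variable and function names are ours. These figures describe the change in the rate and are not the estimates of θ reported in Table 1.

```python
# ASR per 100000 population as reported in the text,
# before and after Bayesian correction.
asr_before = {2006: 9.69, 2007: 10.2, 2008: 9.93, 2009: 9.76, 2010: 9.67}
asr_after = {2006: 10.17, 2007: 10.51, 2008: 10.23, 2009: 10.44, 2010: 10.35}


def relative_increase(before, after):
    """Percentage increase of `after` relative to `before`."""
    return 100.0 * (after - before) / before


increase = {
    year: relative_increase(asr_before[year], asr_after[year])
    for year in asr_before
}

for year in sorted(increase):
    print(f"{year}: {increase[year]:.2f}%")
```

The relative increases fall between roughly 3% and 7%, with the largest corrections in 2009 and 2010.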
Table 1 Misclassified parameter and the number of gastric cancer death before and after Bayesian correction and percent of increase in number of deaths after Bayesian correction in Iran’s death registry 2006-2010.
Figure 1 Age standardized rate of gastric cancer mortality in Iran from 2006 to 2010, before and after Bayesian correction of misclassification in causes of death.
ASR: Age standardized rate.
Figure 2 Crude number of gastric cancer mortality in Iran from 2006 to 2010, before and after Bayesian correction of misclassification in causes of death.
DISCUSSION
Iran’s death registry is subject to misclassification in reporting the underlying cause of death. About 3%-7% of deaths due to gastric cancer are registered as cancer without mention of the type of cancer. After correcting for misclassification error in the death registry data, the number of deaths due to gastric cancer and its age standardized rate both increased. The crude gastric cancer mortality count in Iran had an increasing trend from 2006 to 2010, except for 2008, which might be due to incomplete data; the age standardized rate of gastric cancer, however, decreased from 2007 onward (except for 2008). About two-thirds of gastric cancer cases occur in developing countries[24-27], and rates are generally about twice as high in men as in women[28]. The age standardized rates (ASR) of gastric cancer incidence and mortality per 100000 population, based on the GLOBOCAN 2012 report, are shown in Table 2. The ASR of gastric cancer incidence (15.8 per 100000) and the ASR of gastric cancer mortality (11.7 per 100000) are highest in Asia compared with other continents; rates are moderate in Europe and South America and lowest in Northern America and most parts of Africa[3,28].
Table 2 Incidence and mortality age standardized rates per 100000 population due to gastric cancer for some continents, reported by GLOBOCAN 2012.
Continent | Incidence ASR | Mortality ASR
World | 12.1 | 8.9
Asia | 15.8 | 11.7
Europe | 9.4 | 6.9
South America | 10.3 | 8.5
North America | 4.0 | 2.1
Africa | 3.8 | 3.5
The age standardized rates of incidence and mortality per 100000 population in different regions of Asia, based on the GLOBOCAN 2012 report, are shown in Table 3. Incidence and mortality rates are higher in Eastern Asia than in other Asian regions. This region includes China, Japan and South Korea, the three countries with the highest gastric cancer incidence and mortality rates[29]. Gastric cancer is the most frequently diagnosed form of cancer in Iran[30], with an incidence rate of 15.3 per 100000 and a mortality rate of 12.9 per 100000 population based on the GLOBOCAN 2012 report[3]. A steady decline in gastric cancer incidence and mortality rates has been observed in most countries of Northern America and Europe since the middle of the 20th century[31,32]. In recent years, similar decreasing trends have been noted in areas with historically high rates of gastric cancer, including some countries in Asia (Japan, China, and South Korea), Latin America (Colombia and Ecuador), and Europe (Ukraine)[33]. This reduction may be due to improved sanitation and antibiotics, and the consequent reduction in chronic H. pylori infection[34]. Although age-adjusted rates have decreased, crude rates are estimated to rise substantially between 2000 and 2020 because of the increasing size and age of the world population, especially in developing countries[35,36].
Table 3 Incidence and mortality age standardized rates per 100000 population due to gastric cancer for different regions of Asia, reported by GLOBOCAN 2012.
Region | Incidence ASR | Mortality ASR
Eastern Asia | 24.2 | 16.5
Western Asia | 9.5 | 8.1
South-Central Asia | 6.7 | 6.1
South-Eastern Asia | 6.0 | 5.3
Gastric cancer is a major health problem in the world, especially in Asia, so appropriate policies are needed for allocating resources to gastric cancer control and prevention. This requires an accurate registry system, yet causes of death are misclassified to some extent, especially in developing countries[14,15]. Misclassification of causes of death has been a concern in cancer trend analysis and cancer epidemiology research for decades[14]. Misclassification error leads to under-estimation of cause specific mortality rates and consequently of the burden of disease, and it influences policy making and the prioritization of health risks[10-12,37]. In the study of Khosravi et al[38], validated data from hospital deaths were used to measure the impact of misclassification on rates of cardiovascular disease mortality, but a Bayesian method was not employed. The Bayesian approach has received much attention as a way to correct for misclassification in mortality data. Whittemore and Gong[39] used a Bayesian approach to estimate cervical cancer mortality rates, and Sposto et al[40] developed a maximum likelihood method for assessing the effect of diagnostic misclassification on non-cancer and cancer mortality in atomic-bomb survivors. Stamey et al[41] provided a Bayesian approach that extends the models introduced by Whittemore and Gong[39] and Sposto et al[40]. They assumed that the misclassification parameters are unknown and used prior information on these parameters instead of validation data. They applied their Bayesian approach to estimate the number of deaths due to cancer and non-cancer causes, after correcting for misclassification in registering causes of death, among survivors of the atomic bombings of Hiroshima and Nagasaki[41].
Pourhoseingholi et al[42] extended the models proposed by Stamey et al[41] to re-estimate the rates of cause specific deaths in cancer registry data after correcting for misclassification[25,42,43]. Based on their study of gastric cancer mortality in the Iranian population from 1995 to 2004, misclassification in recording deaths due to gastric cancer was between 30% and 40%[44]. The current study suggests that the accuracy of death registration in Iran has improved in recent years.
In conclusion, gastric cancer mortality is undercounted in the Iranian registration system because of misclassification error in registering causes of death. Although the misclassification rate appears to have fallen, it remains a major problem. Policy makers who use mortality data to set priorities for disease control and prevention should therefore take this under-reporting into account, and causes of death should be registered more accurately. Improving data accuracy requires more expert staffing, stronger foundations, and powerful hardware and software resources[45]. In the absence of validated data, the Bayesian approach is a good and flexible alternative for reducing the effects of misclassification in registered cancer mortality data.
COMMENTS
Background
Mortality data registries are subject to misclassification because some deaths are assigned to causes that cannot be considered the underlying cause of death. For example, if mortality due to a specific cancer is registered as cancer without mention of the type, a misclassification error occurs. The aim of this study is to estimate, using a Bayesian method, the rate at which deaths due to gastric cancer are misclassified into the cancer without label group, and to re-estimate the rate of gastric cancer mortality in Iran.
Research frontiers
In Iran, death registry data are subject to misclassification. Reviewing medical records or conducting verbal autopsies is a practical remedy for misclassification but is time consuming. The focus of this study is the use of a Bayesian method to estimate the rate of misclassification in registering causes of death, which is rapid and cost-effective.
Innovations and breakthroughs
With the Bayesian method, validated data are not needed to estimate the rate of misclassification. Data validation is very costly and time consuming, and in many cases validated data cannot be obtained. Implementing the Bayesian method requires only prior information about the misclassification rate.
Applications
Registered mortality data are used for health policy making and for estimating the burden of disease, so correcting for misclassification in the death registry system yields more precise estimates of death rates and of cause specific burden of disease. This in turn allows better planning for disease control and prevention.
Terminology
Misclassification is a lack of agreement between the observed value and the true value in categorical data. The Bayesian method is a statistical approach that assigns a distribution or a probability to events or parameters based on previous experience or expert opinion, and then revises those probabilities and distributions after obtaining experimental data by applying Bayes’ theorem.
Peer-review
This is an interesting research.
Footnotes
Manuscript source: Invited manuscript
Specialty type: Gastroenterology and hepatology
Country of origin: Iran
Peer-review report classification
Grade A (Excellent): 0
Grade B (Very good): B
Grade C (Good): C
Grade D (Fair): D, D
Grade E (Poor): 0
P- Reviewer: Aoyagi K, Deans C, Lee HC, Shen LZ S- Editor: Kong JX L- Editor: A E- Editor: Lu YJ
REFERENCES
Pourhoseingholi MA, Moghimi-Dehkordi B, Safaee A, Hajizadeh E, Solhpour A, Zali MR. Prognostic factors in gastric cancer using log-normal censored regression model. Indian J Med Res. 2009;129:262-267.
Pourhoseingholi MA, Vahedi M, Moghimi-Dehkordi B, Pourhoseingholi A, Ghafarnejad F, Maserat E, Safaee A, Mansoori BK, Zali MR. Burden of hospitalization for gastrointestinal tract cancer patients - Results from a cross-sectional study in Tehran. Asian Pac J Cancer Prev. 2009;10:107-110.
Sharifian A, Pourhoseingholi MA, Baghestani A, Hajizadeh N, Gholizadeh S. Burden of gastrointestinal cancers and problem of the incomplete information; how to make up the data? Gastroenterol Hepatol Bed Bench. 2016;9:12-17.
Mathers CD, Fat DM, Inoue M, Rao C, Lopez AD. Counting the dead and what they died from: an assessment of the global status of cause of death data. Bull World Health Organ. 2005;83:171-177.
Selikoff IJ, Seidman H. Use of death certificates in epidemiological studies, including occupational hazards: variations in discordance of different asbestos-associated diseases on best evidence ascertainment. Am J Ind Med. 1992;22:481-492.
Yavari P, Sadrolhefazi B, Mohagheghi MA, Madani H, Mosavizadeh A, Nahvijou A, Mehrabi Y, Pourhseingholi MA. An epidemiological analysis of cancer data in an Iranian hospital during the last three decades. Asian Pac J Cancer Prev. 2008;9:145-150.
Lopez AD, Murray CJ, editors. The global burden of disease: a comprehensive assessment of mortality and disability from diseases, injuries, and risk factors in 1990 and projected to 2020. Harvard: Harvard School of Public Health 1996.
Johansson LA, Pavillon G, Anderson R, Glenn D, Griffiths C, Hoyert D, Jackson G, Notzon FS, Rooney C, Rosenberg HM. Counting the dead and what they died of. Bull World Health Organ. 2006;84:254.
Lyles RH. A note on estimating crude odds ratios in case-control studies with differentially misclassified exposure. Biometrics. 2002;58:1034-1036.
Khosravi A, Aghamohamadi S, Kazemi E, Pour Malek F, Shariati M. Mortality Profile in Iran (29 Provinces) over the Years 2006 to 2010. Tehran: Ministry of Health and Medical Education 2013.
Pourhoseingholi MA, Faghihzadeh S, Hajizadeh E, Abadi A, Zali MR. Bayesian estimation of colorectal cancer mortality in the presence of misclassification in Iran. Asian Pac J Cancer Prev. 2009;10:691-694.
Howson CP, Hiyama T, Wynder EL. The decline in gastric cancer: epidemiology of an unplanned triumph. Epidemiol Rev. 1986;8:1-27.
Pisani P, Parkin DM, Bray F, Ferlay J. Estimates of the worldwide mortality from 25 cancers in 1990. Int J Cancer. 1999;83:18-29.
Pourhoseingholi MA, Vahedi M, Baghestani AR. Burden of gastrointestinal cancer in Asia; an overview. Gastroenterol Hepatol Bed Bench. 2015;8:19-27.
Whittemore AS, Gong G. Poisson regression with misclassified counts: application to cervical cancer. J R Stat Soc Ser C Appl Stat. 1991;40:81-93.
Pourhoseingholi MA, Abadi A, Faghihzadeh S, Pourhoseingholi A, Vahedi M, Moghimi-Dehkordi B, Safaee A, Zali MR. Bayesian analysis of esophageal cancer mortality in the presence of misclassification. Italian Journal of Public Health. 2012;8.
Pourhoseingholi MA. Bayesian adjustment for misclassification in cancer registry data. Transl Gastrointest Cancer. 2014;3:144-148.
Pourhoseingholi MA, Faghihzadeh S, Hajizadeh E, Abadi A. Bayesian analysis of gastric cancer mortality in Iranian population. Gastroenterol Hepatol Bed Bench. 2010;3:15-18.
Lankarani KB, Khosravizadegan Z, Rezaianzadeh A, Honarvar B, Moghadami M, Faramarzi H, Mahmoodi M, Farahmand M, Masoompour SM, Nazemzadegan B. Data coverage of a cancer registry in southern Iran before and after implementation of a population-based reporting system: a 10-year trend study. BMC Health Serv Res. 2013;13:169.