Abstract

Background: Missing data on tumour stage is a common problem in population-based cancer registries. Statistical analyses at the level of tumour stage may be biased if no adequate method for handling missing data is applied. To determine a useful way to treat missing data on tumour stage, we examined different imputation models for multiple imputation with chained equations for analysing the stage-specific numbers of cases of malignant melanoma and female breast cancer.

Methods: This analysis was based on the malignant melanoma data set and the female breast cancer data set of the cancer registry of Schleswig-Holstein, Germany. The cases with complete tumour stage information were extracted and their stage information was partly removed according to a missing-at-random (MAR) pattern, resulting in five simulated data sets for each cancer entity. The missing tumour stage values were then treated with multiple imputation with chained equations, using polytomous regression, predictive mean matching, random forests and proportional sampling as imputation models. The estimated tumour stages, stage-specific numbers of cases and survival curves after multiple imputation were compared to the observed ones.

Results: The amount of missing values for malignant melanoma was too high to estimate a reasonable number of cases for each UICC stage. However, multiple imputation of missing stage values led to stage-specific numbers of cases of T-stage for malignant melanoma, as well as T- and UICC-stage for breast cancer, that were close to the observed numbers of cases. The observed tumour stages on the individual level, the stage-specific numbers of cases and the observed survival curves were best met with polytomous regression or predictive mean matching, but not with random forests or proportional sampling, as imputation models.

Conclusions: This limited simulation study indicates that multiple imputation with chained equations is an appropriate technique for dealing with missing information on tumour stage in population-based cancer registries, provided the amount of unstaged cases is at a reasonable level.
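
The simulation design described above (start from complete cases, remove stage values under a MAR mechanism, impute several times, compare stage-specific counts with the observed ones) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the data are synthetic, the single covariate (an age-group flag), the missingness probabilities and all function names are hypothetical, and the imputation step is a simple proportional sampling within age group. It does not reproduce the registry data, the chained-equations procedure, or the polytomous regression, predictive mean matching and random forest models used in the study.

```python
import random
from collections import Counter

STAGES = ["I", "II", "III", "IV"]

def simulate_cases(n, rng):
    """Synthetic complete cases as (is_old, stage); older cases skew to higher stages."""
    cases = []
    for _ in range(n):
        is_old = rng.random() < 0.5
        weights = [0.30, 0.30, 0.25, 0.15] if is_old else [0.50, 0.30, 0.15, 0.05]
        cases.append((is_old, rng.choices(STAGES, weights)[0]))
    return cases

def remove_stage_mar(cases, rng, p_old=0.3, p_young=0.1):
    """MAR: the chance of losing the stage depends only on the observed
    covariate (age group), never on the stage value itself."""
    return [
        (is_old, None if rng.random() < (p_old if is_old else p_young) else stage)
        for is_old, stage in cases
    ]

def impute_proportional(cases, rng):
    """Fill each missing stage with a random draw from the observed stages
    of the same age group (proportional sampling, assumed stratification)."""
    observed = {True: [], False: []}
    for is_old, stage in cases:
        if stage is not None:
            observed[is_old].append(stage)
    return [
        (is_old, stage if stage is not None else rng.choice(observed[is_old]))
        for is_old, stage in cases
    ]

def pooled_stage_counts(cases, rng, m=5):
    """Average the stage-specific counts over m imputed data sets."""
    totals = Counter()
    for _ in range(m):
        totals.update(stage for _, stage in impute_proportional(cases, rng))
    return {stage: totals[stage] / m for stage in STAGES}

rng = random.Random(0)
complete = simulate_cases(2000, rng)
with_missing = remove_stage_mar(complete, rng)
true_counts = Counter(stage for _, stage in complete)
pooled = pooled_stage_counts(with_missing, rng)
for stage in STAGES:
    print(stage, true_counts[stage], round(pooled[stage], 1))
```

Comparing the pooled counts with the counts in the complete data mirrors, in a very reduced form, the comparison of estimated and observed stage-specific numbers of cases. A fuller treatment would also propagate the uncertainty of the imputation model, which this sketch omits.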

Highlights

  • Missing data on tumour stage information is a common problem in population-based cancer registries

  • The T-classification as well as the UICC-classification consists of four main categories, with stage I having a good survival prognosis and stage IV a poor prognosis

  • Six percent of the cases had no information on the T-stage, 11% had missing values in the N-stage and 16% in the M-stage


Summary

Introduction

Missing data on tumour stage information is a common problem in population-based cancer registries. Time trend analysis of cancer incidence is an important indicator in such an evaluation and is often conducted. Time trend analysis of tumour stage-specific incidence is more appropriate but less frequently applied [1,2,3]. Tumour stage is often not known at the time of diagnosis, and if the case is reported to the registry without additional notification, e.g. from the physician or the pathologist, stage information is lost. Some cancer cases are reported only by a pathologist. These notifications generally do not provide any information on lymph node status or metastasis.

