Abstract
Technological advancements in recent years have spurred the use of large databases for research. The availability of these databases has created a need for anonymization and de-identification techniques to be applied before the data are published. De-identification alters the data, which can in turn affect the results derived from the de-identified data and potentially lead to false conclusions. The objective of this study is to investigate whether alterations to a de-identified time-to-event data set can improve the accuracy of the estimates. In this data set, a missing-time bias was present among censored patients, introduced by the de-identification used to preserve patient confidentiality. This study investigates five methods intended to reduce the bias of time-to-event estimates. A simulation study was conducted to evaluate the effectiveness of each method in reducing bias. When there were many censored patients, the simulation showed that Method 4 yielded the most accurate estimates. This method adjusted the survival times of censored patients by adding a random uniform component such that the modified survival time would occur within the final year of the study. Alternatively, when there were only a few censored patients, the method that did not alter the de-identified data set (Method 1) provided the most accurate estimates.
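The adjustment described for Method 4 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `shift_censored_to_final_year`, the parameter `study_length` (in years), the 0/1 event coding, and the rule that a censored time is never shortened are all assumptions made for illustration, since the abstract does not give these details.

```python
import numpy as np


def shift_censored_to_final_year(times, events, study_length, rng=None):
    """Move censored times into the final year of the study (illustrative only).

    times        : follow-up times in years
    events       : 1 = event observed, 0 = censored (assumed coding)
    study_length : total study duration in years (assumed to be known)
    """
    rng = np.random.default_rng() if rng is None else rng
    times = np.asarray(times, dtype=float)
    censored = np.asarray(events) == 0

    if np.any(times[censored] > study_length):
        raise ValueError("censored time exceeds study_length")

    adjusted = times.copy()
    # Assumption: the uniform component is non-negative, so the lower bound
    # is the later of the recorded time and the start of the final year.
    low = np.maximum(times[censored], study_length - 1.0)
    adjusted[censored] = rng.uniform(low, study_length)
    return adjusted


# Hypothetical toy data: two observed events and two censored patients.
times = [0.5, 2.0, 3.2, 4.6]
events = [1, 0, 0, 1]
adjusted = shift_censored_to_final_year(
    times, events, study_length=5.0, rng=np.random.default_rng(0)
)
```

In this sketch, observed event times are left unchanged and only the censored times are redrawn, so that each lies between year 4 and year 5 of a hypothetical five-year study.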