Abstract

Electronic health records (EHRs) can be a cost-effective data source for forming cohorts and developing risk models in the context of disease screening. However, several issues must be addressed: competing outcomes, left-censoring of prevalent disease, interval-censoring of incident disease, and uncertainty about prevalent disease when accurate disease ascertainment is not conducted at baseline. Furthermore, novel tests that are costly and limited in availability can be conducted on stored biospecimens sampled from EHRs with different sampling fractions. We extend sample-weighted semiparametric marginal mixture models to the estimation of competing risks. For flexible modeling of relative risks, we use a general transformation of the subdistribution hazard function and regression parameters. We propose a numerical algorithm for computing the nonparametric maximum likelihood estimates of the subdistribution hazard functions and regression parameters, and we present methods for calculating consistent confidence intervals for relative and absolute risk estimates. Simulation studies show that the proposed algorithm and methods have reliable finite-sample performance. We apply our methods to a cohort assembled from EHRs at a health maintenance organization, estimating the cumulative risk of cervical precancer/cancer and the incidence of infection clearance among human papillomavirus (HPV)-positive women, by HPV genotype. We find no significant difference in 3-year HPV-clearance rates across HPV types, but the 3-year cumulative risk of progression to precancer/cancer is higher for HPV-16 than for the other HPV genotypes.
