In epidemiological studies, it is often easier to collect data only from individuals whose failure events fall within a calendar time interval. This design, known as interval sampling, leads to doubly truncated data. In many situations, the calendar time of the failure event can only be recorded within time intervals, leading to doubly truncated and interval censored (DTIC) data. First, we point out that although the existing methods for DTIC data work adequately under the sampling scheme for doubly truncated data (Scheme 1), Scheme 1 is not realistic for DTIC data. Second, we consider a commonly used sampling scheme (Scheme 2), under which individuals are included in the sample based on diagnosis date. We show that under Scheme 2, because the assumptions of Scheme 1 are violated, the nonparametric maximum likelihood estimator (NPMLE) of the cumulative distribution function is severely biased if the likelihood function for Scheme 1 is used. To overcome this difficulty, we define a target population under which a sampling scheme (Scheme 3) can be implemented, such that appropriate truncation variables can be defined and the NPMLE of the cumulative distribution function can be obtained using the expectation-maximization algorithm. We also consider estimation of the joint distribution function for successive duration times. Using the first failure times imputed from the Scheme 3 NPMLE, we obtain imputed right censored data for the second failure event. Based on the imputed data, we propose a nonparametric estimator of the joint distribution function using the inverse-probability-weighted approach. Simulation studies demonstrate that the proposed method performs well with moderate sample sizes.
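To illustrate how interval sampling produces double truncation, the following is a minimal sketch in Python, not the authors' simulation design. The function name `interval_sample`, the uniform onset dates, the exponential failure-time distribution, and the window endpoints are all assumptions chosen for illustration. An individual with onset date `onset` and failure time `t` is observed only if the event date `onset + t` falls in the sampling window, which is equivalent to `u <= t <= v` with `u = window_start - onset` and `v = window_end - onset`.

```python
import random


def interval_sample(n, window_start, window_end, seed=0):
    """Simulate interval sampling (illustrative assumptions only).

    Returns a list of (u, t, v) triples for the individuals whose failure
    event falls inside the calendar window [window_start, window_end].
    """
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        # Assumed: calendar date of the initiating event is uniform.
        onset = rng.uniform(0.0, window_end)
        # Assumed: failure time is exponential with mean 5.
        t = rng.expovariate(1.0 / 5.0)
        event_date = onset + t
        # Interval sampling: keep only events inside the calendar window.
        if window_start <= event_date <= window_end:
            u = window_start - onset  # left truncation time (may be negative)
            v = window_end - onset    # right truncation time
            sample.append((u, t, v))
    return sample


sample = interval_sample(10000, 20.0, 25.0)
```

Every retained failure time lies between its own pair of truncation times, so the retained values are not a random sample from the failure-time distribution. This is why estimators for such data condition on the truncation interval.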