When the distribution of the truncation time is known up to a finite-dimensional parameter vector, much research has been conducted with the aim of improving the efficiency of estimation for nonparametric or semiparametric models with left-truncated and right-censored (LTRC) data. When the distribution of truncation times is unspecified, one approach is to use the conditional maximum likelihood estimator (cMLE) (Chen and Shen in Lifetime Data Anal https://doi.org/10.1007/s10985-016-9385-9, 2017). Although the cMLE has nice asymptotic properties, it is not efficient, since the conditional likelihood function does not incorporate information on the distribution of the truncation time. In this article, we aim to develop a more efficient estimator by considering the full likelihood function. Following Turnbull (J R Stat Soc B 38:290–295, 1976) and Qin et al. (J Am Stat Assoc 106:1434–1449, 2011), we treat the unobserved (left-truncated) subpopulation as missing data and propose a two-stage approach for obtaining the pseudo maximum likelihood estimator (PMLE) of the regression parameters. In the first stage, the distribution of the left truncation time is estimated by the inverse-probability-weighted (IPW) estimator (Wang in J Am Stat Assoc 86:130–143, 1991). In the second stage, we obtain the pseudo complete-data likelihood function by replacing the distribution of the truncation time with the IPW estimator in the full likelihood. We propose an expectation–maximization algorithm for obtaining the PMLE and establish the consistency of the PMLE. Simulation results show that the PMLE outperforms the cMLE in terms of mean squared error. The PMLE can also be used to analyze length-biased data, where the truncation time is uniformly distributed. We demonstrate that, for length-biased data, the PMLE is more robust to the support assumption on the truncation time than the MLE proposed by Qin et al. (2011). We apply the proposed method to the Channing House data.
While the PMLE is quite appealing in specific cases with independent censoring and time-invariant covariates, its applicability, as shown in the simulation study, can be rather restricted in more general settings.
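
The first stage described in the abstract, estimating the truncation-time distribution by inverse probability weighting, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `product_limit_survival` and `ipw_truncation_cdf` are hypothetical, the survival function is estimated by the product-limit estimator with left-truncation-adjusted risk sets, each observed truncation time is weighted by the reciprocal of the estimated survival just before it, and the estimated survival is assumed to be positive at every observed truncation time.

```python
import numpy as np

def product_limit_survival(trunc, obs, event):
    """Product-limit estimate of S for LTRC data (hypothetical helper).

    trunc: left-truncation times, obs: observed times (event or censoring),
    event: 1 if the event was observed, 0 if right-censored.
    Returns the distinct event times and S-hat evaluated just after each.
    """
    trunc, obs, event = map(np.asarray, (trunc, obs, event))
    times = np.unique(obs[event == 1])
    surv = np.empty(len(times))
    s = 1.0
    for k, u in enumerate(times):
        # Risk set under left truncation: entered by u and still observed at u.
        at_risk = np.sum((trunc <= u) & (u <= obs))
        deaths = np.sum((obs == u) & (event == 1))
        s *= 1.0 - deaths / at_risk
        surv[k] = s
    return times, surv

def ipw_truncation_cdf(trunc, obs, event):
    """IPW estimate of the truncation-time CDF (illustrative sketch).

    Assumption: weights are 1 / S-hat(T_i-), normalized to sum to one.
    Returns the sorted truncation times and the estimated CDF at each.
    """
    trunc = np.asarray(trunc, dtype=float)
    times, surv = product_limit_survival(trunc, obs, event)
    # Number of event times strictly below T_i selects S-hat(T_i-);
    # prepending 1.0 covers truncation times before the first event.
    idx = np.searchsorted(times, trunc, side="left")
    s_before = np.concatenate(([1.0], surv))[idx]
    weights = 1.0 / s_before  # breaks down if S-hat reaches zero early
    weights /= weights.sum()
    order = np.argsort(trunc, kind="stable")
    return trunc[order], np.cumsum(weights[order])

# Toy data, invented purely for illustration.
grid, cdf = ipw_truncation_cdf(
    trunc=[0.0, 0.0, 1.5, 1.5],
    obs=[1.0, 3.0, 2.0, 4.0],
    event=[1, 1, 1, 0],
)
```

In the second stage, an estimate of this kind would be substituted for the unknown truncation distribution in the full likelihood. That step depends on the regression model and is not sketched here.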