A comparison of Kaplan–Meier-based inverse probability of censoring weighted regression methods
Weighting with the inverse probability of censoring is an approach to deal with censoring in regression analyses where the outcome may be missing due to right-censoring. In this paper, three separate approaches involving this idea in a setting where the Kaplan–Meier estimator is used for estimating the censoring probability are compared. In more detail, the three approaches involve weighted regression, regression with a weighted outcome, and regression of a jack-knife pseudo-observation based on a weighted estimator. Expressions of the asymptotic variances are given in each case and the expressions are compared to each other and to the uncensored case. In terms of low asymptotic variance, a clear winner cannot be found. Which approach will have the lowest asymptotic variance depends on the censoring distribution. Expressions of the limit of the standard sandwich variance estimator in the three cases are also provided, revealing an overestimation under the implied assumptions.
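As a rough illustration of the weighting idea described above, the following Python sketch computes inverse probability of censoring weights from a Kaplan-Meier estimate of the censoring distribution. It is not the paper's implementation: the function names `km_censoring_survival` and `ipcw_weights` are hypothetical, and the sketch assumes there are no ties between event and censoring times.

```python
import numpy as np

def km_censoring_survival(time, event):
    """Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).

    Censoring plays the role of the 'event' here, so the indicator is flipped.
    Assumption: no ties between event times and censoring times.
    Returns the sorted distinct observed times and G evaluated at each of them.
    """
    time = np.asarray(time, dtype=float)
    cens = 1 - np.asarray(event, dtype=int)  # 1 where the observation is censored
    grid = np.unique(time)
    surv = np.empty_like(grid)
    g = 1.0
    for k, t in enumerate(grid):
        at_risk = np.sum(time >= t)                 # still under observation at t
        d = np.sum((time == t) & (cens == 1))       # censorings occurring at t
        g *= 1.0 - d / at_risk
        surv[k] = g
    return grid, surv

def ipcw_weights(time, event):
    """Weights event_i / G(T_i-): zero for censored observations."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    grid, surv = km_censoring_survival(time, event)
    # Left limit G(t-): value of G at the largest grid point strictly below t.
    idx = np.searchsorted(grid, time, side="left") - 1
    g_minus = np.where(idx >= 0, surv[np.clip(idx, 0, None)], 1.0)
    w = np.zeros_like(time)
    w[event == 1] = 1.0 / g_minus[event == 1]
    return w

# Toy data: observed times and event indicators (1 = event, 0 = censored).
time = [1.0, 2.0, 3.0, 4.0, 5.0]
event = [1, 0, 1, 0, 1]
w = ipcw_weights(time, event)
print(w)  # weights 1, 0, 4/3, 0, 8/3
```

Under these assumptions, a weighted regression would use `w` as observation weights, whereas a weighted-outcome regression would regress the product of `w` and the outcome on the covariates without weights. Both uses are sketched here only in words, as the abstract does not give the estimating equations.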
- Research Article
19
- 10.1016/j.athoracsur.2011.12.094
- Apr 25, 2012
- The Annals of Thoracic Surgery
Review of Case-Mix Corrected Survival Curves
- Research Article
- 10.1080/03610920601076933
- May 2, 2007
- Communications in Statistics - Theory and Methods
This article considers a class of estimators for the location and scale parameters in the location-scale model based on ‘synthetic data’ when the observations are randomly censored on the right. The asymptotic normality of the estimators is established using counting process and martingale techniques when the censoring distribution is known and unknown, respectively. In the case when the censoring distribution is known, we show that the asymptotic variances of this class of estimators depend on the data transformation and have a lower bound which is not achievable by this class of estimators. However, in the case that the censoring distribution is unknown and estimated by the Kaplan–Meier estimator, this class of estimators has the same asymptotic variance and attains the lower bound for variance for the case of known censoring distribution. This is different from censored regression analysis, where asymptotic variances depend on the data transformation. Our method has three valuable advantages over the method of maximum likelihood estimation. First, our estimators are available in a closed form and do not require an iterative algorithm. Second, simulation studies show that our estimators being moment-based are comparable to maximum likelihood estimators and outperform them when sample size is small and censoring rate is high. Third, our estimators are more robust to model misspecification than maximum likelihood estimators. Therefore, our method can serve as a competitive alternative to the method of maximum likelihood in estimation for location-scale models with censored data. A numerical example is presented to illustrate the proposed method.
- Research Article
- 10.9790/0853-2309042334
- Sep 1, 2024
- IOSR Journal of Dental and Medical Sciences
Background: Ovarian cancer (OC) is the third most common cancer in females in India (GLOBOCAN 2020); epithelial ovarian cancer accounts for 90% of all OC. More than 70% of ovarian cancer patients are diagnosed at an advanced stage because of the asymptomatic nature and insidious onset of the disease. Metastases remain a major cause of mortality in ovarian cancer patients. This study was undertaken to identify prognostic and predictive factors associated with the survival of patients with metastatic ovarian cancer. Materials and methods: This is a retrospective study. Treatment-naive patients diagnosed with stage IV epithelial ovarian cancer who attended the Department of Medical Oncology, Govt Royapettah Hospital (GRH), Chennai, during 2013-2018 with regular follow-up were included. Five-year follow-up data were collected until December 2023. The aim of this study is to identify the epidemiological and clinicopathological characteristics, treatment outcomes, and other prognostic factors predicting overall survival (OS) and progression-free survival (PFS) in patients with stage IV ovarian cancer. Results: 85 patients with stage IV epithelial ovarian cancer were analysed. Median OS was 27 months and PFS was 13 months. The OS rate was 0.5647 at 2 years and 0.105 at 5 years. The PFS rate was 0.1765 at 2 years and 0.0471 at 5 years. Kaplan-Meier (KM) survival curves showed that response after first-line treatment, surgery vs no surgery, and platinum-sensitive recurrence were significantly associated with 2-year and 5-year OS and PFS (p<0.05). Histopathology (high-grade serous vs less common ovarian cancers) was significantly correlated with 5-year survival (p<0.05). Univariate Cox regression analysis showed that the number of metastatic sites (single vs multiple) was significantly associated with improved 2-year survival (p=0.02) and PFS (p=0.005). Both univariate and multivariate analyses showed age (<=56 vs >56) to be an independent factor correlating with 5-year survival (p=0.02; p=0.01). Multivariate Cox regression analysis showed pretreatment CA-125 to be an independent variable for PFS (p=0.04). Conclusion: In our study, KM survival curves showed that patients who could undergo cytoreductive surgery or who responded after first-line treatment (surgery and chemotherapy) achieved a significant improvement in 2-year and 5-year OS and PFS, although this was not established in regression analysis. The recurrence rate remained very high in advanced-stage disease; KM survival curves showed that platinum-sensitive recurrence was associated with significantly better 2-year and 5-year OS and PFS rates. Cox regression analysis showed that patients with a single metastatic site had significantly better 2-year survival and PFS than patients with multiple metastatic sites. Raised pretreatment CA-125 was found to be an independent factor predicting poor PFS. Higher age was found to be an independent factor affecting 5-year OS.
- Research Article
- 10.1080/02664763.2023.2298795
- Dec 27, 2023
- Journal of Applied Statistics
We propose a non-parametric approach to reduce the overestimation of the Kaplan-Meier (KM) estimator when the event and censoring times are independent. We adjust the KM estimator based on the interval-specific censoring set, a collection of intervals where censored data are observed between two adjacent event times. The proposed interval-specific censoring set adjusted KM estimator reduces to the KM estimator if there are no censored observations or the sample size tends to infinity, and the proposed estimator is consistent, as is the case for the KM estimator. We prove theoretically that the proposed estimator reduces the overestimation compared to the KM estimator and provide a mathematical formula to estimate the variance of the proposed estimator based on Greenwood's approach. We also provide a modified log-rank test based on the proposed estimator. We perform four simulation studies to compare the proposed estimator with the KM estimator when the failure rate is constant, decreasing, increasing, and based on the flexible hazard method. The bias reduction in median survival time and survival rate using the proposed estimator is considerably large, especially when the censoring rate is high. The standard deviations are comparable between the two estimators. We implement the proposed and KM estimators for Nonalcoholic Fatty Liver Disease patients from a population study. The results show that the proposed estimator substantially reduces the overestimation in the presence of a high observed censoring rate.
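For reference, the baseline that this abstract adjusts can be sketched as follows: the standard Kaplan-Meier estimator with Greenwood's variance formula. This is not the proposed interval-specific adjusted estimator, whose formula the abstract does not give; the function name `kaplan_meier` is a hypothetical choice for illustration.

```python
import numpy as np

def kaplan_meier(time, event):
    """Standard Kaplan-Meier survival estimate with Greenwood's variance.

    Returns the distinct event times, the survival estimate at each of them,
    and the Greenwood variance estimate (NaN where the estimate reaches 0,
    because Greenwood's formula is undefined there).
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    event_times = np.unique(time[event == 1])
    surv = np.empty_like(event_times)
    var = np.empty_like(event_times)
    s, gw = 1.0, 0.0
    for k, t in enumerate(event_times):
        n = np.sum(time >= t)                     # number at risk just before t
        d = np.sum((time == t) & (event == 1))    # number of events at t
        s *= 1.0 - d / n
        surv[k] = s
        if n > d:
            gw += d / (n * (n - d))               # Greenwood's cumulative sum
            var[k] = s * s * gw
        else:
            var[k] = np.nan                       # everyone at risk failed at t
    return event_times, surv, var

# Toy data: observed times and event indicators (1 = event, 0 = censored).
times, surv, var = kaplan_meier([1.0, 2.0, 3.0, 4.0, 5.0], [1, 0, 1, 0, 1])
print(times, surv, var)
```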
- Research Article
3
- 10.1080/00949650903421085
- Apr 1, 2011
- Journal of Statistical Computation and Simulation
The Buckley–James estimator (BJE) is a widely recognized approach in dealing with right-censored linear regression models. There have been a lot of discussions in the literature on the estimation of the BJE as well as its asymptotic distribution. So far, no simulation has been done to directly estimate the asymptotic variance of the BJE. Kong and Yu [Asymptotic distributions of the Buckley–James estimator under nonstandard conditions, Statist. Sinica 17 (2007), pp. 341–360] studied the asymptotic distribution under discontinuous assumptions. Based on their methodology, we recalculate and correct some missing terms in the expression of the asymptotic variance in Theorem 2 of their work. We propose an estimator of the standard deviation of the BJE by using plug-in estimators. The estimator is shown to be consistent. The performance of the estimator is assessed through simulation studies under discrete underlying distributions. We further extend our studies to several continuous underlying distributions through simulation. The estimator is also applied to a real medical data set. The simulation results suggest that our estimation is a good approximation to the true standard deviation with reference to the empirical standard deviation.
- Research Article
18
- 10.1158/1538-7445.sabcs15-p5-08-02
- Feb 15, 2016
- Cancer Research
Background: The 21-Gene Recurrence Score® Assay (Oncotype DX®) has been validated as a prognostic and predictive tool in estrogen receptor (ER)+ breast cancer in multiple studies using archival specimens of clinical trials with long term follow up. Prospective outcome data from patients where treatment decisions incorporated the Recurrence Score results have not been reported. We evaluated treatments and clinical outcomes in patients undergoing Recurrence Score testing in 9 medical centers within Clalit Health Services (CHS), the largest HMO in Israel. Methods: Medical records of patients with N0/Nmic ER+ HER2-negative disease undergoing testing from 12/2004 to 12/2010 in 9 medical centers (Rabin, Lin, Soroka, Meir, Kaplan, Hadassah, Ha'emek, Rambam, and Shaare Zedek) within CHS were individually reviewed to verify treatments given, recurrence, and survival status. 5-year Kaplan-Meier (KM) and standard error estimates for distant recurrence and breast cancer specific survival were determined. Results: 1594 patients were evaluated with 5.9 years median follow-up. Median age, 61 (25-85) years; N0/Nmic (90%/10%); Grade I (16%), II (48%), III (16%), N/A (19%); histology, IDC (80%), lobular (13%), other (7%). Distribution of Recurrence Score risk groups (Recurrence Score results of <18, 18-30, ≥31): low (51%), intermediate (38%), and high (11%), with chemotherapy (CT) use of 1%, 26%, and 89%, respectively. Distant recurrence was reported in 17/813, 33/612, and 24/169 patients in the low, intermediate, and high Recurrence Score groups, respectively. In the high Recurrence Score group, distant recurrence was reported in 20/150 (13.3%) of CT-treated patients and in 4/19 (21.1%) of untreated patients. In the intermediate Recurrence Score group, the respective values were 9/162 (5.6%) and 24/450 (5.3%). The 5-year KM estimate for distant recurrence rate was 1.4% (95% CI: 0.9-2.3%) for the entire cohort, and 0.5% (95% CI: 0.2-1.6%), 1.2% (95% CI: 0.6-2.8%), and 6.9% (95% CI: 3.7-12.9%), for the low, intermediate, and high Recurrence Score groups, respectively. The 5-year KM estimate for breast cancer specific survival was 98.4% (95% CI: 97.6-98.9%) for the entire cohort, and 99.9% (95% CI: 99.0-99.98%), 98.5% (95% CI: 97.1-99.2%) and 90.6% (95% CI: 84.5-94.4%), for the low, intermediate, and high Recurrence Score groups, respectively. Conclusions: These are the first prospective long term clinical outcome data from approximately 1600 patients for whom the 21-gene Recurrence Score assay has been incorporated in real-life clinical decision making. The documented use of CT was appropriately based on the Recurrence Score result, and the outcomes for recurrence and survival are consistent with previously reported prospective-retrospective studies of the 21-gene assay. The 5 year KM estimates for distant recurrence rate in patients with low and intermediate Recurrence Score results who were treated based upon their Recurrence Score results were very low (0.5% and 1.2%, respectively). Citation Format: Stemmer SM, Steiner M, Rizel S, Soussan-Gutman L, Geffen DB, Nisenbaum B, Ben-Baruch N, Isaacs K, Fried G, Rosengarten O, Uziely B, Svedman C, Rothney M, Klang SH, Ryvo L, Kaufman B, Evron E, Zidan J, Shak S, Liebermann N. Real-life analysis evaluating 1594 N0/Nmic breast cancer patients for whom treatment decisions incorporated the 21-gene recurrence score result: 5-year KM estimate for breast cancer specific survival with recurrence score results ≤30 is >98%. [abstract]. In: Proceedings of the Thirty-Eighth Annual CTRC-AACR San Antonio Breast Cancer Symposium: 2015 Dec 8-12; San Antonio, TX. Philadelphia (PA): AACR; Cancer Res 2016;76(4 Suppl):Abstract nr P5-08-02.
- Front Matter
5
- 10.1016/j.jtcvs.2020.03.156
- Apr 30, 2020
- The Journal of Thoracic and Cardiovascular Surgery
Guidelines for improving the use and presentation of P values
- Research Article
17
- 10.1016/j.spa.2020.05.006
- May 15, 2020
- Stochastic Processes and their Applications
Importance sampling correction versus standard averages of reversible MCMCs in terms of the asymptotic variance
- Research Article
7
- 10.1080/07474938.2016.1165945
- Jun 10, 2016
- Econometric Reviews
This article derives explicit expressions for the asymptotic variances of the maximum likelihood and continuously-updated GMM estimators in models that may not satisfy the fundamental asset-pricing restrictions in population. The proposed misspecification-robust variance estimators allow the researcher to conduct valid inference on the model parameters even when the model is rejected by the data. While the results for the maximum likelihood estimator are only applicable to linear asset-pricing models, the asymptotic distribution of the continuously-updated GMM estimator is derived for general, possibly nonlinear, models. The large corrections in the asymptotic variances that arise from explicitly incorporating model misspecification in the analysis are illustrated using simulations and an empirical application.
- Research Article
6
- 10.1080/01621459.1977.10481011
- Jun 1, 1977
- Journal of the American Statistical Association
Loglinear models are classified as direct or indirect depending on whether the maximum likelihood estimates of cell values are explicit functions of the sufficient statistics or not. For saturated (hence, direct) models, Goodman (1970) and Bishop, Fienberg, and Holland (1975) used the δ method to calculate the asymptotic variances of various u terms in the loglinear models. In the present paper, this approach has been generalized to direct unsaturated hierarchical loglinear models. General rules for determining closed form expressions for asymptotic variances in such situations are obtained; bounds for the asymptotic variances of u terms in indirect models are considered; and these rules are compared with other methods of producing asymptotic variances.
- Research Article
36
- 10.1214/009053606000000065
- Apr 1, 2006
- The Annals of Statistics
A model for competing (resp. complementary) risks survival data where the failure time can be left (resp. right) censored is proposed. Product-limit estimators for the survival functions of the individual risks are derived. We deduce the strong convergence of our estimators on the whole real half-line without any additional assumptions and their asymptotic normality under conditions concerning only the observed distribution. When the observations are generated according to the double censoring model introduced by Turnbull, the product-limit estimators represent upper and lower bounds for Turnbull's estimator.
- Research Article
4
- 10.1081/sta-200047455
- Feb 1, 2005
- Communications in Statistics: Theory and Methods
Under the random censorship model with the extra assumption that all censoring times are known, Chiu (1999) proposed a new estimator for the survival function of failure time. In this article, we explore the most likely uniformly consistent estimators of the survival function in such data settings and study large sample properties of these estimators. It turns out that the classical Kaplan-Meier estimator has the smallest asymptotic variance among these possible estimators. This fact confirms the optimality of the Kaplan-Meier estimator in terms of asymptotic variance and may suggest that the Kaplan-Meier estimator is the most recommendable even in such circumstances.
- Research Article
- 10.1080/03610920509342432
- Jan 1, 2005
- Communications in Statistics - Theory and Methods
Under the random censorship model with the extra assumption that all censoring times are known, Chiu (1999) proposed a new estimator for the survival function of failure time. In this article, we explore the most likely uniformly consistent estimators of the survival function in such data settings and study large sample properties of these estimators. It turns out that the classical Kaplan-Meier estimator has the smallest asymptotic variance among these possible estimators. This fact confirms the optimality of the Kaplan-Meier estimator in terms of asymptotic variance and may suggest that the Kaplan-Meier estimator is the most recommendable even in such circumstances.
- Research Article
1
- 10.1080/10485252.2016.1225738
- Aug 30, 2016
- Journal of Nonparametric Statistics
Under the standard right-censorship (RC) model, which assumes that the survival time and censoring time are independent, several sufficient conditions have been established for the product-limit estimator (PLE) being asymptotically normally distributed on the whole real line [see, e.g. Stute, W. (1995), ‘The central limit theorem under random censorship’, Ann. Statist., 23, 422–439]. However, it remains a difficult open problem to determine the necessary and sufficient conditions for the PLE to be asymptotically normal on the whole real line. In this paper, we settle the problem under both the standard RC model and the dependent RC model.
- Research Article
- 10.3390/math12060905
- Mar 19, 2024
- Mathematics
The only cases where exact distributions of estimates are known are for samples from exponential families, and then only for special functions of the parameters. So statistical inference was traditionally based on the asymptotic normality of estimates. To improve on this we need the Edgeworth expansion for the distribution of the standardised estimate. This is an expansion in $n^{-1/2}$ about the normal distribution, where $n$ is typically the sample size. The first few terms of this expansion were originally given for the special case of a sample mean. In earlier work we derived it for any standard estimate, hugely expanding its application. We define an estimate $\hat{w}$ of an unknown vector $w$ in $\mathbb{R}^p$ as a standard estimate if $E\hat{w} \to w$ as $n \to \infty$, and for $r \ge 1$ the $r$th-order cumulants of $\hat{w}$ have magnitude $n^{1-r}$ and can be expanded in $n^{-1}$. Here we present a significant extension. We give the expansion of the distribution of any smooth function of $\hat{w}$, say $t(\hat{w})$ in $\mathbb{R}^q$, giving its distribution to $n^{-5/2}$. We do this by showing that $t(\hat{w})$ is a standard estimate of $t(w)$. This provides far more accurate approximations for the distribution of $t(\hat{w})$ than its asymptotic normality.
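To make the idea of an Edgeworth correction concrete, the sketch below implements only the classical one-term expansion for the standardised mean of independent, identically distributed observations, which is the special case the abstract mentions as the starting point. It does not reproduce the general expansion for smooth functions of standard estimates developed in the paper, and the function name `edgeworth_cdf_mean` is a hypothetical choice.

```python
import math

def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def edgeworth_cdf_mean(x, n, skewness):
    """One-term Edgeworth approximation to P(sqrt(n) * (mean - mu) / sigma <= x).

    The normal approximation is corrected by a term of order n^(-1/2) that
    depends on the skewness of a single observation.
    """
    correction = norm_pdf(x) * skewness * (x * x - 1.0) / (6.0 * math.sqrt(n))
    return norm_cdf(x) - correction

# Example: mean of n = 25 exponential observations (skewness 2), evaluated at x = 0.
print(norm_cdf(0.0))                      # plain normal approximation
print(edgeworth_cdf_mean(0.0, 25, 2.0))   # skewness-corrected approximation
```

The correction vanishes when the skewness is zero and at x = ±1, so in this one-term form the normal approximation is changed most near the centre and in the tails of a skewed distribution.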