A Simple Divide-and-Conquer-based Distributed Method for the Accelerated Failure Time Model

Lanjue Chen, Jin Su, Alan T K Wan, Yong Zhou

doi:10.1080/10618600.2023.2252028

Abstract

The accelerated failure time (AFT) model is an appealing tool in survival analysis because of its ease of interpretation, but when there is a large volume of data, fitting an AFT model and carrying out the associated inference on one computer can be computationally demanding. This poses a severe limitation for the application of the AFT model in the face of big data. The article addresses this problem by developing a simple distributed method for estimating the parameters of an AFT model based on the divide-and-conquer strategy, which has the dual benefits of statistical efficiency and computational economy. It is an iterative method that involves, for the most part, some rather simple algebraic operations, except for obtaining the initial estimate, which is based on a smoothed approximation of the Gehan estimating equation. Our results show that the proposed method yields estimates that converge after a few iterations and an estimator that is asymptotically as efficient as the benchmark estimator obtained by using the full data in one go. We also develop an associated inference procedure. The merits of the proposed method are demonstrated via an extensive simulation study. The method is applied to a kidney transplantation dataset. Supplementary materials for this article are available online.
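
For readers unfamiliar with the terminology, the model and the estimating equation named in the abstract can be written in their standard textbook form, as sketched below. This is not transcribed from the article. The symbols ($T_i$, $C_i$, $X_i$, $\delta_i$, $e_i$), the normalization by $n^{-2}$, the bandwidth $h$, and the choice of the standard normal CDF $\Phi$ as the smoother are assumptions introduced here for illustration, and the article's own smoothing may take a different form.

```latex
% AFT model: log failure time is linear in the covariates
\log T_i = X_i^\top \beta + \varepsilon_i, \qquad i = 1, \dots, n.

% Observed data under right censoring by C_i
\tilde{T}_i = \min(T_i, C_i), \qquad \delta_i = I(T_i \le C_i), \qquad
e_i(\beta) = \log \tilde{T}_i - X_i^\top \beta.

% Gehan estimating equation (a step function of beta, hence non-smooth)
U_G(\beta) = n^{-2} \sum_{i=1}^{n} \sum_{j=1}^{n}
  \delta_i \, (X_i - X_j) \, I\{ e_j(\beta) \ge e_i(\beta) \} = 0.

% One common smoothed approximation: replace the indicator by a normal CDF
% with a small bandwidth h > 0 (an assumption, not necessarily the article's choice)
\tilde{U}_G(\beta) = n^{-2} \sum_{i=1}^{n} \sum_{j=1}^{n}
  \delta_i \, (X_i - X_j) \, \Phi\!\left( \frac{e_j(\beta) - e_i(\beta)}{h} \right) = 0.
```

Because $\tilde{U}_G$ is differentiable in $\beta$, it can be solved with gradient-based root finding, which is what makes a smoothed version convenient for computing an initial estimate.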
