A Few Remarks on “A Capture–Recapture Approach for Screening Using Two Diagnostic Tests With Availability of Disease Status for the Test Positives Only” by Böhning and Patilea

Haitao Chu, Lei Nie

doi:10.1198/016214508000000940

Abstract

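Using a capture-recapture approach, Böhning and Patilea (2008) proposed two useful estimators for unobserved cell counts, assuming homogeneous association of the screening tests over disease status. However, they are mistaken in claiming that the maximum likelihood estimators (MLEs) are difficult to obtain. The point of this note is to present closed-form MLEs for, in their notation: 1) the α model, where $\alpha = p_{11}^{(i)} p_{00}^{(i)} / (p_{01}^{(i)} p_{10}^{(i)})$ is assumed to be identical for all $i = 1, 2, \ldots, d$; and 2) the θ model, where $\theta = p_{1|1}^{(i)} / p_{1+}^{(i)}$ is assumed to be identical for all $i$.

One way to write the log-likelihood function (ignoring constant terms) in this setting is in terms of $q_i$ and $p_{jk}^{(i)}$ ($i = 1, 2, \ldots, d$; $j = 0, 1$; $k = 0, 1$), as the authors did:

$$x_{00}^{(+)} \log\Big(\sum_i p_{00}^{(i)} q_i\Big) + \sum_i x_{11}^{(i)} \log\big(p_{11}^{(i)} q_i\big) + \sum_i x_{10}^{(i)} \log\big(p_{10}^{(i)} q_i\big) + \sum_i x_{01}^{(i)} \log\big(p_{01}^{(i)} q_i\big). \tag{1}$$

This parameterization involves a mixture likelihood, which prevents a closed-form solution for the MLEs. To obtain closed-form MLEs, we consider an alternative parameterization in terms of $\pi_{jk}$ and $\pi_{jk}^{(i)}$ ($j, k = 0, 1$ and $i = 1, 2, \ldots, d$), where $\pi_{jk} = P(T_1 = j, T_2 = k)$ and $\pi_{jk}^{(i)} = P(D = i \mid T_1 = j, T_2 = k)$. The log-likelihood function is (ignoring constant terms)

$$\begin{aligned}
\log L &= \sum_j \sum_k x_{jk}^{(+)} \log(\pi_{jk}) + \sum_i x_{11}^{(i)} \log\big(\pi_{11}^{(i)}\big) + \sum_i x_{10}^{(i)} \log\big(\pi_{10}^{(i)}\big) + \sum_i x_{01}^{(i)} \log\big(\pi_{01}^{(i)}\big) \\
&= x_{00}^{(+)} \log(\pi_{00}) + \sum_i x_{11}^{(i)} \big\{\log\big(\pi_{11}^{(i)}\big) + \log(\pi_{11})\big\} + \sum_i x_{10}^{(i)} \big\{\log\big(\pi_{10}^{(i)}\big) + \log(\pi_{10})\big\} + \sum_i x_{01}^{(i)} \big\{\log\big(\pi_{01}^{(i)}\big) + \log(\pi_{01})\big\} \\
&= x_{00}^{(+)} \log(\pi_{00}) + \sum_i x_{11}^{(i)} \log\big(\pi_{11}^{(i)} \pi_{11}\big) + \sum_i x_{10}^{(i)} \log\big(\pi_{10}^{(i)} \pi_{10}\big) + \sum_i x_{01}^{(i)} \log\big(\pi_{01}^{(i)} \pi_{01}\big).
\end{aligned} \tag{2}$$

This representation relates to previous work in some other settings (Satten and Kupper 1993; Lyles 2002; Pepe and Janes 2007). Note that

$$\pi_{jk}^{(i)} \pi_{jk} = P(D = i \mid T_1 = j, T_2 = k)\, P(T_1 = j, T_2 = k) = P(T_1 = j, T_2 = k \mid D = i)\, P(D = i) = p_{jk}^{(i)} q_i,$$

and

$$\pi_{00} = P(T_1 = 0, T_2 = 0) = \sum_i P(T_1 = 0, T_2 = 0, D = i) = \sum_i P(T_1 = 0, T_2 = 0 \mid D = i)\, P(D = i) = \sum_i p_{00}^{(i)} q_i.$$

Therefore equation (2) is equivalent to equation (1). The likelihood equations from (2) are tractable and yield closed-form MLEs of $\pi_{jk}$ ($j, k = 0, 1$) and of $\pi_{jk}^{(i)}$ if $j + k > 0$. Omitting the algebra, we obtain the MLEs as $\hat{\pi}_{jk} = x_{jk}/n$ ($j, k = 0, 1$) and $\hat{\pi}_{jk}^{(i)} = x_{jk}^{(i)}/x_{jk}$ if $j + k > 0$, where $x_{jk} = x_{jk}^{(+)}$ and $n = \sum_{j,k} x_{jk}$.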
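As a minimal sketch of these closed-form MLEs, the Python snippet below computes $\hat{\pi}_{jk}$ and $\hat{\pi}_{jk}^{(i)}$ from cell counts. The counts, the dictionary layout, and the function name `pi_mle` are hypothetical and chosen only for illustration; they are not data from the note or from Böhning and Patilea (2008).

```python
import numpy as np

# Hypothetical counts for d = 2 disease classes (illustrative only).
# x[(j, k)] holds x_jk^(i) for i = 1..d; disease status is unobserved for (0, 0).
x = {
    (1, 1): np.array([40.0, 10.0]),
    (1, 0): np.array([30.0, 20.0]),
    (0, 1): np.array([25.0, 15.0]),
}
x00_total = 860.0  # x_00^(+): both tests negative


def pi_mle(x, x00_total):
    """Return n, pi_hat_jk = x_jk / n, and pi_hat_jk^(i) = x_jk^(i) / x_jk (j + k > 0)."""
    totals = {jk: counts.sum() for jk, counts in x.items()}
    totals[(0, 0)] = x00_total
    n = sum(totals.values())
    pi = {jk: total / n for jk, total in totals.items()}
    pi_cond = {jk: counts / totals[jk] for jk, counts in x.items()}
    return n, pi, pi_cond


n, pi, pi_cond = pi_mle(x, x00_total)

# The product pi_jk^(i) * pi_jk estimates p_jk^(i) * q_i and equals x_jk^(i) / n.
joint_11 = pi_cond[(1, 1)] * pi[(1, 1)]
```

Under this layout, the product `pi_cond[jk] * pi[jk]` reproduces the observed joint proportion `x[jk] / n`, which is the identity linking the two parameterizations.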
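Therefore, the MLEs of the $q_i$, which can be written as functions of $\pi_{jk}$ ($j, k = 0, 1$) and $\pi_{jk}^{(i)}$ ($j + k > 0$) under the α or θ model assumptions, have closed-form solutions. The details are given below.

Under the α model, $\alpha = p_{11}^{(i)} p_{00}^{(i)} / (p_{01}^{(i)} p_{10}^{(i)})$ is assumed to be identical for all $i = 1, 2, \ldots, d$; by Bayes' rule,

$$\begin{aligned}
\alpha = \frac{p_{11}^{(i)} p_{00}^{(i)}}{p_{01}^{(i)} p_{10}^{(i)}}
&= \frac{P(T_1 = 1, T_2 = 1 \mid D = i)\, P(T_1 = 0, T_2 = 0 \mid D = i)}{P(T_1 = 0, T_2 = 1 \mid D = i)\, P(T_1 = 1, T_2 = 0 \mid D = i)} \\
&= \frac{P(D = i \mid T_1 = 1, T_2 = 1)\, P(T_1 = 1, T_2 = 1)\, P(D = i \mid T_1 = 0, T_2 = 0)\, P(T_1 = 0, T_2 = 0)}{P(D = i \mid T_1 = 1, T_2 = 0)\, P(T_1 = 1, T_2 = 0)\, P(D = i \mid T_1 = 0, T_2 = 1)\, P(T_1 = 0, T_2 = 1)} \\
&= \frac{\pi_{11} \pi_{00}}{\pi_{01} \pi_{10}} \times \frac{\pi_{11}^{(i)} \pi_{00}^{(i)}}{\pi_{01}^{(i)} \pi_{10}^{(i)}}.
\end{aligned}$$

Solving for $\pi_{00}^{(i)}$ and using $\sum_i \pi_{00}^{(i)} = 1$, we obtain

$$\alpha = \frac{\pi_{11} \pi_{00}}{\pi_{01} \pi_{10}} \times \Big[\sum_i \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}}\Big]^{-1},$$

$$\pi_{00\alpha}^{(i)} = \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}} \times \Big[\sum_i \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}}\Big]^{-1},$$

$$q_{i\alpha} = \pi_{11} \pi_{11}^{(i)} + \pi_{10} \pi_{10}^{(i)} + \pi_{01} \pi_{01}^{(i)} + \pi_{00}\, \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}} \Big[\sum_i \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}}\Big]^{-1},$$

where the subscript α indicates the α model assumption. Since the MLEs of the parameters $\pi_{jk}$ and $\pi_{jk}^{(i)}$ are $\hat{\pi}_{jk} = x_{jk}/n$ ($j, k = 0, 1$) and $\hat{\pi}_{jk}^{(i)} = x_{jk}^{(i)}/x_{jk}$ if $j + k > 0$, the closed-form MLE of $n_{i\alpha}$ under the α model is

$$\hat{n}_{i\alpha} = n \hat{q}_{i\alpha} = x_{11}^{(i)} + x_{10}^{(i)} + x_{01}^{(i)} + x_{00}\, \frac{x_{01}^{(i)} x_{10}^{(i)}}{x_{11}^{(i)}} \Big[\sum_i \frac{x_{01}^{(i)} x_{10}^{(i)}}{x_{11}^{(i)}}\Big]^{-1}, \tag{3}$$

which is essentially the same as equation (15) in Böhning and Patilea (2008) without the stability correction. In other words, the estimator obtained in equation (15) is the MLE under the α model assumption with the stability correction.

Under the θ model, $\theta = p_{1|1}^{(i)} / p_{1+}^{(i)}$ is assumed to be identical for all $i = 1, 2, \ldots, d$; by Bayes' rule,

$$\begin{aligned}
\theta = \frac{p_{1|1}^{(i)}}{p_{1+}^{(i)}}
&= \frac{P(T_1 = 1 \mid T_2 = 1, D = i)}{P(T_1 = 1 \mid D = i)} \\
&= \frac{P(D = i \mid T_1 = 1, T_2 = 1)\, P(T_1 = 1, T_2 = 1)}{P(T_2 = 1, D = i)\, P(T_1 = 1, D = i)} \times P(D = i) \\
&= \frac{\pi_{11} \pi_{11}^{(i)}}{\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\big(\pi_{10} \pi_{10}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)} \times P(D = i).
\end{aligned}$$

Solving for $P(D = i) = q_i$ and using $\sum_i q_i = 1$, we obtain

$$\theta = \Big[\sum_i \Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\Big]^{-1}$$

and

$$\pi_{00\theta}^{(i)} = \frac{1}{\pi_{00}} \Big\{\Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big) \Big[\sum_i \Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\Big]^{-1} - \pi_{11} \pi_{11}^{(i)} - \pi_{10} \pi_{10}^{(i)} - \pi_{01} \pi_{01}^{(i)}\Big\},$$

$$q_{i\theta} = \Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big) \Big[\sum_i \Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\Big]^{-1},$$

where the subscript θ indicates the θ model assumption.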
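The α-model estimator in equation (3) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the function name `n_hat_alpha` and the counts are hypothetical, and the uncorrected formula divides by $x_{11}^{(i)}$, so the sketch assumes every $x_{11}^{(i)}$ is positive and applies no stability correction.

```python
import numpy as np


def n_hat_alpha(x11, x10, x01, x00):
    """Equation (3): allocate x00 to class i in proportion to x01^(i) x10^(i) / x11^(i)."""
    x11, x10, x01 = (np.asarray(a, dtype=float) for a in (x11, x10, x01))
    if np.any(x11 <= 0):
        raise ValueError("every x11^(i) must be positive for the uncorrected formula")
    weight = x01 * x10 / x11               # x01^(i) x10^(i) / x11^(i)
    x00_by_class = x00 * weight / weight.sum()
    return x11 + x10 + x01 + x00_by_class


# Hypothetical counts for d = 2 classes (illustrative only).
x11 = np.array([40.0, 10.0])
x10 = np.array([30.0, 20.0])
x01 = np.array([25.0, 15.0])
x00 = 860.0

n_alpha = n_hat_alpha(x11, x10, x01, x00)

# Implied odds ratio per class; the alpha model forces these to coincide.
x00_by_class = n_alpha - (x11 + x10 + x01)
alpha_by_class = x11 * x00_by_class / (x01 * x10)
```

By construction the class estimates sum to the total sample size, and the implied per-class odds ratios in `alpha_by_class` are equal, which is the homogeneity assumption of the α model.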
Similarly, the closed-form MLE of $n_{i\theta}$ under the θ model is

$$\hat{n}_{i\theta} = n \hat{q}_{i\theta} = n \Big(\frac{x_{10}^{(i)}}{x_{11}^{(i)}} + 1\Big)\big(x_{01}^{(i)} + x_{11}^{(i)}\big) \Big[\sum_i \Big(\frac{x_{10}^{(i)}}{x_{11}^{(i)}} + 1\Big)\big(x_{01}^{(i)} + x_{11}^{(i)}\big)\Big]^{-1} = n\, \frac{x_{1+}^{(i)} x_{+1}^{(i)}}{x_{11}^{(i)}} \Big[\sum_i \frac{x_{1+}^{(i)} x_{+1}^{(i)}}{x_{11}^{(i)}}\Big]^{-1}, \tag{4}$$

which is essentially the same as equation (10) in Böhning and Patilea (2008) without the stability correction.

As a byproduct of this alternative parameterization, we can test the difference between $\hat{q}_{i\theta}$ and $\hat{q}_{i\alpha}$ (or equivalently, the difference between $\hat{n}_{i\theta}$ and $\hat{n}_{i\alpha}$) to infer whether the two assumptions give statistically significantly different predictions for the probability (or equivalently, the number) of individuals in a given disease class $i$. Although the formula for $\mathrm{se}(\hat{q}_{i\theta} - \hat{q}_{i\alpha})$ is tedious, its numerical value can be obtained easily through statistical software using the delta method.

We note that the difference between the estimated probabilities of disease classes under the α and θ models can be statistically significant and potentially meaningful for the same study. For example, in the Health Insurance Plan Study for breast cancer screening in New York (Strax, Venet, Shapiro, and Gross 1967), the estimated probability of having cancer assuming the α model is 4.8% with a 95% confidence interval (CI) of 0.3% to 9.3%, while the estimated probability of having cancer assuming the θ model is 7.5% with a 95% CI of 2.8% to 12.2%. The difference is 2.7% (95% CI: 1.4% to 4%) with a p-value less than 0.001. This difference can have a big impact on cancer surveillance and prevention. Unfortunately, the data do not contain information to differentiate the α model from the θ model.

The alternative parameterization in (2) sheds light on maximum likelihood approaches in the setting considered here; the corresponding closed-form ML estimators under the α and θ models allow tests of the difference between the estimated probabilities of a specific disease class using the α versus the θ model.
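The θ-model estimator can be sketched in the same way. The snippet computes $n\hat{q}_{i\theta}$, so the class estimates sum to $n$. The function name `n_hat_theta` and the counts are hypothetical, and the sketch again assumes every $x_{11}^{(i)}$ is positive, with no stability correction.

```python
import numpy as np


def n_hat_theta(x11, x10, x01, x00):
    """n * q_hat_i under the theta model, with weights x_{1+}^(i) x_{+1}^(i) / x11^(i)."""
    x11, x10, x01 = (np.asarray(a, dtype=float) for a in (x11, x10, x01))
    if np.any(x11 <= 0):
        raise ValueError("every x11^(i) must be positive for the uncorrected formula")
    x1_plus = x11 + x10                    # test 1 positive, class i
    x_plus1 = x11 + x01                    # test 2 positive, class i
    weight = x1_plus * x_plus1 / x11
    n = x11.sum() + x10.sum() + x01.sum() + x00
    return n * weight / weight.sum()


# Hypothetical counts for d = 2 classes (illustrative only).
x11 = np.array([40.0, 10.0])
x10 = np.array([30.0, 20.0])
x01 = np.array([25.0, 15.0])
x00 = 860.0

n_theta = n_hat_theta(x11, x10, x01, x00)

# Implied theta per class: p11 / (p1+ * p+1) with p = x / n_hat_i; constant under the model.
theta_by_class = x11 * n_theta / ((x11 + x10) * (x11 + x01))
```

The weight for each class is the familiar two-sample capture-recapture ratio $x_{1+}^{(i)} x_{+1}^{(i)} / x_{11}^{(i)}$; normalizing the weights is what imposes a common θ across classes.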
Our results complement the estimators obtained in equations (10) and (15) by Böhning and Patilea (2008) using a capture-recapture approach, and ensure the usual MLE properties.
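The delta-method standard error of $\hat{q}_{i\theta} - \hat{q}_{i\alpha}$ mentioned above can be approximated numerically. The sketch below is one possible approach, not the authors' implementation: it assumes the $3d + 1$ observed cells (the three test-positive cells per class plus the pooled both-negative cell) are multinomial, and it replaces the tedious analytic gradient with a central finite difference. All function names and counts are hypothetical.

```python
import numpy as np


def q_alpha(c, d):
    """q_i under the alpha model; c = (x11 by class, x10 by class, x01 by class, x00)."""
    x11, x10, x01, x00 = c[:d], c[d:2 * d], c[2 * d:3 * d], c[3 * d]
    weight = x01 * x10 / x11
    return (x11 + x10 + x01 + x00 * weight / weight.sum()) / c.sum()


def q_theta(c, d):
    """q_i under the theta model, from the same cell vector."""
    x11, x10, x01 = c[:d], c[d:2 * d], c[2 * d:3 * d]
    weight = (x11 + x10) * (x11 + x01) / x11
    return weight / weight.sum()


def delta_se(g, counts, eps=1e-6):
    """Delta-method SE of g(p_hat), p_hat = counts / n, using a numerical gradient."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = counts / n
    grad = np.empty_like(p)
    for m in range(p.size):
        step = np.zeros_like(p)
        step[m] = eps
        grad[m] = (g(p + step) - g(p - step)) / (2 * eps)
    cov = (np.diag(p) - np.outer(p, p)) / n    # multinomial covariance of p_hat
    return float(np.sqrt(grad @ cov @ grad))


# Hypothetical counts for d = 2: x11 = (40, 10), x10 = (30, 20), x01 = (25, 15), x00 = 860.
d = 2
counts = np.array([40.0, 10.0, 30.0, 20.0, 25.0, 15.0, 860.0])
i = 0  # disease class of interest (0-based index)


def diff(p):
    return q_theta(p, d)[i] - q_alpha(p, d)[i]


estimate = diff(counts / counts.sum())
se = delta_se(diff, counts)
ci_95 = (estimate - 1.96 * se, estimate + 1.96 * se)  # Wald-type interval
```

Both `q_alpha` and `q_theta` are invariant to rescaling the cell vector, so they can be evaluated on counts or on proportions. A Wald test of no difference would compare `estimate / se` with a standard normal quantile.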

Journal: Journal of the American Statistical Association	Publication Date: Dec 1, 2008
Citations: 11
