Abstract
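Using a capture-recapture approach, Bohning and Patilea (2008) proposed two useful estimators for unobserved cell counts, assuming homogeneous association of the screening tests over disease status. However, they are mistaken in claiming that the maximum likelihood estimators (MLEs) are difficult to obtain. The point of this note is to present closed-form MLEs for, in their notation: 1) the α model, where $\alpha = p_{11}^{(i)} p_{00}^{(i)} / (p_{01}^{(i)} p_{10}^{(i)})$ is assumed to be identical for all $i = 1, 2, \ldots, d$; and 2) the θ model, where $\theta = p_{1|1}^{(i)} / p_{1+}^{(i)}$ is assumed to be identical for all $i$.

One way to write the log-likelihood function (ignoring constant terms) in this setting is in terms of $q_i$ and $p_{jk}^{(i)}$ ($i = 1, 2, \ldots, d$; $j = 0, 1$; $k = 0, 1$), as the authors did:

$$x_{00}^{(+)} \log\Big(\sum_i p_{00}^{(i)} q_i\Big) + \sum_i x_{11}^{(i)} \log\big(p_{11}^{(i)} q_i\big) + \sum_i x_{10}^{(i)} \log\big(p_{10}^{(i)} q_i\big) + \sum_i x_{01}^{(i)} \log\big(p_{01}^{(i)} q_i\big). \tag{1}$$

This parameterization involves a mixture likelihood, which prevents a closed-form solution for the MLEs. To obtain closed-form MLEs, we consider an alternative parameterization in terms of $\pi_{jk}$ and $\pi_{jk}^{(i)}$ ($j, k = 0, 1$ and $i = 1, 2, \ldots, d$), where $\pi_{jk} = P(T_1 = j, T_2 = k)$ and $\pi_{jk}^{(i)} = P(D = i \mid T_1 = j, T_2 = k)$. The log-likelihood function (ignoring constant terms) is

$$\begin{aligned}
\log L &= \sum_j \sum_k x_{jk}^{(+)} \log(\pi_{jk}) + \sum_i x_{11}^{(i)} \log\big(\pi_{11}^{(i)}\big) + \sum_i x_{10}^{(i)} \log\big(\pi_{10}^{(i)}\big) + \sum_i x_{01}^{(i)} \log\big(\pi_{01}^{(i)}\big) \\
&= x_{00}^{(+)} \log(\pi_{00}) + \sum_i x_{11}^{(i)} \big\{\log\big(\pi_{11}^{(i)}\big) + \log(\pi_{11})\big\} + \sum_i x_{10}^{(i)} \big\{\log\big(\pi_{10}^{(i)}\big) + \log(\pi_{10})\big\} + \sum_i x_{01}^{(i)} \big\{\log\big(\pi_{01}^{(i)}\big) + \log(\pi_{01})\big\} \\
&= x_{00}^{(+)} \log(\pi_{00}) + \sum_i x_{11}^{(i)} \log\big(\pi_{11}^{(i)} \pi_{11}\big) + \sum_i x_{10}^{(i)} \log\big(\pi_{10}^{(i)} \pi_{10}\big) + \sum_i x_{01}^{(i)} \log\big(\pi_{01}^{(i)} \pi_{01}\big).
\end{aligned} \tag{2}$$

This representation relates to previous work in some other settings (Satten and Kupper 1993; Lyles 2002; Pepe and Janes 2007). Note that

$$\pi_{jk}^{(i)} \pi_{jk} = P(D = i \mid T_1 = j, T_2 = k)\, P(T_1 = j, T_2 = k) = P(T_1 = j, T_2 = k \mid D = i)\, P(D = i) = p_{jk}^{(i)} q_i,$$

and

$$\pi_{00} = P(T_1 = 0, T_2 = 0) = \sum_i P(T_1 = 0, T_2 = 0, D = i) = \sum_i P(T_1 = 0, T_2 = 0 \mid D = i)\, P(D = i) = \sum_i p_{00}^{(i)} q_i.$$

Therefore equation (2) is equivalent to equation (1). The likelihood equations under this parameterization are tractable and yield closed-form MLEs of $\pi_{jk}$ ($j, k = 0, 1$) and of $\pi_{jk}^{(i)}$ if $j + k > 0$. Omitting the algebra, we obtain the MLEs as $\hat{\pi}_{jk} = x_{jk}/n$ ($j, k = 0, 1$) and $\hat{\pi}_{jk}^{(i)} = x_{jk}^{(i)}/x_{jk}$ if $j + k > 0$, where $x_{jk}$ denotes the cell total $x_{jk}^{(+)}$ and $n$ the overall sample size.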
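The equivalence of equations (1) and (2) can be checked numerically. The following is a minimal sketch, not the authors' code: the parameter values, the counts, and the names `loglik_eq1` and `loglik_eq2` are all hypothetical and chosen only for illustration, with d = 2 disease classes.

```python
import math

# Hypothetical parameters for d = 2 disease classes (illustration only).
q = [0.9, 0.1]  # q[i] = P(D = i)
p = [           # p[i][(j, k)] = P(T1 = j, T2 = k | D = i)
    {(0, 0): 0.85, (0, 1): 0.06, (1, 0): 0.05, (1, 1): 0.04},
    {(0, 0): 0.10, (0, 1): 0.15, (1, 0): 0.20, (1, 1): 0.55},
]
d = len(q)
verified = [(1, 1), (1, 0), (0, 1)]  # cells in which disease status is observed

# Hypothetical counts: x[(j, k)][i] for verified cells, plus the (0, 0) total.
x = {(1, 1): [12, 40], (1, 0): [30, 15], (0, 1): [25, 10]}
x00_plus = 900

# Alternative parameterization: pi_jk = P(T1 = j, T2 = k) and
# pi_jk^(i) = P(D = i | T1 = j, T2 = k), obtained by Bayes' rule.
pi = {c: sum(p[i][c] * q[i] for i in range(d)) for c in p[0]}
pi_i = {c: [p[i][c] * q[i] / pi[c] for i in range(d)] for c in verified}


def loglik_eq1():
    """Log-likelihood in the (p, q) parameterization, equation (1)."""
    ll = x00_plus * math.log(sum(p[i][(0, 0)] * q[i] for i in range(d)))
    for c in verified:
        for i in range(d):
            ll += x[c][i] * math.log(p[i][c] * q[i])
    return ll


def loglik_eq2():
    """Log-likelihood in the pi parameterization, last line of equation (2)."""
    ll = x00_plus * math.log(pi[(0, 0)])
    for c in verified:
        for i in range(d):
            ll += x[c][i] * math.log(pi_i[c][i] * pi[c])
    return ll


print(loglik_eq1(), loglik_eq2())  # the two values agree up to rounding
```

The check works because each term of (2) is the corresponding term of (1) rewritten through $\pi_{jk}^{(i)} \pi_{jk} = p_{jk}^{(i)} q_i$; only the parameterization changes, not the likelihood.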
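Consequently, the MLEs of the $q_i$, which can be written as functions of $\pi_{jk}$ ($j, k = 0, 1$) and $\pi_{jk}^{(i)}$ ($j + k > 0$) under the α or θ model assumptions, have closed-form solutions. The details are given below.

Under the α model, $\alpha = p_{11}^{(i)} p_{00}^{(i)} / (p_{01}^{(i)} p_{10}^{(i)})$ is assumed to be identical for all $i = 1, 2, \ldots, d$. By Bayes' rule,

$$\begin{aligned}
\alpha = \frac{p_{11}^{(i)} p_{00}^{(i)}}{p_{01}^{(i)} p_{10}^{(i)}}
&= \frac{P(T_1 = 1, T_2 = 1 \mid D = i)\, P(T_1 = 0, T_2 = 0 \mid D = i)}{P(T_1 = 0, T_2 = 1 \mid D = i)\, P(T_1 = 1, T_2 = 0 \mid D = i)} \\
&= \frac{P(D = i \mid T_1 = 1, T_2 = 1)\, P(T_1 = 1, T_2 = 1)\, P(D = i \mid T_1 = 0, T_2 = 0)\, P(T_1 = 0, T_2 = 0)}{P(D = i \mid T_1 = 1, T_2 = 0)\, P(T_1 = 1, T_2 = 0)\, P(D = i \mid T_1 = 0, T_2 = 1)\, P(T_1 = 0, T_2 = 1)} \\
&= \frac{\pi_{11} \pi_{00}}{\pi_{01} \pi_{10}} \times \frac{\pi_{11}^{(i)} \pi_{00}^{(i)}}{\pi_{01}^{(i)} \pi_{10}^{(i)}}.
\end{aligned}$$

Solving for $\pi_{00}^{(i)}$ and imposing $\sum_i \pi_{00}^{(i)} = 1$ gives

$$\alpha = \frac{\pi_{11} \pi_{00}}{\pi_{01} \pi_{10}} \times \Big[\sum_i \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}}\Big]^{-1},$$

$$\pi_{00\alpha}^{(i)} = \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}} \times \Big[\sum_i \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}}\Big]^{-1},$$

$$q_{i\alpha} = \pi_{11} \pi_{11}^{(i)} + \pi_{10} \pi_{10}^{(i)} + \pi_{01} \pi_{01}^{(i)} + \pi_{00} \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}} \Big[\sum_i \frac{\pi_{01}^{(i)} \pi_{10}^{(i)}}{\pi_{11}^{(i)}}\Big]^{-1},$$

where the subscript α indicates the α model assumption. Since the MLEs of the parameters $\pi_{jk}$ and $\pi_{jk}^{(i)}$ are $\hat{\pi}_{jk} = x_{jk}/n$ ($j, k = 0, 1$) and $\hat{\pi}_{jk}^{(i)} = x_{jk}^{(i)}/x_{jk}$ if $j + k > 0$, the closed-form MLE of $n_{i\alpha}$ under the α model is

$$\hat{n}_{i\alpha} = n \hat{q}_{i\alpha} = x_{11}^{(i)} + x_{10}^{(i)} + x_{01}^{(i)} + x_{00} \frac{x_{01}^{(i)} x_{10}^{(i)}}{x_{11}^{(i)}} \Big[\sum_i \frac{x_{01}^{(i)} x_{10}^{(i)}}{x_{11}^{(i)}}\Big]^{-1}, \tag{3}$$

which is essentially the same as equation (15) in Bohning and Patilea (2008) without the stability correction. In other words, the estimator obtained in equation (15) is the MLE under the α model assumption with the stability correction.

Under the θ model, $\theta = p_{1|1}^{(i)} / p_{1+}^{(i)}$ is assumed to be identical for all $i = 1, 2, \ldots, d$. By Bayes' rule,

$$\begin{aligned}
\theta = \frac{p_{1|1}^{(i)}}{p_{1+}^{(i)}}
&= \frac{P(T_1 = 1 \mid T_2 = 1, D = i)}{P(T_1 = 1 \mid D = i)} \\
&= \frac{P(D = i \mid T_1 = 1, T_2 = 1)\, P(T_1 = 1, T_2 = 1)}{P(T_2 = 1, D = i)\, P(T_1 = 1, D = i)} \times P(D = i) \\
&= \frac{\pi_{11} \pi_{11}^{(i)}}{\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\big(\pi_{10} \pi_{10}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)} \times P(D = i).
\end{aligned}$$

Solving for $q_i = P(D = i)$ and imposing $\sum_i q_i = 1$ gives

$$\theta = \Big[\sum_i \Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\Big]^{-1},$$

$$\pi_{00\theta}^{(i)} = \frac{1}{\pi_{00}} \Big\{\Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\Big[\sum_i \Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\Big]^{-1} - \pi_{11} \pi_{11}^{(i)} - \pi_{10} \pi_{10}^{(i)} - \pi_{01} \pi_{01}^{(i)}\Big\},$$

$$q_{i\theta} = \Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\Big[\sum_i \Big(\frac{\pi_{10} \pi_{10}^{(i)}}{\pi_{11} \pi_{11}^{(i)}} + 1\Big)\big(\pi_{01} \pi_{01}^{(i)} + \pi_{11} \pi_{11}^{(i)}\big)\Big]^{-1},$$

where the subscript θ indicates the θ model assumption.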
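To make the α-model estimator in equation (3) concrete, here is a minimal sketch in Python. The function name `alpha_model_counts` and the counts are hypothetical, and the code omits the stability correction, so it assumes every $x_{11}^{(i)}$ is positive.

```python
def alpha_model_counts(x11, x10, x01, x00_total):
    """Closed-form estimates of the class sizes n_i under the alpha model.

    x11, x10, x01: per-class counts for the verified cells (lists of length d).
    x00_total: observed total of the (0, 0) cell, whose class split is unknown.
    Assumes every x11[i] > 0 (no stability correction is applied).
    """
    # Per-class weight x01^(i) * x10^(i) / x11^(i) from equation (3).
    weights = [c * b / a for a, b, c in zip(x11, x10, x01)]
    total = sum(weights)
    # Observed counts plus the class's share of the unverified (0, 0) cell.
    return [a + b + c + x00_total * w / total
            for a, b, c, w in zip(x11, x10, x01, weights)]


# Hypothetical counts for d = 2 disease classes (illustration only).
x11, x10, x01, x00_total = [10, 40], [20, 10], [30, 20], 650
n_hat_alpha = alpha_model_counts(x11, x10, x01, x00_total)

# Implied odds ratio per class, using the estimated (0, 0) counts.
x00_hat = [n - a - b - c for n, a, b, c in zip(n_hat_alpha, x11, x10, x01)]
odds_ratios = [a * z / (b * c) for a, z, b, c in zip(x11, x00_hat, x10, x01)]
print(n_hat_alpha, odds_ratios)
```

For these made-up counts the 650 unverified double negatives are split 600 to 50 between the two classes, and the implied odds ratio is the same in both classes, which is exactly the homogeneity the α model assumes.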
Similarly, the closed-form MLE of $n_{i\theta}$ under the θ model is

$$\hat{n}_{i\theta} = n \hat{q}_{i\theta} = n \Big(\frac{x_{10}^{(i)}}{x_{11}^{(i)}} + 1\Big)\big(x_{01}^{(i)} + x_{11}^{(i)}\big)\Big[\sum_i \Big(\frac{x_{10}^{(i)}}{x_{11}^{(i)}} + 1\Big)\big(x_{01}^{(i)} + x_{11}^{(i)}\big)\Big]^{-1} = n\, \frac{x_{1+}^{(i)} x_{+1}^{(i)}}{x_{11}^{(i)}} \Big[\sum_i \frac{x_{1+}^{(i)} x_{+1}^{(i)}}{x_{11}^{(i)}}\Big]^{-1}, \tag{4}$$

which is essentially the same as equation (10) in Bohning and Patilea (2008) without the stability correction.

As a byproduct of this alternative parameterization, we can test the difference between $\hat{q}_{i\theta}$ and $\hat{q}_{i\alpha}$ (or equivalently, the difference between $\hat{n}_{i\theta}$ and $\hat{n}_{i\alpha}$) to infer whether the two assumptions give statistically significantly different predictions for the probability (or equivalently, the number) of individuals in a given disease class $i$. Although the formula for $\mathrm{se}(\hat{q}_{i\theta} - \hat{q}_{i\alpha})$ is tedious, its numerical value can be obtained easily through statistical software using the delta method.

We note that the estimated probabilities of disease classes under the α and θ models can differ statistically significantly, and potentially meaningfully, within the same study. For example, in the Health Insurance Plan Study for breast cancer screening in New York (Strax, Venet, Shapiro and Gross 1967), the estimated probability of having cancer assuming the α model is 4.8% with a 95% confidence interval (CI) of 0.3% to 9.3%, while the estimated probability of having cancer assuming the θ model is 7.5% with a 95% CI of 2.8% to 12.2%. The difference is 2.7% (95% CI: 1.4% to 4.0%) with a p-value less than 0.001. A difference of this size can have a substantial impact on cancer surveillance and prevention. Unfortunately, the data do not contain information to differentiate the α model from the θ model.

The alternative parameterization in (2) sheds light on maximum likelihood approaches in the setting considered here; the corresponding closed-form ML estimators under the α and θ models allow tests of the difference between the estimated probabilities of a specific disease class using the α versus the θ model.
Our results complement the estimators obtained in equations (10) and (15) by Bohning and Patilea (2008) using a capture-recapture approach, and ensure the usual MLE properties.
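As an illustration of the θ-model estimator, the sketch below computes $\hat{n}_{i\theta} = n \hat{q}_{i\theta}$ from per-class counts using the weights $x_{1+}^{(i)} x_{+1}^{(i)} / x_{11}^{(i)}$. The function name `theta_model_counts` and the counts are hypothetical, $n$ is taken to be the total of all four cells, and no stability correction is applied, so every $x_{11}^{(i)}$ must be positive.

```python
def theta_model_counts(x11, x10, x01, x00_total):
    """Closed-form estimates of the class sizes n_i under the theta model.

    x11, x10, x01: per-class counts for the verified cells (lists of length d).
    x00_total: observed total of the (0, 0) cell, whose class split is unknown.
    Assumes every x11[i] > 0 (no stability correction is applied).
    """
    n = sum(x11) + sum(x10) + sum(x01) + x00_total  # overall sample size
    # Per-class weight x_{1+}^(i) * x_{+1}^(i) / x11^(i).
    weights = [(a + b) * (a + c) / a for a, b, c in zip(x11, x10, x01)]
    total = sum(weights)
    return [n * w / total for w in weights]


# Hypothetical counts for d = 2 disease classes (illustration only).
x11, x10, x01, x00_total = [10, 40], [20, 10], [30, 20], 650
n_hat_theta = theta_model_counts(x11, x10, x01, x00_total)

# Implied theta per class: P(T1 = 1 | T2 = 1, D = i) / P(T1 = 1 | D = i).
thetas = [(a / (a + c)) / ((a + b) / n)
          for a, b, c, n in zip(x11, x10, x01, n_hat_theta)]
print(n_hat_theta, thetas)
```

With these made-up counts the estimated class sizes sum to the overall sample size of 780, and the implied θ is the same in both classes, as the θ model requires. The estimator only produces a sensible split when each estimated class size is at least as large as that class's verified count, which holds here.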