In manufacturing, we often use a binary measurement system (BMS) for 100% inspection to protect customers from receiving nonconforming product. We can assess the performance of a BMS by estimating its two misclassification rates, the consumer's and producer's risks. Here, we consider assessment plans and their analysis when a gold standard system (GSS) is available for the assessment study but is too expensive for everyday use. We propose a random-effects model to allow for variation in the misclassification rates within the populations of conforming and nonconforming parts. One possibility, here denoted the standard plan, is to randomly sample n parts and measure each of them once with the GSS and r times with the inspection system. We provide a simple analysis and planning advice for standard plans. In practice, the misclassification rates are often low and the underlying process has high capability. This combination of conditions makes the assessment of the BMS challenging. We show that the standard plan requires a very large number of measurements to yield precise estimators of the average misclassification rates and the true process performance. We consider an alternative design, here denoted the conditional assessment plan, where we select random samples from the sets of previously passed and failed parts. The sampled parts are each measured once with the GSS and r times with the inspection system. When we augment the data from the conditional plan with available baseline information on the overall pass rate, we show that we can precisely estimate the parameters of interest with far fewer measurements. In the online supplementary materials, we provide R code to find maximum likelihood estimates and corresponding approximate standard errors, and to find the asymptotic standard deviations of the estimators for a selected plan size and assumed parameter values, for both the standard and the conditional sampling plans.
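To make the standard plan concrete, the following is a minimal simulation sketch in Python, not the R code from the supplementary materials. It makes several assumptions that the abstract does not state: the GSS is treated as error-free, the producer's risk is taken to be the probability that a conforming part fails and the consumer's risk the probability that a nonconforming part passes, and the part-to-part variation in these rates is drawn from beta distributions. The function names, parameter names, and the pooled-proportion estimates are hypothetical illustrations and stand in for, rather than reproduce, the maximum likelihood analysis.

```python
import random


def simulate_standard_plan(n, r, p_conform, false_fail_beta, false_pass_beta, seed=0):
    """Simulate n parts, each measured once by the GSS and r times by the BMS.

    Assumptions (not from the abstract): the GSS is error-free, and each part's
    misclassification probability is drawn from a beta distribution, given as an
    (alpha, beta) tuple, to mimic a random-effects model.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        # One GSS measurement gives the part's true state.
        conforming = rng.random() < p_conform
        if conforming:
            # Part-specific chance that the BMS wrongly fails a conforming part.
            pass_prob = 1.0 - rng.betavariate(*false_fail_beta)
        else:
            # Part-specific chance that the BMS wrongly passes a nonconforming part.
            pass_prob = rng.betavariate(*false_pass_beta)
        # r repeated BMS measurements on the same part; record the number of passes.
        passes = sum(rng.random() < pass_prob for _ in range(r))
        data.append((conforming, passes))
    return data


def pooled_estimates(data, r):
    """Pooled-proportion estimates of the conforming rate and average risks."""
    n = len(data)
    n_conf = sum(1 for conforming, _ in data if conforming)
    n_nonconf = n - n_conf
    fails_on_conf = sum(r - passes for conforming, passes in data if conforming)
    passes_on_nonconf = sum(passes for conforming, passes in data if not conforming)
    return {
        "conforming_rate": n_conf / n,
        "producer_risk": fails_on_conf / (n_conf * r) if n_conf else None,
        "consumer_risk": passes_on_nonconf / (n_nonconf * r) if n_nonconf else None,
        "n_nonconforming": n_nonconf,
    }


if __name__ == "__main__":
    data = simulate_standard_plan(
        n=2000, r=5, p_conform=0.95,
        false_fail_beta=(1, 19), false_pass_beta=(1, 9), seed=1,
    )
    print(pooled_estimates(data, r=5))
```

With a high conforming rate, only a small fraction of the n sampled parts are nonconforming, so the consumer's risk estimate rests on few parts. Under these assumptions, that is one way to see why the standard plan needs so many measurements.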