This study empirically compared four preference elicitation approaches for valuing health states with the SF-6Dv2: the discrete choice experiment with time (DCETTO), best-worst scaling with time (BWSTTO), DCETTO combined with BWSTTO (DCEBWS), and the standard gamble (SG). A representative sample of the general population of Quebec, Canada, completed either 6 SG tasks or 13 DCEBWS tasks (i.e., 10 DCETTO tasks followed by 3 BWSTTO tasks). Choice tasks were designed with the SF-6Dv2. Several models were fitted to the SG data, and the conditional logit model was used for the DCE and BWS data. The performance of the SG models was assessed using prediction accuracy (mean absolute error [MAE]), goodness of fit (Bayesian information criterion [BIC]), the t-test, the Jarque-Bera (JB) test, the Ljung-Box (LB) test, the logical consistency of the parameters, and significance levels. The approaches were compared on acceptability (self-reported difficulty and quality of answers, and completion time), consistency (monotonicity of model coefficients), accuracy (standard errors), the magnitude of dimension coefficients, the correlation between the estimated value sets, and the range of estimated values. The variance scale factor was computed to assess the consistency of individuals' choices in the DCE and BWS approaches. Of the 828 people who completed the SG tasks and the 1208 who completed the DCEBWS tasks, 724 (SG) and 1153 (DCE) were included in the analysis. Although self-reported difficulty and answer quality did not differ significantly among approaches, the SG had the longest completion time, and participants excluded from the SG analysis were more likely to report difficulty in answering. The SG had the narrowest range of standard errors (0.012 to 0.015), followed by BWSTTO (0.023 to 0.035), DCEBWS (0.028 to 0.050), and DCETTO (0.028 to 0.052). BWSTTO had the highest number of non-significant and illogical parameters.
Pain was the most important dimension in all approaches. The correlation with SG utility values was strongest for DCEBWS (0.928), followed by BWSTTO (0.889) and DCETTO (0.849). The range of utility values generated by the SG (-0.143 to 1) tended to be narrower than those generated by the other three methods, and the BWSTTO range (-0.505 to 1) was narrower than those of DCETTO (-1.063 to 1) and DCEBWS (-0.637 to 1). The variance scale factor suggests that respondents had a broadly similar level of certainty or confidence in their DCE and BWS responses. The SG had the narrowest value set, the lowest completion rates, the longest completion time, and the best prediction accuracy, and it produced an unexpected sign for one level. Compared with DCETTO and DCEBWS, BWSTTO had a narrower value set, a shorter completion time, greater parameter inconsistency, and more non-significant levels. The DCEBWS results were more similar to those of the SG in the number of non-significant and illogical parameters and in correlation.