Maximum simulated likelihood is the preferred estimator of most researchers who work with discrete choice data. It allows estimation of models such as mixed multinomial logit (MXL), generalized multinomial logit, or hybrid choice models, which have now become the state of practice in the microeconometric analysis of discrete choice data. All these models require simulation-based approximation of multidimensional integrals, which can lead to several numerical problems. In this study, we focus on one of these problems: using from 100 to 1,000,000 draws, we investigate the extent of the simulation bias resulting from several different types of draws: (1) pseudo-random numbers, (2) modified Latin hypercube sampling, (3) randomized scrambled Halton sequences, and (4) randomized scrambled Sobol sequences. Each estimation is repeated up to 1,000 times. The simulations use several artificial datasets based on an MXL data generating process with different numbers of individuals (400, 800, 1,200), different numbers of choice tasks per respondent (4, 8, 12), different numbers of attributes (5, 10), and different experimental designs (D-optimal, D-efficient for the MNL model, and D-efficient for the MXL model). Our large-scale simulation study allows us to compare and draw conclusions about (1) how efficient different types of quasi-Monte Carlo simulation methods are and (2) how many draws one should use to make sure the results are of “satisfying” quality under different experimental conditions. Our study is the first to date to offer such a comprehensive comparison. Overall, we find that even with Sobol draws, the best-performing type, the number of draws required for the desired precision exceeds 2,000 in some of the 5-attribute settings and 20,000 in some of the 10-attribute settings considered.
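To make the comparison of draw types concrete, the following is a minimal sketch, not the study's implementation. It assumes SciPy's `scipy.stats.qmc` module, uses plain Latin hypercube sampling as a stand-in for the modified variant named above, and evaluates a single hypothetical binary choice probability with independent normal coefficients. The helper names `standard_normal_draws` and `simulated_choice_probability`, and all numeric values, are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm, qmc


def standard_normal_draws(kind, n_draws, n_dims, seed=0):
    """Return an (n_draws, n_dims) array of standard normal draws."""
    if kind == "pseudo":
        u = np.random.default_rng(seed).random((n_draws, n_dims))
    elif kind == "lhs":
        # Plain Latin hypercube sampling, standing in for the modified variant.
        u = qmc.LatinHypercube(d=n_dims, seed=seed).random(n_draws)
    elif kind == "halton":
        u = qmc.Halton(d=n_dims, scramble=True, seed=seed).random(n_draws)
    elif kind == "sobol":
        u = qmc.Sobol(d=n_dims, scramble=True, seed=seed).random(n_draws)
    else:
        raise ValueError(f"unknown draw type: {kind}")
    # Keep uniforms strictly inside (0, 1) so the inverse normal CDF stays finite.
    u = np.clip(u, 1e-12, 1.0 - 1e-12)
    return norm.ppf(u)


def simulated_choice_probability(x_diff, mu, sigma, draws):
    """Average the binary logit probability over draws of the random coefficients."""
    beta = mu + sigma * draws   # one coefficient vector per draw, shape (n_draws, n_dims)
    v = beta @ x_diff           # utility difference between the two alternatives, per draw
    return float(np.mean(1.0 / (1.0 + np.exp(-v))))


# Hypothetical 5-attribute example; all numbers are illustrative only.
x_diff = np.array([1.0, -0.5, 0.3, 0.8, -1.2])
mu = np.zeros(5)
sigma = np.ones(5)

estimates = {
    kind: simulated_choice_probability(
        x_diff, mu, sigma, standard_normal_draws(kind, 1024, 5, seed=1)
    )
    for kind in ("pseudo", "lhs", "halton", "sobol")
}
for kind, p in estimates.items():
    print(f"{kind:>6}: {p:.4f}")
```

Because the coefficient means are set to zero here, the exact probability is 0.5 by symmetry, so each estimate's distance from 0.5 is pure simulation error. The draw count of 1,024 is a power of two, which SciPy's Sobol generator expects for balanced samples.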