Evidence evaluation is a crucial process in many human activities, ranging from medical diagnosis to impression formation. The present experiments investigated which, if any, normative model best captures people's intuitions about the value of obtained evidence. Psychologists, epistemologists, and philosophers of science have proposed several models of the utility of obtained evidence with respect either to a focal hypothesis or to a constellation of hypotheses. We pitted the so-called optimal-experimental-design models (i.e., Bayesian diagnosticity, log₁₀ diagnosticity, information gain, Kullback-Leibler distance, probability gain, and impact) and measures L and Z against one another to compare their ability to describe these intuitions. Participants received words-and-numbers scenarios concerning 2 hypotheses and binary features. They were asked to evaluate the utility of "yes" and "no" answers to questions about features possessed in different proportions (i.e., the likelihoods) by 2 types of extraterrestrial creatures (corresponding to 2 mutually exclusive and exhaustive hypotheses). Participants evaluated either how helpful an answer was or how much an answer decreased or increased their beliefs, with respect either to a single hypothesis or to both hypotheses. We fitted mixed-effects models and used Akaike information criterion and Bayesian information criterion values to compare the competing models of the value of obtained evidence. Overall, the experiments showed that measure Z was the best-fitting model of participants' judgments of the value of obtained answers. We discuss the implications for the human hypothesis-evaluation process.
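
To make the competing models concrete, the following is a minimal sketch of how each measure could be computed for one obtained answer in a 2-hypothesis scenario. It assumes common formulations from the literature (impact as the summed absolute change in belief, information gain and Kullback-Leibler distance in bits, and L and Z evaluated with respect to the focal hypothesis h1); the exact definitions used in the experiments may differ. The function names, the dictionary keys, and the example likelihoods of 0.7 and 0.3 are illustrative assumptions, not values taken from the study, and the likelihoods are assumed to lie strictly between 0 and 1.

```python
import math


def posterior_h1(prior_h1, p_e_h1, p_e_h2):
    """Bayes' rule for P(h1 | e) with two exclusive, exhaustive hypotheses."""
    joint_h1 = prior_h1 * p_e_h1
    joint_h2 = (1.0 - prior_h1) * p_e_h2
    return joint_h1 / (joint_h1 + joint_h2)


def entropy_bits(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def evidence_values(prior_h1, p_e_h1, p_e_h2):
    """Value of an obtained answer e under each candidate measure.

    Assumed (hypothetical) formulations; likelihoods must be in (0, 1).
    L and Z are signed and refer to the focal hypothesis h1.
    """
    prior = [prior_h1, 1.0 - prior_h1]
    post1 = posterior_h1(prior_h1, p_e_h1, p_e_h2)
    post = [post1, 1.0 - post1]

    likelihood_ratio = p_e_h1 / p_e_h2
    delta = post1 - prior_h1  # change in belief in h1

    # Measure Z normalizes the belief change by the room left to move.
    if delta >= 0:
        z = delta / (1.0 - prior_h1)
    else:
        z = delta / prior_h1

    return {
        "bayesian_diagnosticity": max(likelihood_ratio, 1.0 / likelihood_ratio),
        "log10_diagnosticity": abs(math.log10(likelihood_ratio)),
        "information_gain": entropy_bits(prior) - entropy_bits(post),
        "kl_distance": sum(
            q * math.log2(q / p) for q, p in zip(post, prior) if q > 0
        ),
        "probability_gain": max(post) - max(prior),
        "impact": sum(abs(q - p) for q, p in zip(post, prior)),
        "measure_L": (p_e_h1 - p_e_h2) / (p_e_h1 + p_e_h2),
        "measure_Z": z,
    }


if __name__ == "__main__":
    # Hypothetical feature: 70% of creature type 1 and 30% of type 2 have it.
    yes_answer = evidence_values(prior_h1=0.5, p_e_h1=0.7, p_e_h2=0.3)
    # A "no" answer uses the complementary likelihoods.
    no_answer = evidence_values(prior_h1=0.5, p_e_h1=0.3, p_e_h2=0.7)
    for name in yes_answer:
        print(name, round(yes_answer[name], 4), round(no_answer[name], 4))
```

In this sketch the optimal-experimental-design measures are unsigned or depend on the whole constellation of hypotheses, whereas L and Z are signed with respect to h1, which mirrors the distinction between judging helpfulness and judging how much an answer increases or decreases belief.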