Problem definition: Studies have shown that the behavior of subjects in newsvendor experiments is not consistent with expected profit maximization, an assumption that is often made in the operations management literature. Although prospect theory is a popular model of behavioral decision making under uncertainty, it was considered inconsistent with observed newsvendor behavior (in particular, the pull-to-center effect) until a recent study by Long and Nasiry [Long X, Nasiry J (2015) Prospect theory explains newsvendor behavior: The role of reference points. Management Sci. 61(12):3009–3012.] proposed a prospect theory model that is consistent with the pull-to-center effect. However, how well this model represents newsvendor behavior relative to other plausible prospect theory models has not been explored in the literature. This paper takes a more comprehensive approach: it builds several prospect theory-based newsvendor models and evaluates how well each represents observed newsvendor behavior. An important feature of these models is that they are not only consistent with the pull-to-center effect but can also accommodate individual-level heterogeneity in order quantities, in accordance with findings from recent research. Academic/practical relevance: Designing effective supply chain processes and inventory systems requires that the underlying models represent observed newsvendor behavior reasonably well, especially in settings where most decisions are made by individuals. Our paper provides a rigorous basis for choosing a model when characterizing the decision-making process of a newsvendor. Moreover, our novel approach to model building and testing could serve as a template for selecting appropriate prospect theory models in contexts other than the newsvendor problem.
Methodology: Motivated by different types of reference points studied in the decision theory literature, we first build several newsvendor models that can theoretically accommodate individual-level heterogeneity in order quantities. We then evaluate these models using a multipronged approach based on theoretical criteria, goodness of fit, and empirical validity to determine the most appropriate one. Results: The model with mean demand as the stochastic reference point consistently outperforms the other models, reducing the prediction error by as much as 31% on the experimental data used for this study. Moreover, it is the only model consistent with all the empirical regularities considered in our paper. This suggests that mean demand is more likely to be adopted by experimental subjects as a reference point, perhaps because it is more salient than the other plausible reference points considered. Managerial implications: Since decisions in emerging markets are made predominantly by human retailers, we represent their behavior by the model with mean demand as the reference point and identify settings in which they could benefit from investing in decision support systems. We also demonstrate the benefits to a supplier of approximating its retailers’ behavior with this model rather than with the other prospect theory models considered in this paper.