Attempting to determine the basis of expertise by seeing what experts have in common is a flawed research strategy. The strategy is attractive, to be sure, and we have not been immune to its charms (Shanteau, 1988). The problem is that we cannot infer that those commonalities are unique to experts. Nonexperts may also share the identified characteristics. The data gleaned from examining experts can at best suggest provocative hypotheses.

The flaw is not merely a logical nicety; it cripples the attempt to attribute causality. It does not matter whether practice is deliberate or informal, or whether it lasts 10 years, 10,000 hours, or a vaguely specified long time. Even if we accepted EP's (the expert performance approach) evidence supporting the idea that no one who fails to meet the deliberate practice criterion becomes an expert, the justified conclusion would only be that practice is necessary for expertise. Whether practice is sufficient for expertise can be determined only by showing that everyone who meets the criterion attains that status. The selection process imposed by studying those already deemed expert rules out the possibility of establishing sufficiency.

A crucial distinction between PBA (our performance-based assessment approach) and EP is what is meant by performance. For us, performance refers to behavior, what the contender actually does. Discrimination and consistency are observable properties of behavior. For EP, the focus is sometimes on behavior, but more often on outcomes, with results attributed to the contender's behavior. Running speed and chess moves are behavioral measures. Those EP classics are far removed from the indirect assessments proposed in Ericsson's (2014) Table 1.

Consider the study in which we analyzed priority ratings for occupational therapy. Ericsson suggests assessing the rater's skill by following the applicants' lives with and without therapy. To do that, the analyst must quantify the quality of life for each applicant.
It is not obvious how to do that objectively or even subjectively. A related question is when the assessment should take place: after the therapy period, after a fixed number of years, or after death? Contrasting quality of life for people who did or did not get therapy is also problematic, because assignment would have been based on the rater's recommendation. Those who did not get therapy might have been judged either too healthy to need it or too impaired to benefit from it.

There is also a more fundamental problem built into EP's correspondence (to outcomes) approach. Quality of life could well be affected by circumstances and events unrelated to whether applicants received therapy. The variability introduced by such confounding threatens the integrity of between-groups comparisons.

A goal of EP is to understand the processes employed by experts. PBA does not dispute that ambition. Rather, we focus on the prior step: establishing expertise empirically. Experimental control allows valid comparisons across contenders, thereby allowing us to answer the crucial question, 'Whose behavior exhibits more expertise?' 'Fictitious' stimuli enhance experimental control and can facilitate exploration of how stimulus information influences judgments.
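The necessity-versus-sufficiency asymmetry discussed above can be made concrete with a toy simulation. The numbers here are entirely hypothetical (a 30% practice rate and a 5% second factor, chosen only for illustration, not drawn from any cited study): when expertise is constructed to require practice plus an unobserved second factor, sampling only those already deemed expert makes practice look universal, while sampling everyone who met the practice criterion reveals that practice alone rarely yields expertise.

```python
import random

random.seed(0)

N = 100_000
# Hypothetical population: practice is necessary but not sufficient,
# because expertise also requires an unobserved second factor.
people = []
for _ in range(N):
    practiced = random.random() < 0.30     # 30% meet the practice criterion
    other_factor = random.random() < 0.05  # 5% have the additional ingredient
    expert = practiced and other_factor    # necessity holds by construction
    people.append((practiced, expert))

experts = [p for p in people if p[1]]
practicers = [p for p in people if p[0]]

# Studying only those already deemed expert: practice looks universal.
p_practice_given_expert = sum(p[0] for p in experts) / len(experts)
# Studying everyone who met the criterion: most never became experts.
p_expert_given_practice = sum(p[1] for p in practicers) / len(practicers)

print(f"P(practiced | expert) = {p_practice_given_expert:.2f}")
print(f"P(expert | practiced) = {p_expert_given_practice:.3f}")
```

The first probability comes out at 1.00 and the second near 0.05, which is the selection problem in miniature: a sample restricted to experts cannot distinguish a world where practice is sufficient from one where it is merely necessary.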