Two Forum articles and an editorial in 2021 called for a rethink of how operations management (OM) scholars conceptualize the Toyota Production System (TPS) and Lean (the Western label given to certain elements of the TPS). In the lead article in that series, Hopp and Spearman (2021, pp. 10 and 11) observed that the evolution of Lean from a physics of flows to an organizational culture that supports “continual reduction of the cost of waste” requires us “to incorporate human behavior more scientifically.” They noted that “A more extensive, and largely untapped, resource is the wide array of cognitive research into heuristics and biases that has been developed by behavioral and decision scientists since the 1970s.” This brings to mind the description by Fujimoto (1999) of the TPS as a knowledge-management system, in contrast to the common understanding of the TPS (captured by the designation “Lean”) as buffer management. In this editorial, we continue the discussion started by Hopp and Spearman with a thought experiment in which we consider TPS practices as heuristics. An initial objective was to contribute to disentangling the TPS knowledge- and buffer-management roles, asking: Are buffer-management tools designed to support knowledge management, or do knowledge-management TPS tools exist to allow operations to run as lean as possible (i.e., manage buffers efficiently)? The heuristics lens revealed the mechanisms by which buffer removal can be used to create cues from the production environment that effectively inform decision making. More generally, we discovered that the exercise of interpreting TPS practices as heuristics provided insight into whether and how heuristics can contribute to an effective management of operations. We analyzed a sample of common practices that have been observed to be used by Toyota as one approach to implementing the TPS: jidoka, andon, and kanban. 
These practices transform front-line employees into decision makers by clearly specifying the information to be considered and the decision rule to be followed in a precisely defined situation. The resulting heuristics can be described as “production” heuristics, as their objective is to contribute to the line running smoothly on a day-to-day basis. We then considered practices that Toyota has been observed to use to prepare the environment for the successful deployment of these production-heuristic practices, including, for example, respect for workers, gemba, kaizen, and the “five whys”. These “exploration” heuristics are oriented toward problem solving through carving out regularities in what appears to be a chaotic landscape. Whereas the production heuristics use stopping rules to strictly limit the information to be considered and precisely define the decision rule, the exploration heuristics relax the search rules and strongly encourage the decision maker to maintain information in the decision process. They also allow the goal of the decision process to be flexible. In the production context, humans may make the error of assuming that more information is always better. In the exploration context, humans may make the error of moving forward with a decision based on too little information. Heuristics can help to avoid both types of error: We see TPS practices as either limiting or augmenting the amount of information to be considered, and as either precisely specifying or explicitly refusing to specify the objective of the decision. In contrast to key performance indicators, kaizen encourages decision makers to think about what it means to make things better. The “five whys” instruct decision makers to keep asking questions even though they think they already know the answer. We will present examples in which TPS performance declined when the mechanisms that keep exploration heuristics from prematurely eliminating information and flexibility were not maintained. 
Although conventional wisdom considers heuristics as always dramatically reducing information-in-use, our exploration of the TPS reveals that heuristics may direct decision makers to reduce or expand that information. TPS success can possibly be attributed in part to deploying heuristics that are designed to either produce efficiently or explore, with exploration heuristics creating an environment in which the production heuristics function well. Gigerenzer et al. (1999) proposed a typology of heuristics that first divides “reasonableness” (rational decision making) according to whether rationality is bounded or unbounded (Simon, 1955). Bounded rationality—which underlies essentially all business decisions—requires the decision maker to reduce the information considered, along the lines that Savage (1954) described as a small world. Decisions made under bounded rationality set as their objective to satisfice (making a decision that is good enough, Simon, 1956) rather than optimize. Heuristics—the decision rules used in satisficing—can be more or less “ecologically rational,” that is, can vary in their ability to produce decisions that qualify as rational while requiring little in terms of data and computational capacity. Gigerenzer and Gaissmaier (2011, p. 454) defined a heuristic as “… a strategy that ignores part of the information, with the goal of making decisions more quickly, frugally, and/or accurately than more complex methods.” Rationality remains bounded for ecologically rational heuristics. Ecologically rational heuristics—designated as “fast and frugal” by Gigerenzer, Todd and the ABC Research Group (1999)—have been observed to go beyond mere satisficing, sometimes performing as well as or better than optimization that uses considerably more data. 
Fast-and-frugal heuristics are exemplified by the gaze heuristic: a simple interception rule that athletes use to catch balls and animals use to hunt down prey, and that has been suggested as a contributor to the Royal Air Force's victory over the German Luftwaffe in World War II (Gigerenzer, 2007; Hamlin, 2017). It may also have played a role in US Airways Flight 1549's spectacular life-saving water landing in the Hudson River in 2009 (e.g., Hafenbrädl et al., 2016). This heuristic considers only the angle of gaze (a single piece of information) and involves no mathematical analysis. “Fast-and-frugal trees” (e.g., Martignon et al., 2008) have been used in medical, judicial, and military contexts (e.g., Katsikopoulos et al., 2021). The “take-the-best” heuristic (Gigerenzer & Goldstein, 1996)—a lexicographic strategy for inference—has been observed to outperform extensive data analysis (Czerlinski et al., 1999; Gigerenzer & Brighton, 2009). In OM, Bendoly (2020) classified as fast and frugal the nearest-neighbor sequencing heuristic used in logistics, as well as project-management heuristics that minimize either slack or processing time in assigning resources. He used these examples to illustrate how restricting the information considered can yield a reasonably good decision that is easily determined. Not all heuristics are fast and frugal. Heuristics are simple decision-making strategies that typically ignore much of the information that is potentially available. When that information turns out to be essential to making a good decision, not considering it may well produce irrational decisions, many of which can be attributed to a variety of biases. Hopp and Spearman cite hindsight, confirmation, and loss aversion as examples of bias in the context of Lean production (see Eckerd & Bendoly, 2015, for an in-depth discussion of these biases in OM). Gray et al. 
(2017) identified a heuristic, which they labeled “lowest per-unit landed-cost,” in which the production-location decision was based on a single factor. Limiting the offshoring decision to this single factor—ignoring readily available information that would have brought to light offshoring risks—resulted in quality problems, loss of intellectual property, and other unexpected management problems that were sufficiently severe that the offshoring decisions were reversed and production reshored. In other words, basing the decision on a single factor may work well (as exemplified by the gaze heuristic), but—as in this case—may lead to bias. Furthermore, when the (biased) offshoring decision was reversed, the reversal was based on extensive data collection and analysis, closer to what Kahneman et al. (1982) referred to as System 2 thinking than to a heuristic. Gray et al. observed no effort to enlarge the production-location decision process that had led to the original decision to offshore based solely on minimizing per-unit landed cost. The firms studied may thus be vulnerable to repeating the original myopic and biased decision. This example illustrates the potential for a heuristic to perform poorly in its current environment. Heuristics are typically described as a statistical adaptation to a given context (as occurs in the development of some fast-and-frugal trees used in areas like medicine) or as a performance rule that indicates which activities to prioritize or delay (Browning & Yassine, 2016). They can be taught, learned by observation, or discovered through experimentation and trial and error. Eckerd and Bendoly (2015, p. 
5) referred to the tendency in the field of behavioral operations to consider cognitive limitations of individuals as yielding flawed mental models, contrasting that view with the more nuanced one by Katsikopoulos and Gigerenzer (2013) that heuristics may be either an asset (ecologically rational) or a liability (biased) when used in decision making. The fast-and-frugal heuristics research program (e.g., Gigerenzer et al., 2011) has contributed to the identification of heuristics present in decision making, and the characterization of what makes these heuristics fit and perform well in a particular decision environment. When a heuristic in use is observed to produce biased decisions, is debiasing better achieved by moving from the heuristic to a fuller analysis done by the decision maker (i.e., moving toward constrained optimization along the lines suggested by Little (1970)), by recalibrating the heuristic to improve its performance, or by moving between a production and an exploration heuristic?

Simon (1955, 1956) developed the idea of bounded rationality, introducing satisficing as an alternative to optimizing. To economists, Simon (1955) emphasized the cognitive capacity of decision makers, and to psychologists (Simon, 1956) the environment (see Petracca, 2021, for a discussion of this division). Simon (1990, p. 7) brought these two sides together as he wrote, “Human rational behavior (and the rational behavior of all physical symbol systems) is shaped by a scissors whose two blades are the structure of task environments and the computational capabilities of the actor.” Heuristics are rules of thumb that can typically be decomposed into search, stopping, and decision rules. The ecological rationality of a given heuristic depends on how well these rules allow the “scissor blades” formed by the task environment and the computational capacity of the actor to operate together. 
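The decomposition into search, stopping, and decision rules can be made concrete with a small sketch of the take-the-best heuristic mentioned earlier (a minimal illustration; the cue names and their validity ordering are hypothetical):

```python
# Sketch of the take-the-best heuristic (Gigerenzer & Goldstein, 1996),
# decomposed into its search, stopping, and decision rules.
# The cue names and validity ordering below are hypothetical.

def take_the_best(option_a, option_b, cues_by_validity):
    """Infer which of two options scores higher on some criterion.

    option_a, option_b: dicts mapping cue name -> 1 (present) or 0 (absent).
    cues_by_validity: cue names ordered from most to least valid.
    """
    for cue in cues_by_validity:           # search rule: examine cues in validity order
        a, b = option_a[cue], option_b[cue]
        if a != b:                         # stopping rule: stop at the first discriminating cue
            return "A" if a > b else "B"   # decision rule: choose the option the cue favors
    return "tie"                           # no cue discriminates

# Hypothetical inference: which of two cities is larger?
cues = ["has_major_airport", "has_university", "has_exposition_site"]
city_a = {"has_major_airport": 1, "has_university": 1, "has_exposition_site": 0}
city_b = {"has_major_airport": 1, "has_university": 0, "has_exposition_site": 1}
print(take_the_best(city_a, city_b, cues))  # prints A: the second cue decides
```

The search rule expands the information in use only as needed, the stopping rule caps it at the first discriminating cue, and the decision rule maps that cue directly to a choice, which is what makes the heuristic fast and frugal.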
Production heuristics typically emphasize search limitation, whereas exploration heuristics may encourage maintaining more information in the decision-making process: A well-functioning heuristic may increase or decrease the information in use depending on the context. The description by Daft and Weick (1984, p. 289) of organizations as “interpretation systems” provides useful input to understanding the interactions of the search (environment) and stopping (cognitive capacity) dimensions of heuristics. Daft and Weick defined interpretation modes in terms of (1) whether or not the goal of interpretation is to identify a right answer that is assumed to exist, which determines whether or not the environment is seen as “analyzable,” and (2) whether the organization relates to the environment intrusively or passively. The resulting 2 × 2 matrix is reproduced in Table 1.

UNDIRECTED VIEWING (environment assumed unanalyzable; organization passive): Interpretation: constrained. Data: nonroutine, informal, arising from hunch, rumor, chance opportunities.
ENACTING (environment assumed unanalyzable; organization active): Experimentation, testing, coercion, invent environment. Learn by doing.
CONDITIONED VIEWING (environment assumed analyzable; organization passive): Interpretation: within traditional boundaries. Detection: passive. Data: routine, formal.
DISCOVERING (environment assumed analyzable; organization active): Formal search. Questioning, surveys, data gathering. Active detection.

While there is potential overlap between analyzable and well-structured problems, one can imagine a problem that would be reasonably well structured according to Simon's characteristics but would not count as analyzable. Daft and Weick (1984) used the game of 20 questions to introduce the notion of analyzable. When played in the normal sense, those asking the questions are seeking a right answer that they expect to exist, consistent with an analyzable environment. 
If the answer changes without notice in mid-game, the environment becomes unanalyzable: There is no longer a unique right answer, and the method of inquiry in use will need to expand to incorporate exploration of what is meant by a right answer in order to produce a reasonable outcome. This unannounced change in right answer could be captured by Simon's characterizations, so the problem could be argued to remain reasonably well structured while transitioning from analyzable to unanalyzable in terms of Daft and Weick's reasoning. The same holds for Savage's (1954) distinction between large and small worlds. It may be possible to improve analyzability by reducing the size of the world under consideration, but we can also imagine a small world that is unanalyzable and a large world that is analyzable. How would we classify the change of correct answer in the game of 20 questions used by Daft and Weick to exemplify unanalyzability? At first glance, the change in answer looks like it could correspond to De Meyer et al.'s (2002) chaos category. Further consideration, however, suggests that the change in right answer could be part of the game, and we could imagine different possibilities for knowableness across different contexts. If the game player is advised that this kind of shift might occur and given its probability, then the situation is categorized as known—yet would still count as unanalyzable in terms of Daft and Weick's characterization of the organization as an interpretation system. Use of a production heuristic is consistent with assuming that the environment is analyzable and that there is a right answer. Let us return to the example from Gray et al. (2017) in which use of a single-reason decision tool caused important information to be left out of the decision. 
The set of states of nature that needed to be considered in making the production-location decision was much larger than what was actually considered: The “world” in use was too small and poorly calibrated to the information that the decision required. These companies might have benefited from encouraging the use of exploration heuristics for the production-location decision to allow them to address—and possibly somewhat tame—the unanalyzability of their sourcing environment, moving their interpretation mode from what Daft and Weick referred to as undirected viewing (“informal, arising from hunch, rumor, chance opportunities”) to their concept of enacting (“experimentation, testing, coercion, invent environment… Learn by doing”; Daft & Weick, 1984, p. 289) or to what Browning and Ramasesh (2015) called directed recognition. We explore an approach to building a heuristic that can function effectively in an unanalyzable environment in the following section.

Feduzi et al. (2022) compared disconfirmation and counterfactual reasoning as methods of inquiry: Disconfirmation starts from the hypothesis that the environment is as assumed, and counterfactual reasoning from the hypothesis that the environment is not as assumed. A decision maker using disconfirmation as a method of inquiry will demonstrate confidence in the decision rule in use, and will continue to use it until strong evidence of its being incomplete or incorrect emerges. A decision maker using counterfactual reasoning will question the outcome of the decision rule in use, asking how the decision would change if the decision rule were incorrect or incomplete. As the environment becomes less analyzable—such that the range of outcomes and conditions that should be considered increases—Feduzi et al. (2022) argued that counterfactual reasoning provides more protection against confirmation bias than does disconfirmation. 
For situations in which the environment lends itself to the assumption that it is analyzable—that there is a right answer and seeking to find it makes sense—the disconfirmation inherent in the use of a heuristic does not produce unreasonable bias, and a heuristic may perform well. When, however, the environment cannot reasonably be assumed to be analyzable and where the goal is to enact the environment rather than accept it as given, a production heuristic based on disconfirmation can produce a decision that lacks ecological rationality. This emerges clearly in the production-location decisions described by Gray et al. (2017). Six offshoring decisions had to be reversed at considerable cost, suggesting that these decisions lacked rationality. Avoiding this expensive reshoring would have required broadening the decision rule to consider factors such as intellectual property protection and quality assurance that could go wrong under the single-reason decision rule that only considered per-unit cost. The assumption that the environment was sufficiently analyzable to be reduced to a single factor was thus not reasonable. Exploration heuristics encourage the decision maker to collect more information than would be intuitive: Moving from disconfirmation to counterfactual reasoning while maintaining clear and structured decision rules may permit the decision maker to make ecologically rational decisions while not prematurely freezing the question. Instead of the right answer being to minimize per-unit landed cost, the problems that arose suggested that a better right answer should include factors like quality and intellectual property: This possible change in right answer indicates some level of unanalyzability. Encouraging the decision makers observed by Gray et al. (2017) to engage in counterfactual reasoning may have created space for them to incorporate readily available data on factors that were likely to cause problems. 
Over time, the set of factors to be considered could be organized into a tree structure that would be sufficiently comprehensive to form the basis of a heuristic that would be ecologically rational, fast, and frugal—and disconfirmation would again become an appropriate method of inquiry. This calibration process would also provide advantages for future decisions, avoiding the potential for oscillation described by Gray et al. (2017). Table 2 sets out the outcomes when the method of inquiry in use matches or differs from the one recommended for a given state of the environment. Disconfirmation in a situation where goals and cues are changing may well produce irrational decisions that may take a long time to be fixed because of the absence of questioning and the challenging of assumptions. One can also observe cases, however, in which disconfirmation is the recommended method of inquiry—but decision makers are difficult to convince that a rational decision can emerge without considering all possible data. The decision as to whether to admit a patient with severe chest pain to a coronary care unit provides a good example of the possible downside of seeking as much data as possible in the decision process. Gigerenzer (2007) described a Michigan hospital in which 90% of such patients were sent by physicians to the coronary care unit as a defensive decision, potentially leading to overcrowding in the coronary care unit and excess exposure to hospital-transmitted infections. The researchers deployed a complex predictive model to be used with a pocket calculator. Use of the model improved physicians' decision making and reduced crowding in the coronary care unit. The researchers then removed the calculators, and discovered—unexpectedly—that the physicians' decision quality did not decline, because use of the model had helped them to identify specific cues that improved decision making. 
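The kind of cue-based triage that such a calibration process yields can be sketched as a small decision tree; the cues and their order below are illustrative, not the hospital's actual tree:

```python
# Sketch of a fast-and-frugal triage tree for the coronary-care decision.
# Each question is a potential exit: answering it may settle the decision
# immediately, so most patients are classified after one or two cues.
# The cues and their order are illustrative, not the hospital's actual tree.

def admit_to_coronary_care(st_segment_change, chest_pain_chief_complaint,
                           any_other_risk_cue):
    if st_segment_change:                # first cue: decide at once if present
        return True                      # admit to the coronary care unit
    if not chest_pain_chief_complaint:   # second cue: exit to a regular bed
        return False
    return any_other_risk_cue            # final cue decides the remaining cases

# A patient with no ST change and chest pain not the chief complaint goes to
# a regular bed, whatever the remaining cues would have said.
print(admit_to_coronary_care(False, False, True))  # prints False
```

The tree's stopping rule is what frees physicians from defensive over-collection of data: once a cue decides, no further information is sought.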
Key cues were then translated into a fast-and-frugal tree that both provided an effective stopping rule, based on tested cues, and made using this decision process acceptable, removing the need for physicians to defensively send the vast majority of patients for maximal care. The fast-and-frugal tree used here was effective because it freed the decision makers from excess data. The stylized optimization model in the above coronary-care example captured the essence of a patient-allocation decision. Decision makers, however, often do not trust complex models and continue to search for possible exceptions—while making ad hoc and costly decisions. In such cases, heuristics may outperform highly accurate but complex models. Little's (1970, p. B-468) observation that managers tend not to use models captures the image of a manager faced with a model that is hard to understand and incomplete, resulting in a long search incorporating many runs until the decision maker “screws up enough courage to make a decision.” The transition from analytical model to fast-and-frugal tree in the coronary-care case illustrates how a complex model can facilitate the development and calibration of heuristics: a promising answer to Little's concern. In this section, we have seen that a heuristic may produce ecologically rational decisions in an unanalyzable environment by combining structured decision rules with mechanisms to encourage counterfactual reasoning. In the following section, we explore a sample of TPS practices to observe whether and how these mechanisms function.

The TPS emerged from Japan in the late 1970s (e.g., Sugimori et al., 1977) and quickly captured worldwide attention. Toyota overcame strong competitive disadvantages (such as the need to transport cars from Japan and access to considerably fewer resources for research and development) to take market share from companies like General Motors. 
A key difference between the TPS and mass production concerned how to adjust the upstream production rate to match what the line needed or was able to handle downstream. Consider a downstream problem that caused in-process inventory to accumulate. TPS practices were designed to highlight problems as they arose so that attention would be directed to solving them, while also matching upstream production to downstream capacity. Rather than setting an objective to perform well with respect to traditional performance indicators like utilization or output, the TPS instead set as an objective to equip the employees encountering a production problem in their immediate environment to avoid inventory buildup and contribute to getting the problem fixed. Mass production, in contrast, was based on a decision rule that called for output maximization irrespective of conditions in the environment. Under mass production, workers were expected to focus on their tasks rather than pay attention to machines that were not functioning correctly. A worker who was not able to complete their assigned tasks during a cycle was obliged to leave the work incomplete, resulting in a unit that did not conform to specifications, while a worker who was able to fill the intermediate buffer between their workstation and an adjacent one was viewed as a good performer. Here we analyze three TPS tools that were designed to stop production when downstream needs decreased (either because demand was met, or because of a production problem): jidoka (also known as “autonomation,” or automation with a human touch), andon, and kanban. Jidoka combines human intervention with automation. When an automated machine develops a problem (e.g., a piece gets stuck, it runs out of material, or it goes out of alignment), under jidoka, it is designed to stop automatically and signal to a nearby worker that action needs to be taken. 
When the signal is given, the worker either fixes the problem or notifies maintenance that repair is needed. The key difference from mass production is that the worker is personally involved in getting the problem fixed or its existence communicated, rather than ignoring the problem to focus on maximizing output at their workstation. Monden (1983) described jidoka as often being used on a process with some degree of automation, but also being used as a concept in a manual process. The andon cord provided at each workstation allowed the worker to flag a production problem and request help from a supervisor: A worker who saw at the 70% mark of the cycle that the work would not be completed would pull the cord and have a supervisor sprint over to provide help. If the supervisor was able to help get the work completed within the cycle, the cord was pulled a second time and the line continued. If, however, the problem was not yet resolved, the line would stop at the end of the cycle. Creation of these line-stoppage decision rules not only avoided assembly of a defective unit, but also provided clear data as to where workers on the assembly line were most likely to be stressed. Monden (1983) considered andon to fall under the general category of jidoka, but in the West it was considered a startling departure from normal assembly-line operation, both in giving workers the right to stop the line and in workers being willing to admit that they were not keeping up—knowing that this personal performance-related information would be collected and analyzed by management. The kanban system was designed to limit the buildup of inventory between two adjacent workstations. An upstream worker is only allowed to begin production of an item if an unattached kanban is available. 
An inventory buffer between the two workstations absorbs variability to some degree, minimizing the effect of temporary slowdowns at one workstation on the adjacent workstation. Once there are enough kanbans to buffer temporary slowdowns, adding more kanbans only increases the waiting time of pieces in inventory. Toyota went a step further and implemented a system in which the number of kanbans was gradually reduced to draw attention to production-line imbalances. When one workstation blocked or starved the other, the blocked or starved workstation provided feedback that was expected to lead to learning. It is in the kanban system that we see most clearly Toyota's understanding of the relationship between buffer inventory and learning. While inventory buffers were used to smooth flow, there was a constant awareness of the ability of inventory to hide problems and line imbalances, and that careful management of inventory could lead to process improvement (see Suri & de Treville, 1986, for an in-depth discussion of the relationship between the exploratory stress created by reducing this buffer inventory and learning). By extracting their search, stopping, and decision rules, these three tools can be conceptualized as heuristics, as shown in Table 3. They allowed Toyota to deploy the cognitive capacity of its entire workforce toward smoothing production of high-quality products. Not only were the heuristics themselves fast and frugal, but they also brought into use a massive cognitive capacity that tended to be neglected in mass production. Returning to the Daft and Weick (1984) distinction between active and passive interpretation systems, we suggest that the TPS represents active interpretation in contrast to the passive interpretation encouraged by mass production. 
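The kanban pull rule discussed above lends itself to a minimal simulation sketch; the readiness probabilities and cycle counts here are illustrative assumptions, not Toyota parameters:

```python
import random

# Minimal sketch of the kanban pull rule between two adjacent workstations.
# The upstream station may start a piece only if a free kanban is available;
# a finished piece holds its kanban until the downstream station consumes it.
# Readiness probabilities and cycle counts are illustrative assumptions.

def run_line(n_kanbans, cycles, seed=0):
    rng = random.Random(seed)
    buffer = 0                       # finished pieces waiting (each holds a kanban)
    blocked = starved = 0
    for _ in range(cycles):
        up_ready = rng.random() < 0.9    # upstream able to produce this cycle
        down_ready = rng.random() < 0.9  # downstream able to consume this cycle
        if up_ready:
            if buffer < n_kanbans:   # a kanban is free: production authorized
                buffer += 1
            else:
                blocked += 1         # upstream idles: the imbalance becomes visible
        if down_ready:
            if buffer > 0:
                buffer -= 1          # kanban detaches and returns upstream
            else:
                starved += 1         # downstream idles: the imbalance becomes visible
    return blocked, starved

# Gradually removing kanbans lowers inventory but makes blocking and
# starvation events (the cues that drive learning) more frequent.
for k in (8, 3, 1):
    print(k, run_line(k, 2000))
```

The sketch illustrates the mechanism described above: reducing the number of kanbans trades smoothing for visibility, converting hidden line imbalances into countable blocking and starvation signals.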
The assembly line became an analyzable world in which front-line employees could confidently contribute to the company functioning well, because well-calibrated heuristics made clear to them what they were to do, where, and under which circumstances. There was no need for counterfactual inquiry, because the assumption that the local environment was analyzable yielded ecologically rational decisions: These three tools fit the description of production heuristics. Two observations arise from this analysis. First, these production heuristics performed well when an active interpretation system was combined with an analyzable environment. Consider a front-line employee who observes a problem (e.g., defective raw material, not being able to complete their operation by the end of a cycle, or their speed blocking or starving an adjacent workstation). On a traditional line, the employee may observe the problem but is not in a position to take action to resolve it, either in the immediate or longer term. TPS practices enable the employee to take action by pulling the andon cord, reorienting a part correctly, or working with the adjacent workstation to rebalance capacity. Thus, one outcome of the TPS is to make the employee interact more intrusively or actively with the local environment. Active interpretation then enables the cognitive capacity of the employees to be made available. Establishment of an analyzable local environment allows disconfirmation as the dominant method of inquiry in these decisions without risk of confirmation or other bias. Second, these well-calibrated heuristics produced rational and profitable decisions, contributing to a level of performance that continues to astound decades later. Our thought experiment is built around the idea that key TPS practices can be conceptualized around the selection, design, and calibration of heuristics to increase the ecological rationality of the resulting decisions. Our above dis