The true slime mold Physarum polycephalum, a single-celled amoeboid organism, is capable of efficiently allocating a constant amount of intracellular resource to its pseudopod-like branches that best fit the environment where dynamic light stimuli are applied. Inspired by the resource allocation process, the authors formulated a concurrent search algorithm, called the Tug-of-War (TOW) model, for maximizing the profit in the multi-armed Bandit Problem (BP). A player (gambler) of the BP should decide as quickly and accurately as possible which slot machine to invest in out of the N machines and faces an “exploration–exploitation dilemma.” The dilemma is a trade-off between the speed and accuracy of the decision making that are conflicted objectives. The TOW model maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. The conservation law entails a nonlocal correlation among the branches, i.e., volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). Owing to this nonlocal correlation, the TOW model can efficiently manage the dilemma. In this study, we extend the TOW model to apply it to a stretched variant of BP, the Extended Bandit Problem (EBP), which is a problem of selecting the best M-tuple of the N machines. We demonstrate that the extended TOW model exhibits better performances for 2-tuple-3-machine and 2-tuple-4-machine instances of EBP compared with the extended versions of well-known algorithms for BP, the ϵ-Greedy and SoftMax algorithms, particularly in terms of its short-term decision-making capability that is essential for the survival of the amoeba in a hostile environment.