Abstract

With an aging population and rising labor costs, traditional manual harvesting has become less economically efficient. Consequently, fully automated harvesting of cherry tomatoes with selective harvesting robots has become an active research topic. However, most current research focuses on harvesting large tomatoes individually, and less work addresses complete systems for harvesting cherry tomatoes in clusters. The purpose of this study is to develop a harvesting robot system capable of picking tomato clusters by cutting their fruit‐bearing pedicels and to evaluate the robot prototype in real greenhouse environments. First, to enhance grasping stability, a novel end‐effector was designed. This end‐effector uses a cam mechanism to perform cutting and grasping asynchronously with only one power source. Subsequently, a visual perception system was developed to locate the cutting points on the pedicels. This system works in two stages: rough positioning of the fruits in the far‐range view and accurate positioning of the cutting points on the pedicels in the close‐range view. Furthermore, it can adaptively infer the approach pose of the end‐effector from point cloud features extracted from fruit‐bearing pedicels and stems. Finally, a prototype of the tomato‐harvesting robot was assembled for field trials. The test results show that, for tomato clusters with unobstructed pedicels, the localization success rates for the cutting points were 88.5% and 83.7% in the two greenhouses, respectively, while the harvesting success rates were 57.7% and 55.4%, respectively. The average cycle time to harvest a tomato cluster was 24 s. The experimental results demonstrate the potential for commercial application of the developed tomato‐harvesting robot, and directions for future work are discussed based on an analysis of failure cases.