With rapid urbanization, large numbers of rural laborers are migrating to cities, leaving agriculture with a severe labor shortage and making agricultural modernization a priority. Autonomous picking robots are a crucial component of agricultural technological innovation, and their development drives progress across the agricultural sector. This paper reviews the current state of research on fruit- and vegetable-picking robots, focusing on key aspects such as vision system sensors, target detection, target localization, and end-effector design. Commonly used target recognition algorithms, including image segmentation and deep learning-based neural networks, are introduced. The challenges of recognizing and localizing targets in complex natural environments, such as occlusion by branches and leaves, overlapping fruit, and fruit oscillation, are analyzed. The three main types of end-effectors (clamping, suction, and cutting) are then discussed, together with the advantages and disadvantages of each design. The limitations of current agricultural picking robots are summarized in terms of operational complexity, research and development costs, and picking efficiency and speed. Finally, the paper offers a perspective on the future of picking robots, addressing environmental adaptability, functional diversity, innovation and technological convergence, and policy and farm management.