ABSTRACT
Apples are one of the most highly valued specialty crops in the United States. Recent labor shortages have made crop production tasks, including green fruit thinning, difficult for fruit growers. Current methods, including hand, chemical, and mechanical thinning, impose tradeoffs among selectivity, cost, tree damage, and speed. A robotic green fruit thinning system could potentially thin fruit selectively in a quick, cost‐effective, and non‐damaging manner. The machine vision system would be a critical component of robotic thinning: it would need to perform not only green fruit detection and segmentation, but also fruit‐stem pairing and clustering to support thinning decisions. A neural network‐based fruit and stem pairing algorithm was devised and evaluated, and an LSTM‐based clustering algorithm was devised and compared to the density‐based clustering algorithm OPTICS. The algorithms were evaluated on an image data set of GoldRush, Fuji, and Golden Delicious cultivars at the green fruit stage, in terms of overall performance, cultivar‐wise performance, cluster size‐specific performance, and feature importance. For fruit and stem pairing, the neural network‐based algorithm achieved an average precision (AP) of 81.4% on all fruits and stems, which rose to 90.6% when only fruits and stems with labeled angles were considered. For green fruit clustering, the LSTM‐based algorithm achieved a clustering success rate of 68.4%, whereas OPTICS obtained 50.9%. In future work, the algorithms will be incorporated into the pipeline of a green fruit thinning vision system, and point clouds and other 3D orchard information will be integrated to improve pairing and clustering performance.
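To make the OPTICS baseline concrete, the following is a minimal sketch of density‐based clustering applied to fruit centroids, using scikit-learn's `OPTICS` class. It is an illustration under stated assumptions, not the configuration used in this study: the centroid coordinates are made up, the use of 2D pixel centroids as the only feature is an assumption, and the `min_samples` and `eps` values are hypothetical choices.

```python
import numpy as np
from sklearn.cluster import OPTICS

# Hypothetical fruit centroids in pixel coordinates (x, y); not data from the study.
centroids = np.array([
    [100.0, 100.0], [115.0, 108.0], [108.0, 122.0],  # tight group of three fruits
    [400.0, 300.0], [418.0, 310.0],                  # tight group of two fruits
    [700.0, 50.0],                                   # isolated fruit
])

# Assumed settings: min_samples=2 lets two nearby fruits form a cluster, and
# eps=40 px is an assumed neighbourhood radius for DBSCAN-style extraction
# from the OPTICS reachability ordering.
optics = OPTICS(min_samples=2, cluster_method="dbscan", eps=40.0)
labels = optics.fit_predict(centroids)


def group_by_label(labels):
    """Map each cluster label to the list of fruit indices assigned to it."""
    groups = {}
    for index, label in enumerate(labels):
        groups.setdefault(int(label), []).append(index)
    return groups


# Label -1 marks fruits that OPTICS treats as noise, i.e. not part of any cluster.
clusters = group_by_label(labels)
for label, members in sorted(clusters.items()):
    name = "unclustered" if label == -1 else f"cluster {label}"
    print(f"{name}: fruits {members}")
```

In this sketch, fruits closer together than the assumed radius end up in the same cluster, and a fruit with no neighbour within that radius is labeled as noise. A purely distance‐based grouping of this kind has no access to stem or branch information.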