Automating grapevine agricultural tasks such as harvesting requires reliable methods for detecting the exact cutting points of grape bunches. Dynamically changing vineyard environments, differences between plant varieties, illumination, occlusion, color similarities, and varying contrast make the detection of grape stems in unstructured environments difficult. In this work, a methodology for detecting grape stems in images is proposed as a step towards an autonomous grape harvesting robot (ARG), an affordable and consistent alternative to the time-consuming specialized work of an experienced harvester. For this purpose, a regression convolutional neural network (RegCNN) is applied to a stem segmentation task. Twelve convolutional neural network (CNN) model architectures, derived by combining three feature learning sub-networks with four meta-architectures, are investigated. For the first time, stem detection is tackled as a regression problem in order to alleviate the data imbalance that may occur in vineyard images. To assess the effectiveness of the RegCNN models, the same CNN architectures are also tested in a typical classification (ClaCNN) setup. Comparative results on two datasets with different characteristics reveal that the regression models outperform the classification ones. Grape bunch stems are detected with an Intersection-over-Union (IU) of up to 98.18% with RegCNNs before post-processing optimization. Moreover, applying a Genetic Algorithm (GA)-based tuning mechanism to the post-processing parameters improves the IU to up to 98.90% for the UNET_MOBILENETV2 model, with acceptable real-time performance.
Compared to other similar methodologies, the proposed method provides higher correct stem detection rates in unconstrained and highly changing environments such as vineyards, and is therefore suitable for robust real-time stem identification in support of the agricultural tasks executed by a robot harvester.
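
To make the reported IU figures concrete, the following minimal sketch shows one way a per-pixel regression output could be turned into a binary stem mask and scored against a ground-truth mask. It is an illustration under stated assumptions, not the authors' implementation: the function names, the fixed 0.5 threshold, and the toy values are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def binarize(score_map: Sequence[Sequence[float]],
             threshold: float = 0.5) -> List[List[int]]:
    """Turn a per-pixel regression output into a binary stem mask.

    The 0.5 threshold is an assumed value chosen for illustration only.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in score_map]


def intersection_over_union(pred: Sequence[Sequence[int]],
                            truth: Sequence[Sequence[int]]) -> float:
    """IU = |pred AND truth| / |pred OR truth| for equally sized binary masks."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Assumed convention: two empty masks are treated as a perfect match.
    return 1.0 if union == 0 else intersection / union


# Toy 3x4 example with made-up values (not data from the paper).
scores = [
    [0.1, 0.8, 0.9, 0.0],
    [0.2, 0.7, 0.4, 0.0],
    [0.0, 0.6, 0.1, 0.0],
]
truth = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

pred = binarize(scores)
iou = intersection_over_union(pred, truth)
print(f"IU = {iou:.2f}")  # prints "IU = 0.80"
```

In this toy case the prediction misses one stem pixel, so four pixels overlap out of five in the union. A threshold like the one assumed here is the kind of post-processing parameter that a search procedure could tune, although the specific parameters optimized in this work are not stated in the abstract.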