Cooperative candidate relay sets (CRS) in opportunistic broadcast paradigms can significantly improve the efficiency of traditional broadcast schemes by simultaneously scheduling multiple vehicles in the data packet transmission process. Nonetheless, synchronizing multiple relays may cause enormous data redundancy and high delays in data delivery, resulting in a high cooperative transmission cost. High vehicle velocities and vehicular topology restrictions, including intersections, exacerbate this cost. Motivated by the trade-off between delay and delivery ratio, this paper aims to lower the cooperative transmission cost by reducing duplicate data packets and redundant relays. To this end, we propose an adaptive candidate relay set (ACRS) size scheme based on the Neutrosophic Set Analytic Hierarchy Process (NS-AHP) technique assisted by a Q-learning algorithm. The NS-AHP pair-wise comparison matrix uses redundancy, cooperative-based efficiency, and relay set stability as the decision-making attributes. Meanwhile, we use a hierarchical Q-learning approach, driven by the time-varying vehicular environment, to adjust the relative preferences among the attributes and among the alternatives (i.e., various CRS sizes). To learn these relative preference values optimally, reward functions are set dynamically at both levels of the hierarchy. Comprehensive simulations via NS-2 and SUMO are carried out to assess the efficacy of our ACRS scheme against three commonly used baselines: fixed-size, threshold-based, and probability-based techniques.
Overall, averaged over the urban and highway scenarios, the experimental results show that the ACRS scheme improves the packet delivery ratio by 3.77%, 5.36%, and 1.35%; reduces the single-hop delay by 34.78%, 42.085%, and 19.39%; reduces the end-to-end delay by 34.76%, 42.72%, and 21.35%; improves the transmission throughput by 34.24%, 41.54%, and 19.305%; and reduces the redundancy ratio by 43.69%, 50.66%, and 24.32% compared to the fixed-size, threshold-based, and probability-based techniques, respectively.
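
As a rough illustration of the two building blocks named above, the following minimal sketch derives attribute weights from a pair-wise comparison matrix and applies one tabular Q-learning update. It is not the scheme proposed here: it assumes classical (crisp) AHP with the row geometric-mean method rather than neutrosophic judgments, and the function names, the comparison values, the state and action labels, and the learning parameters are all hypothetical.

```python
import math

def ahp_weights(matrix):
    """Priority weights from a reciprocal pair-wise comparison matrix.

    Uses the row geometric-mean method of classical AHP
    (an assumption; the neutrosophic variant is not modeled here).
    """
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step:
    Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    """
    next_actions = q.get(next_state)
    best_next = max(next_actions.values()) if next_actions else 0.0
    q.setdefault(state, {}).setdefault(action, 0.0)
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]

# Hypothetical judgments over the three attributes named in the abstract.
ATTRIBUTES = ["redundancy", "cooperative_efficiency", "relay_set_stability"]
PAIRWISE = [
    [1.0, 2.0, 4.0],   # redundancy vs. the others
    [0.5, 1.0, 2.0],   # cooperative efficiency vs. the others
    [0.25, 0.5, 1.0],  # relay set stability vs. the others
]
weights = ahp_weights(PAIRWISE)

# Hypothetical state/action labels; an action stands for one CRS size.
q_table = {}
updated = q_update(q_table, "dense_traffic", "crs_size_3", 1.0, "sparse_traffic")
```

In a hierarchical arrangement like the one described, an update of this kind could run once for the attribute preferences and once for the CRS-size alternatives, each with its own reward signal; the sketch shows only a single level.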