This study presents the in-silico design and verification of a multi-agent reinforcement learning (RL) strategy for personalized glucose regulation in individuals with type 1 diabetes (T1D). The proposed framework forms a closed-loop system comprising a blood glucose (BG) metabolic model that acts as a virtual patient and a multi-agent soft actor–critic RL controller that serves as the basal-bolus advisor. The ability of the RL agents to meet recommended glycemic targets is compared with conventional therapy in three scenarios: (A) nominal conditions with three meals per day, (B) robustness to meal disturbances, and (C) robustness to insulin sensitivity disturbances. The evaluation metrics are the minimum, maximum, and mean glucose concentrations, the percentage of time spent in different BG ranges, and the average daily bolus and basal insulin doses. The RL-based basal-bolus advisor outperformed conventional therapy, reducing glycemic fluctuations and increasing the time spent in the target range of 70–180 mg/dL. In scenarios A, B, and C, time in the target range increased from 66.66 ± 34.97% to 92.55 ± 4.05%, from 64.13 ± 33.84% to 93.91 ± 6.03%, and from 58.85 ± 34.67% to 78.34 ± 13.28%, respectively. The RL approach also significantly reduced severe hyperglycemia (p ≤ 0.05) and reduced hypoglycemia. In scenarios A and B, hypoglycemia decreased from 14.2 ± 32.27% to 3.77 ± 4.01% and from 16.59 ± 32.42% to 2.63 ± 4.09%, respectively. In scenario C, no hypoglycemia occurred under either approach, which is attributed to the reduced insulin sensitivity. In addition, the mean daily basal insulin dose was significantly lower with the RL agent than with conventional therapy (p ≤ 0.05).
Overall, the results highlight the potential of the multi-agent RL strategy to improve glucose regulation and reduce the risk of severe hyperglycemia in patients with T1D.
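
As an illustration of the glycemic metrics named above, the following is a minimal sketch of how the minimum, maximum, and mean BG and the percentage of time below, within, and above the 70–180 mg/dL range might be computed from a simulated trace. It is not the evaluation code used in the study: the function name `glycemic_summary`, the dictionary keys, and the assumption of a uniformly sampled trace (so that sample counts are proportional to time) are all hypothetical choices made for this example.

```python
from statistics import mean


def glycemic_summary(bg_mg_dl, low=70.0, high=180.0):
    """Summarize a BG trace given in mg/dL.

    Assumption (not from the study): samples are uniformly spaced in time,
    so the fraction of samples in a band equals the fraction of time in it.
    """
    if not bg_mg_dl:
        raise ValueError("empty glucose trace")

    n = len(bg_mg_dl)
    below = sum(1 for g in bg_mg_dl if g < low)    # samples under the range
    above = sum(1 for g in bg_mg_dl if g > high)   # samples over the range
    in_range = n - below - above                   # low <= g <= high

    return {
        "min": min(bg_mg_dl),
        "max": max(bg_mg_dl),
        "mean": mean(bg_mg_dl),
        "pct_below": 100.0 * below / n,
        "pct_in_range": 100.0 * in_range / n,
        "pct_above": 100.0 * above / n,
    }


if __name__ == "__main__":
    # Toy trace for illustration only; these are not data from the study.
    trace = [60, 70, 100, 180, 200]
    print(glycemic_summary(trace))
```

The range boundaries are treated as inclusive here, so readings of exactly 70 or 180 mg/dL count as in range. Finer bands, such as a separate severe hyperglycemia threshold, would need cut-offs that this abstract does not specify.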