In this article, we address the online resource allocation problem in service queuing systems under uncertainty. In particular, the optimal control policy is derived from real-time data, i.e., without full knowledge of the system state. We employ the conditional value-at-risk (CVaR) to minimize the long-run average cost subject to constraints on the risk of instability, thereby keeping the service capability of the system at a high level under such on-the-fly uncertainty. We show that the proposed model naturally leads to a minimax saddle-point optimization problem. We first present an intuitive offline primal–dual learning method that achieves a desirable convergence rate. We then improve the algorithm by learning the optimal Lagrange multiplier with respect to the instantaneous system state, yielding an online primal–dual learning algorithm that attains the same convergence rate as its offline counterpart. Moreover, we demonstrate that the proposed method outperforms classical models by achieving a lower average delay under uncertainty, i.e., it improves stability and provides a better quality of service under ambiguity. A real case of inpatient bed allocation in hospital operations illustrates the performance in practice: the results show that our model reduces the average waiting time for a given total number of available beds under uncertain arrivals.
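To make the primal–dual structure concrete, the following is a minimal sketch of a generic stochastic primal–dual update for a CVaR-constrained cost-minimization problem, using the standard Rockafellar–Uryasev reformulation of CVaR. It is not the paper's algorithm: the scalar decision variable, the cost and delay functions, the step sizes, and the risk budget below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

alpha = 0.95                 # CVaR confidence level (illustrative)
d_max = 5.0                  # risk budget on the delay metric (illustrative)
eta_x, eta_lam = 0.01, 0.01  # primal / dual step sizes (illustrative)

theta = 0.5   # scalar allocation decision (stand-in for a full control policy)
t = 0.0       # Rockafellar-Uryasev auxiliary variable for CVaR
lam = 0.0     # Lagrange multiplier for the risk constraint

def sample_state():
    """Hypothetical one-step observation of the queueing system."""
    return rng.exponential(1.0)  # e.g., an arrival burst

def delay(theta, arrival):
    """Illustrative delay proxy: congestion grows as allocation shrinks."""
    return arrival / max(theta, 1e-6)

for k in range(10_000):
    xi = sample_state()
    d = delay(theta, xi)

    # Stochastic (sub)gradients of the Lagrangian
    #   L = cost(theta) + lam * (t + (d - t)_+ / (1 - alpha) - d_max),
    # with cost(theta) = theta**2 as a placeholder operating cost.
    excess = 1.0 if d > t else 0.0
    grad_theta = 2.0 * theta + lam * excess * (-xi / theta**2) / (1 - alpha)
    grad_t = lam * (1.0 - excess / (1 - alpha))

    theta = max(theta - eta_x * grad_theta, 1e-3)        # primal descent
    t -= eta_x * grad_t
    cvar_est = t + max(d - t, 0.0) / (1 - alpha)         # one-sample CVaR estimate
    lam = max(lam + eta_lam * (cvar_est - d_max), 0.0)   # dual ascent on R_+
```

The auxiliary variable t makes CVaR amenable to plain gradient steps, and projecting the multiplier onto the nonnegative reals keeps the dual iterate feasible; an offline variant would average gradients over a batch of samples, whereas the online variant above updates from each instantaneous observation.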