Rapid advances in sensors and communications have driven the expansion of Internet of Things (IoT) services, in which many devices reach the transport network through fixed or wireless access technologies and mobile Radio Access Networks (RANs). Supporting IoT in the RAN is challenging, however, because IoT services may generate many short, variable sessions that degrade performance for mobile users sharing the same RAN. Network slicing is a promising way to support heterogeneous service segments on a shared RAN, a crucial requirement of the upcoming fifth-generation (5G) mobile network. This paper proposes a two-level network slicing mechanism for enhanced Mobile Broadband (eMBB) and Ultra-Reliable Low-Latency Communications (URLLC) that provides end-to-end slicing at the core and edge of the network using the O-RAN architecture, with the aim of reducing latency for IoT services and mobile users sharing the same core and RAN. The problem is modeled at both levels as a Markov decision process (MDP) and solved using hierarchical reinforcement learning. At the high level, a software-defined networking (SDN) controller uses an agent trained with a Double Deep Q-Network (DDQN) to allocate radio resources to gNodeBs (next-generation NodeBs, the 5G base stations) according to the requirements of eMBB and URLLC services. At the low level, each gNodeB uses its own DDQN-trained agent to distribute its pre-allocated resources among its end-users. The proposed approach has been demonstrated and validated on a real testbed. Notably, it outperforms prevalent approaches in terms of end-to-end latency.
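To make the DDQN training referred to above more concrete, the following minimal sketch shows how a standard Double DQN bootstrap target is commonly computed: the online network selects the next action and the target network evaluates it. It is an illustration only, not the authors' implementation. The function name `ddqn_targets`, the array shapes, and the discount factor `gamma` are assumptions, and the state, action, and reward definitions used at each slicing level are not specified in this abstract.

```python
import numpy as np


def ddqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN targets for a batch of transitions.

    Hypothetical helper for illustration. Shapes assumed here:
      rewards, dones:               (batch,)
      q_online_next, q_target_next: (batch, n_actions)
    """
    rewards = np.asarray(rewards, dtype=float)
    dones = np.asarray(dones, dtype=float)
    q_online_next = np.asarray(q_online_next, dtype=float)
    q_target_next = np.asarray(q_target_next, dtype=float)

    # Action selection: greedy with respect to the ONLINE network.
    best_actions = np.argmax(q_online_next, axis=1)

    # Action evaluation: value of that action under the TARGET network.
    batch_index = np.arange(best_actions.shape[0])
    evaluated = q_target_next[batch_index, best_actions]

    # Terminal transitions (done == 1) do not bootstrap.
    return rewards + gamma * (1.0 - dones) * evaluated


# Toy batch with made-up numbers (assumption: two transitions, two actions).
example = ddqn_targets(
    rewards=[1.0, 0.5],
    dones=[0, 1],
    q_online_next=[[1.0, 2.0], [3.0, 0.0]],
    q_target_next=[[10.0, 4.0], [7.0, 8.0]],
    gamma=0.9,
)
```

In the toy batch, the first transition bootstraps from the target-network value of the action chosen by the online network (4.0, not the target network's own maximum of 10.0), which is the decoupling that distinguishes Double DQN from plain DQN. The second transition is terminal, so its target is the reward alone. In a two-level setup such as the one described, the same target computation could in principle be used for both the controller-level and the gNodeB-level agents, each with its own networks and reward signal.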