To meet the explosive growth of mobile traffic, the 5G network is designed to be flexible and to support multi-access edge computing (MEC), thereby improving the end-to-end quality of service (QoS). In particular, 5G network slicing, which allows a physical infrastructure to be split into multiple logical networks, balances network resource allocation among different service types with on-demand resource requests. However, achieving effective resource allocation across the end-to-end network is difficult because of the dynamic characteristics of slice requests, such as uncertain real-time resource demands and heterogeneous requirements. In this paper, we develop a reinforcement learning (RL)-based dynamic resource allocation framework for end-to-end network slicing with heterogeneous requirements in multi-layer MEC environments. We first design a hierarchical MEC architecture and formulate the resource allocation problem for end-to-end network slicing as a Markov decision process (MDP). Using proximal policy optimization (PPO), we then develop independently collaborative and jointly collaborative dynamic resource allocation algorithms that maximize resource efficiency while satisfying the QoS requirements of the slices. Experimental results show that the proposed algorithms recognize the characteristics of slice requests and incoming resource demands and allocate resources efficiently with a high QoS satisfaction rate.
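
To make the PPO-based allocation loop concrete, the following is a minimal sketch, assuming a discrete per-slice allocation action space and pre-computed advantage estimates. The class name `SlicePolicy`, the state layout, and the helper `ppo_update` are illustrative assumptions for exposition, not the paper's actual implementation; in the full framework, advantages would be estimated from rollouts of the MEC environment with a learned value function, and the reward would trade off resource efficiency against QoS satisfaction.

```python
# Hypothetical sketch of one PPO clipped-objective update for a slice
# resource allocator. Names and state/action encodings are assumptions.
import torch
import torch.nn as nn

class SlicePolicy(nn.Module):
    """Maps a slice-state vector (e.g., per-slice demand, current
    allocation, QoS slack) to a categorical distribution over discrete
    allocation actions (e.g., resource blocks granted to each slice)."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 64), nn.Tanh(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.distributions.Categorical:
        return torch.distributions.Categorical(logits=self.net(state))

def ppo_update(policy, optimizer, states, actions, old_log_probs,
               advantages, clip_eps: float = 0.2) -> float:
    """One PPO step: maximize the clipped surrogate
    E[min(r * A, clip(r, 1 - eps, 1 + eps) * A)], where r = pi / pi_old."""
    dist = policy(states)
    log_probs = dist.log_prob(actions)
    ratio = torch.exp(log_probs - old_log_probs)       # pi / pi_old
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps)
    loss = -torch.min(ratio * advantages, clipped * advantages).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this sketch, an independently collaborative variant would run one such policy per layer of the hierarchical MEC architecture, while a jointly collaborative variant would feed a shared state across layers into a single policy; both interpretations are assumptions drawn from the algorithm names above.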