With the combination of Mobile Edge Computing (MEC) and the next generation cellular networks, computation requests from end devices can be offloaded promptly and accurately by edge servers equipped on Base Stations (BSs). However, due to the densified heterogeneous deployment of BSs, the end device may be covered by more than one BS, which brings new challenges for offloading decision, that is whether and where to offload computing tasks for low latency and energy cost. This paper formulates a multi-user-to-multi-servers (MUMS) edge computing problem in ultra-dense cellular networks. The MUMS problem is divided and conquered by two phases, which are server selection and offloading decision. For the server selection phases, mobile users are grouped to one BS considering both physical distance and workload. After the grouping, the original problem is divided into parallel multi-user-to-one-server offloading decision subproblems. To get fast and near-optimal solutions for these subproblems, a distributed offloading strategy based on a binary-coded genetic algorithm is designed to get an adaptive offloading decision. Convergence analysis of the genetic algorithm is given and extensive simulations show that the proposed strategy significantly reduces the average latency and energy consumption of mobile devices. Compared with the state-of-the-art offloading researches, our strategy reduces the average delay by 56% and total energy consumption by 14% in the ultra-dense cellular networks.