Load balancing is an important issue in datacenter networks. Flowlet-based algorithms can balance traffic at fine granularity and do not suffer from the packet mis-sequencing problem. However, their performance is rather limited, or they require extra communication overhead. In this paper, we propose a local load-aware algorithm called Dynamic Roulette Wheel (DRW). In DRW, a roulette wheel is used to select a new path for each flowlet according to the local load. Each multipath source balances traffic across all its egress links without any communication overhead. Moreover, the flowlet granularity can be tuned dynamically, from a single packet to a whole flow. Finally, a Capacity Aggregation (CA) mechanism is designed to handle link or switch failures. We prove theoretically that DRW can achieve optimal global load balancing. Simulation results also show that, overall, DRW provides nearly the best delay performance and the lowest proportion of out-of-order packets among existing flowlet switching algorithms.
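
To illustrate the general idea of roulette-wheel selection over egress links, the following is a minimal sketch and not the DRW algorithm itself. The function names `roulette_select` and `weights_from_local_load`, and in particular the rule that a link's weight is the inverse of its local load, are assumptions made for this example; the abstract does not specify how DRW derives the wheel from the local load.

```python
import random


def roulette_select(weights, rng=random):
    """Pick an index with probability proportional to its weight.

    Generic roulette-wheel (fitness-proportionate) selection; the weights
    play the role of slice sizes on the wheel.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    spin = rng.random() * total  # position of the ball on the wheel
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if spin < cumulative:
            return index
    # Floating-point fallback: return the last link with a positive weight.
    return max(i for i, w in enumerate(weights) if w > 0)


def weights_from_local_load(loads, eps=1e-9):
    """Hypothetical weighting rule (an assumption, not taken from the paper):
    a less-loaded egress link receives a larger slice of the wheel."""
    return [1.0 / (load + eps) for load in loads]


if __name__ == "__main__":
    # Illustrative local loads (e.g. queued bytes) on three egress links.
    local_loads = [10.0, 40.0, 20.0]
    wheel = weights_from_local_load(local_loads)
    chosen = roulette_select(wheel)
    print(f"new flowlet assigned to egress link {chosen}")
```

Under this assumed weighting, a lightly loaded link is chosen more often than a heavily loaded one, yet every link keeps a nonzero chance of being selected, and the decision uses only information available at the source.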