Abstract

Autonomous racing has recently gained popularity for its entertainment value and its potential to advance autonomous driving in high-speed settings. Existing high-speed racing efforts, however, typically target a single road domain with fixed dynamics and cannot adapt policies across domains with large dynamics gaps. Meanwhile, existing policy adaptation methods either rely on experts to build new environments for policy training, or handle only small dynamics gaps in low-speed control tasks because of limited dynamics modeling and strict data-collection assumptions. To overcome these drawbacks, we introduce DAARL, a novel policy adaptation algorithm that combines adversarial and reinforcement learning to bridge the large dynamics gap between domains. Training proceeds in two stages. In the first stage, a domain transfer function is learned through adversarial learning to better capture the dynamics gap. This single transfer function, composed with the source domain, virtually reproduces the dynamics of different target domains without expert involvement; we call these virtual domains imaginary target domains. In the second stage, knowledge from the source-domain policy guides the reinforcement learning of a target-domain policy on an imaginary target domain, improving its convergence. Five experiments on a racing simulator with different road domains show that DAARL outperforms baselines in driving speed, stability, success rate, and domain scalability.
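The first-stage idea above — learning a transfer function that, composed with source-domain dynamics, mimics a target domain — can be illustrated with a minimal sketch. Everything here is hypothetical: the 1-D dynamics, the linear residual form, and the parameter names are invented for illustration, and the paper's adversarial discriminator is replaced by a plain regression loss on target-domain transitions to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 1-D dynamics with a deliberate dynamics gap:
# the target domain responds more weakly to actions than the source.
def source_step(s, a):
    return s + 0.9 * a          # source-domain dynamics

def target_step(s, a):
    return s + 0.5 * a          # target-domain dynamics

# Domain transfer function: a learned residual added to the source
# next-state, yielding an "imaginary target domain".
theta = np.zeros(2)             # residual parameters [w, b]

def imaginary_step(s, a, theta):
    return source_step(s, a) + theta[0] * a + theta[1]

# Stage 1 (simplified): fit the residual so imaginary transitions match
# collected target-domain transitions, via gradient descent on a
# mean-squared error (a stand-in for the adversarial signal).
lr = 0.1
for _ in range(500):
    s = rng.uniform(-1.0, 1.0, size=64)    # sampled states
    a = rng.uniform(-1.0, 1.0, size=64)    # sampled actions
    err = imaginary_step(s, a, theta) - target_step(s, a)
    grad_w = 2.0 * np.mean(err * a)        # d(loss)/d(theta[0])
    grad_b = 2.0 * np.mean(err)            # d(loss)/d(theta[1])
    theta -= lr * np.array([grad_w, grad_b])
```

For these toy dynamics the residual converges near theta ≈ [-0.4, 0], so the imaginary domain tracks the target dynamics closely; the second stage would then run reinforcement learning for the target-domain policy inside `imaginary_step`, guided by the source-domain policy.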
