Objective: Autonomous vehicles (AVs) have the potential to revolutionize the future of mobility by significantly improving traffic safety. This study presents a novel method for validating the safety performance of AVs in high-risk scenarios involving powered two-wheelers (PTWs). By generating high-risk scenarios from in-depth crash data, it addresses a limitation of public-road testing: the scenarios encountered there often lack the complexity and risk needed to evaluate the capabilities of AVs in high-risk situations.

Method: Our approach employs a Wasserstein generative adversarial network (WGAN) to generate high-risk scenes, focusing on PTW scenarios. We extract 314 car-to-PTW crashes from the China In-depth Mobility Safety Study–Traffic Accident database and simulate them using PC-Crash software. The simulated data are divided into scenes at 0.1-s intervals, and the WGAN is used to generate numerous high-risk scenes. Using a cumulative distribution function (CDF), we sample and analyze the vehicle's dynamic information to generate complete scenarios suitable for testing. The validation process uses the joint simulation platform of the SVL Simulator and Baidu Apollo to evaluate the AV's driving behavior and its interactions with PTWs.

Results: The model's generation results are evaluated by comparing distributions, with the Wasserstein distance as the indicator. The generator converges after approximately 200 epochs, with the iterator converging quickly. Subsequently, 10,000 new scenes are generated. The distribution of several key parameters in the generated scenes approximates that of the original scenes. After sampling, 64.76% of the generated scenarios are usable. Virtual simulations confirm the effectiveness of the scenario generation method: the crash rate in the generated scenarios is 16.50%, close to the original rate of 15.0%, which demonstrates the method's capacity to produce realistic and hazardous scenarios.

Conclusions: The experimental results suggest that the generated scenarios exhibit a level of risk similar to that of the original crashes and are effective for testing AVs. Consequently, they enhance the diversity of the scenario library and accelerate the overall AV testing process.
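The two statistical ingredients named above, sampling through a CDF and comparing distributions by Wasserstein distance, can be illustrated with a minimal one-dimensional sketch. This is not the study's implementation: the function names `sample_from_empirical_cdf` and `wasserstein_1d` are hypothetical, the speed values are synthetic stand-ins rather than crash data, and the WGAN itself (which estimates a Wasserstein distance over multi-parameter scenes) is not reproduced here.

```python
import random


def sample_from_empirical_cdf(values, n, rng):
    """Draw n samples by inverting the empirical CDF of `values`.

    A uniform draw u is mapped to a fractional rank in the sorted data,
    and the two neighbouring order statistics are linearly interpolated.
    """
    xs = sorted(values)
    m = len(xs)
    samples = []
    for _ in range(n):
        u = rng.random()              # uniform in [0, 1)
        pos = u * (m - 1)             # fractional rank in the sorted data
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        samples.append(xs[lo] + frac * (xs[hi] - xs[lo]))
    return samples


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    For equal sample sizes this is the mean absolute difference
    between the sorted values of the two samples.
    """
    if len(a) != len(b):
        raise ValueError("this sketch requires equal sample sizes")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


rng = random.Random(0)

# Synthetic stand-in for one scene parameter (e.g. a speed in km/h);
# these numbers are assumptions for illustration, not data from the study.
original = [rng.gauss(40.0, 10.0) for _ in range(1000)]

# Samples drawn through the empirical CDF should follow the original closely.
resampled = sample_from_empirical_cdf(original, 1000, rng)

# A deliberately shifted copy serves as a poorly matching reference.
shifted = [x + 20.0 for x in original]

print("original vs resampled:", wasserstein_1d(original, resampled))
print("original vs shifted:  ", wasserstein_1d(original, shifted))  # close to 20.0
```

A small distance for the resampled values and a distance near the 20-unit offset for the shifted copy show how the indicator separates a well-matched distribution from a mismatched one. For unequal sample sizes, a library routine such as `scipy.stats.wasserstein_distance` would be the more practical choice.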