The rapid development of intelligent transportation systems (ITS) has brought great convenience to people's lives and has given rise to a wide variety of intelligent vehicle services. However, with more than 1.35 million people killed in traffic accidents every year, the detection of abnormal driving behavior has attracted increasing attention. Although research on abnormal driving behavior detection has made some progress, numerous challenges remain. Firstly, it is difficult to manually analyze whether driving behavior is abnormal, which makes accurate labeling for machine learning costly. Secondly, few studies have considered class imbalance in abnormal driving behavior detection, although it is common in driving behavior data and has a significant negative impact on the detection model, especially in weakly supervised scenarios. Thirdly, assessing a driver's behavior solely on the basis of real-time performance is insufficient; historical behavior is equally important. To address these challenges, this paper proposes an abnormal driving behavior detection system based on weakly supervised learning under an edge computing architecture. The system accounts for the class imbalance problem and uses the historical information of vehicles to provide a low-cost and effective detection service. Firstly, abnormal driving behavior detection is modeled as a multi-instance learning (MIL) problem, which requires only coarse-grained labels for anomaly detection and therefore makes labeling inexpensive. Secondly, the class imbalance problem under weakly supervised learning is addressed by incorporating parameter transfer and pre-estimating abnormal scores. This approach enhances the model's sensitivity to rare classes and significantly improves the performance of weakly supervised detection models.
Thirdly, to learn the historical information of a vehicle, a summary mechanism is used for the roadside units (RSUs), which can provide different historical summaries of the vehicles to other detection nodes, making the model more effective for different abnormal events. Several experiments are carried out on both a simulated dataset and a real-world dataset to verify the performance of the proposed method. The results demonstrate that, compared with traditional supervised learning methods, the proposed method achieves near-optimal performance while using only coarse-grained labels, and that each of the proposed strategies clearly improves detection performance.
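
To illustrate why the MIL formulation needs only coarse-grained labels, the following minimal sketch treats a driving segment as a bag of per-window anomaly scores and labels the bag as a whole. It is an assumption-laden illustration, not the paper's model: the max-pooling bag score, the hinge-style ranking loss, the function names `bag_score`, `mil_ranking_loss` and `predict_bag`, and the `margin` and `threshold` values are all hypothetical choices commonly used in MIL anomaly detection.

```python
from typing import Sequence


def bag_score(instance_scores: Sequence[float]) -> float:
    """Bag-level anomaly score under the standard MIL assumption:
    a bag is abnormal if at least one of its instances is abnormal,
    so the bag score is the maximum instance score (assumed pooling)."""
    if not instance_scores:
        raise ValueError("a bag must contain at least one instance")
    return max(instance_scores)


def mil_ranking_loss(abnormal_bag: Sequence[float],
                     normal_bag: Sequence[float],
                     margin: float = 1.0) -> float:
    """Hinge ranking loss (hypothetical choice): push the top score of an
    abnormal bag above the top score of a normal bag by at least `margin`.
    Only bag-level (coarse-grained) labels are needed to form the pair."""
    return max(0.0, margin - bag_score(abnormal_bag) + bag_score(normal_bag))


def predict_bag(instance_scores: Sequence[float],
                threshold: float = 0.5) -> bool:
    """Flag a whole driving segment as abnormal when its bag score
    reaches the (assumed) decision threshold."""
    return bag_score(instance_scores) >= threshold


if __name__ == "__main__":
    # Per-window scores for two driving segments (made-up numbers).
    abnormal_segment = [0.1, 0.9, 0.2]
    normal_segment = [0.2, 0.3, 0.1]
    print(mil_ranking_loss(abnormal_segment, normal_segment))
    print(predict_bag(abnormal_segment), predict_bag(normal_segment))
```

In this sketch no individual window is ever labeled; the loss is computed from one segment known to contain an abnormal event and one known to be normal, which is what keeps the labeling cost low.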