As service-oriented computing systems become more dynamic and complex, the occurrence of faults increases dramatically. Fault prediction plays a crucial role in the service-oriented computing paradigm: it aims to reduce testing cost while maximizing testing quality, so that testing resources are used effectively and the reliability of service compositions increases. Although various fault prediction techniques have been considered in software testing, service-oriented systems have received less attention, and most studies have focused on testing single web services rather than service compositions. Moreover, prior work mainly addressed the detection of faulty and non-faulty services, ignoring the estimation of fault count and severity, as well as the prediction of when and where such faults would occur. In this paper, a multilateral model-based fault prediction and localization approach is proposed that uses deep learning techniques for testing web service compositions rather than single web services. It uniquely predicts not only the faulty services, but also the count and severity level of faults, their location, and the time at which they would occur. Three deep learning models are investigated: Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), and a proposed hybrid model based on both CNN and RNN. The proposed approach is language-independent, as it adopts process metrics rather than code metrics to overcome the unavailability of service source code. The experimental analysis used the main performance metrics on multiple public datasets to evaluate the efficiency and effectiveness of the approach. The results indicate that the hybrid CNN_RNN model achieves an average accuracy of 84%–95.7%, whereas the RNN and CNN models individually achieve 75%–90% and 70%–79.3%, respectively. Thus, relative to the RNN and CNN models, respectively, the hybrid model increases accuracy by 5%–10% and 15%–20% and achieves the lowest mean square error, by 30% and 60%.
In terms of time, the RNN model is the fastest, averaging 30–50 ms across datasets of varying sizes, compared to 79–102 ms and 177–224 ms for the CNN and hybrid CNN_RNN models, respectively. Thus, the RNN model consumes around 50% and 80% less time than the CNN and hybrid models, respectively.