Auto-scaling systems dynamically adjust the resources allocated to cloud-based services at runtime, an effective mechanism that lets services adapt to environmental changes. These systems are the foundation of elasticity in the modern cloud computing paradigm. Given the dynamic and uncertain nature of shared cloud infrastructure, cloud auto-scaling systems are among the most complex and sophisticated engineered artifacts, aiming to achieve self-aware, self-adaptive, and dependable runtime scaling. An effective solution requires accurate predictions of both the workload and the system metrics for future time periods. Various solutions have already been proposed for this problem, many of which rely on machine learning, statistical, and ensemble methods. In this paper, we treat auto-scaling as a sequence modeling problem and apply convolutional neural networks to predict the future workload of cloud services. We also use neural networks to obtain a mapping between the predicted workload and the real-time and future amounts of required resources. In addition, we propose a decision-making mechanism that takes into account different and sometimes conflicting user criteria and yields the best compromise decision; for this component, we use TOPSIS, a multi-criteria decision-making method. In the evaluation, we examine the prediction error, the extent of service level agreement violations, and the amount of resource under-utilization. The evaluations show that the proposed workload prediction approach achieves a 4 percent improvement over existing approaches.
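
To make the decision-making step concrete, the following is a minimal sketch of the standard TOPSIS procedure applied to a scaling decision. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the candidate actions, the three criteria (cost, predicted SLA violation rate, utilization), the weights, and all numeric values are hypothetical.

```python
import math


def topsis(matrix, weights, benefit):
    """Score alternatives with the standard TOPSIS procedure.

    matrix:  rows are alternatives, columns are criteria
    weights: one non-negative weight per criterion
    benefit: True where larger is better, False where smaller is better
    Returns one closeness score in [0, 1] per alternative (higher is better).
    """
    n = len(weights)
    total_weight = sum(weights)
    # Vector-normalize each criterion column (guarding against all-zero columns).
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) or 1.0 for j in range(n)]
    weighted = [
        [(weights[j] / total_weight) * row[j] / norms[j] for j in range(n)]
        for row in matrix
    ]
    # Ideal and anti-ideal points, taking the direction of each criterion into account.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]

    scores = []
    for row in weighted:
        d_best = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_worst = math.sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        denom = d_best + d_worst
        # If all alternatives are identical, both distances are zero.
        scores.append(d_worst / denom if denom else 0.5)
    return scores


# Hypothetical candidate scaling actions:
# [hourly cost, predicted SLA violation rate, expected utilization]
candidates = {
    "2 VMs": [2.0, 0.20, 0.95],
    "4 VMs": [4.0, 0.05, 0.70],
    "8 VMs": [8.0, 0.01, 0.35],
}
weights = [0.3, 0.5, 0.2]        # assumed user preferences
benefit = [False, False, True]   # cost and violations: lower is better

scores = topsis(list(candidates.values()), weights, benefit)
best = max(zip(candidates, scores), key=lambda pair: pair[1])[0]
print(dict(zip(candidates, scores)), "->", best)
```

The closeness score measures how near each candidate lies to the ideal point relative to the anti-ideal point, so conflicting criteria such as cost and SLA violations are traded off according to the supplied weights rather than optimized one at a time.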