Unlike the computational paradigms of past decades, which relied on individual (extremely powerful) computers or clusters of machines, cloud computing (CC) is becoming increasingly pertinent and popular. Computing resources such as CPU and storage are becoming cheaper, and the servers themselves are becoming more powerful, which enables clouds to host more virtual machines (VMs). A natural consequence is that many modern data centers experience very high internal traffic, because servers that belong to the same tenant communicate among themselves. The problem is accentuated when the VM deployment tools are not traffic-aware. In such cases, VMs with high mutual traffic often end up far apart in the data center network and are forced to communicate over unnecessarily long distances. The resulting traffic bottlenecks degrade the performance of both the application and the network as a whole, posing non-trivial challenges for the administrators of these cloud-based data centers. The problem, and consequently the solution, can quite naturally be divided into two consecutive phases. In the first, the task is to consolidate VMs into clusters so that VMs that communicate with each other fall into the same cluster. The second phase assigns these clusters to the available server racks. Both phases must be executed in a traffic-aware manner. This paper provides efficient, intelligent solutions for both phases. First, the VMs are consolidated by a VM clustering algorithm built on Learning Automata (LA). By mapping the clustering problem onto the Graph Partitioning (GP) problem, our LA-based solution reduces the total communication cost by between 34% and 85%.
Thereafter, the resulting clusters are assigned to the server racks by a cluster placement algorithm that uses a completely different intelligent strategy, namely Simulated Annealing (SA). This phase further reduces the total communication cost by between 89% and 99%. The analysis and results for different models and topologies demonstrate that the optimization is fast and computationally efficient.
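
To make the two objectives concrete, the following is a minimal illustrative sketch, not the algorithms evaluated in this paper. It assumes a symmetric VM-to-VM traffic matrix, exactly one cluster per rack, a precomputed rack-to-rack distance matrix, swap moves, and a geometric cooling schedule; all function and parameter names (`cut_cost`, `placement_cost`, `anneal_placement`, `t0`, `alpha`) are hypothetical. The LA-based clustering itself is not reproduced here; `cut_cost` only shows the graph-partitioning quantity that such a clustering would aim to reduce.

```python
import math
import random


def cut_cost(traffic, cluster_of):
    """Phase 1 objective: total traffic between VMs placed in different clusters."""
    n = len(traffic)
    return sum(
        traffic[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if cluster_of[i] != cluster_of[j]
    )


def placement_cost(cluster_traffic, rack_distance, rack_of):
    """Phase 2 objective: inter-cluster traffic weighted by rack-to-rack distance."""
    k = len(cluster_traffic)
    return sum(
        cluster_traffic[a][b] * rack_distance[rack_of[a]][rack_of[b]]
        for a in range(k)
        for b in range(a + 1, k)
    )


def anneal_placement(cluster_traffic, rack_distance,
                     steps=5000, t0=10.0, alpha=0.999, seed=0):
    """Simulated annealing over one-cluster-per-rack assignments (needs >= 2 clusters)."""
    rng = random.Random(seed)
    k = len(cluster_traffic)
    rack_of = list(range(k))  # assumed starting point: cluster i on rack i
    cost = placement_cost(cluster_traffic, rack_distance, rack_of)
    best, best_cost = rack_of[:], cost
    temp = t0
    for _ in range(steps):
        # Propose swapping the racks of two randomly chosen clusters.
        a, b = rng.sample(range(k), 2)
        rack_of[a], rack_of[b] = rack_of[b], rack_of[a]
        new_cost = placement_cost(cluster_traffic, rack_distance, rack_of)
        delta = new_cost - cost
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost
            if cost < best_cost:
                best, best_cost = rack_of[:], cost
        else:
            rack_of[a], rack_of[b] = rack_of[b], rack_of[a]  # reject: undo the swap
        temp *= alpha  # geometric cooling (an assumed schedule)
    return best, best_cost


if __name__ == "__main__":
    # Four clusters on four racks arranged in a line, so distance = |i - j|.
    distance = [[abs(i - j) for j in range(4)] for i in range(4)]
    traffic = [[0, 0, 0, 10], [0, 0, 10, 0], [0, 10, 0, 0], [10, 0, 0, 0]]
    print(anneal_placement(traffic, distance))
```

In this sketch the cost is recomputed from scratch after every swap, which keeps the code short; an implementation aimed at large data centers would more likely update the cost incrementally for the two clusters involved in the move.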