With the increasing complexity of smart city energy systems and rising energy demand, effective energy management solutions are crucial. Buildings now incorporate renewable energy sources and battery storage for efficient energy utilization, which makes optimal control strategies important. Compared to rule-based controllers and model-based methods, swarm and evolutionary algorithms offer cost-effective, stable, and scalable alternatives. However, their potential in data-rich environments and multi-building energy systems remains underexplored. This research bridges this gap by demonstrating the cooperative capabilities of population-based optimization agents for efficient energy management. Specifically, a novel algorithm and framework named Swarm Optimized Agents for Sequential Decision Making (SWOAM) is proposed. It combines an online k-means classifier and a swarm optimizer for optimal control of multi-building energy systems. By designing an online k-means learning strategy and an agent-based sequential episodic sampling approach, it becomes feasible to train a swarm of agents to find an optimal energy management policy for buildings in a district. For training and evaluation, we use the standardized CityLearn building energy management environment. The agents are evaluated on three key metrics: electricity cost, carbon dioxide emissions, and grid ramping. SWOAM delivers state-of-the-art performance and outperforms modern reinforcement learning and rule-based controllers.
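To make the online k-means component concrete, the following is a minimal illustrative sketch of a generic sequential (MacQueen-style) k-means update, in which each incoming observation moves only its nearest centroid toward itself by a running-mean step. It is not the SWOAM implementation: the class name `OnlineKMeans`, the method `partial_fit`, and the two-dimensional observations are assumptions introduced here for illustration, and nothing in the sketch uses the CityLearn API.

```python
import numpy as np


class OnlineKMeans:
    """Generic sequential k-means (hypothetical helper, not the SWOAM code)."""

    def __init__(self, initial_centroids):
        # Copy so the caller's array is never modified in place.
        self.centroids = np.array(initial_centroids, dtype=float)
        # Number of observations assigned to each centroid so far.
        self.counts = np.zeros(len(self.centroids), dtype=int)

    def partial_fit(self, x):
        """Assign one observation to its nearest centroid and update it."""
        x = np.asarray(x, dtype=float)
        distances = np.linalg.norm(self.centroids - x, axis=1)
        j = int(np.argmin(distances))
        self.counts[j] += 1
        # Running-mean step: the centroid becomes the mean of all
        # observations assigned to it; the initial guess carries no weight.
        self.centroids[j] += (x - self.centroids[j]) / self.counts[j]
        return j
```

In such a setup, the returned cluster index could serve as a discrete state label on which a population-based optimizer conditions its control actions; how SWOAM actually couples the two components is specified in the method description rather than by this sketch.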