High-Level Synthesis (HLS) has made FPGAs accessible to a broader audience by enabling high-level device programming and rapid microarchitecture customization through directives. However, manually selecting the right directives is a formidable challenge for programmers without a hardware background. This paper presents CollectiveHLS, an ultra-fast, knowledge-driven approach to optimizing HLS designs. Starting from the original source code, it automatically identifies and applies optimal directive configurations, with the goals of minimizing design latency and ensuring synthesizability. The approach is entirely data-driven and offers a generalized HLS tuning solution that does not rely on Quality of Result (QoR) models or meta-heuristics. CollectiveHLS is designed, implemented, and evaluated on around 60 applications drawn from well-established benchmark suites and GitHub repositories, all running on a Xilinx UltraScale+ MPSoC ZCU104. It achieves a geometric mean speedup of up to \(23.1\times\) over the official source code without directives, while maintaining synthesizability and feasibility rates of 100% and 96.6%, respectively, matching those of Vitis, the industry-standard framework for FPGA acceleration. Comparisons with resource over-provisioning, traditional genetic algorithm-based Design Space Exploration (DSE), and State-of-the-Art (SotA) approaches show that CollectiveHLS produces designs of comparable quality \(14.6\times\) faster on average. These results underscore the potential of our approach as an ultra-fast, automated solution for HLS optimization.