Modern FPGAs often contain an ASIC-like clocking architecture, which is crucial for achieving low clock skew and high performance. Conventional FPGA placement algorithms seldom consider clocking resources and may therefore produce placements that fail clock routing. To address the constraints of this clocking architecture, this paper presents a novel clock-aware placement algorithm for large-scale heterogeneous FPGAs. Our algorithm consists of three major stages: (1) a nonlinear global placement framework with clock fence region construction, (2) a clock-aware packing scheme, and (3) clock-aware legalization and detailed placement. We evaluate our algorithm on the ISPD 2017 Clock-Aware Placement Contest benchmark suite. Compared with the top three contest winners, our algorithm achieves the best overall routed wirelength: on average, it outperforms the top three winners by 3.6%, 7.5%, and 12.9% in routed wirelength, respectively.