Piecewise polynomial approximation (PPA) of nonlinear functions plays an important role in high-precision computing. In this article, we propose QPA, an integration of error-flattened, quantization-aware PPA methods that generates optimized coefficients for efficient hardware implementations of any polynomial order. QPA incorporates four key features to minimize both the fitting error and the hardware cost: it uses the Remez algorithm to compute the minimax fitting polynomial, combines the fitting and quantization operations to obtain an error-flattened characteristic, assigns a specific coefficient bit width to each multiplier to reduce the hardware cost, and fine-tunes the truncated coefficients to further reduce the fitting error. Experimental results show that our methods consistently achieve the lowest fitting error compared with the state-of-the-art error-flattened piecewise approximation methods. We synthesized the proposed designs in 28-nm TSMC CMOS technology. The results show that the proposed designs achieve up to 37.0% area reduction and 50.5% power consumption reduction compared with the state-of-the-art error-flattened piecewise linear (PWL) method, and up to 27.0% area reduction, 21.4% delay reduction, and 20.8% power consumption reduction compared with the state-of-the-art error-flattened piecewise quadratic (PWQ) method.
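To make the idea of quantization-aware fitting concrete, the following is a minimal sketch and not the QPA algorithm itself. It assumes a first-order (PWL) approximation of the convex function 2^x on [0, 1) with equal-width segments, and it replaces the Remez algorithm with the closed-form minimax line that exists for a convex function of degree 1. The function names (`minimax_line`, `quantize`, `fit_segment`), the target function, and the bit widths are all hypothetical choices made for illustration. The sketch compares naive rounding of the ideal coefficients against a small search that re-centers the intercept after the slope has been quantized.

```python
import math

def f(x):
    """Target nonlinear function (an assumption for this sketch)."""
    return 2.0 ** x

def minimax_line(a, b):
    """Closed-form minimax line m*x + k for the convex f on [a, b].

    The slope equals the chord slope; the intercept is placed midway
    between the chord and the parallel tangent, so the error
    equioscillates at a, c, and b.
    """
    m = (f(b) - f(a)) / (b - a)
    c = math.log2(m / math.log(2.0))  # point where f'(c) == m
    k = 0.5 * ((f(a) - m * a) + (f(c) - m * c))
    return m, k

def quantize(v, frac_bits):
    """Round v to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return round(v * scale) / scale

def max_err(m, k, xs):
    """Maximum absolute fitting error on the sample points xs."""
    return max(abs(f(x) - (m * x + k)) for x in xs)

def fit_segment(a, b, slope_bits, icpt_bits, n=257):
    """Return (naive, tuned), each a tuple (max_error, slope, intercept)."""
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    m, k = minimax_line(a, b)

    # Naive flow: fit first, then round each coefficient independently.
    mq, kq = quantize(m, slope_bits), quantize(k, icpt_bits)
    naive = (max_err(mq, kq, xs), mq, kq)

    # Quantization-aware flow: for each nearby quantized slope, re-center
    # the intercept for that slope, then try its quantized neighbors.
    best = naive
    m_ulp, k_ulp = 2.0 ** -slope_bits, 2.0 ** -icpt_bits
    for dm in (-1, 0, 1):
        mc = mq + dm * m_ulp
        resid = [f(x) - mc * x for x in xs]
        kc = quantize(0.5 * (min(resid) + max(resid)), icpt_bits)
        for dk in (-1, 0, 1):
            cand_k = kc + dk * k_ulp
            cand = (max_err(mc, cand_k, xs), mc, cand_k)
            if cand[0] < best[0]:
                best = cand
    return naive, best

def run_demo(segments=8, slope_bits=6, icpt_bits=10):
    """Worst-case error over all segments for both flows."""
    results = [fit_segment(i / segments, (i + 1) / segments,
                           slope_bits, icpt_bits) for i in range(segments)]
    worst_naive = max(r[0][0] for r in results)
    worst_tuned = max(r[1][0] for r in results)
    return worst_naive, worst_tuned

if __name__ == "__main__":
    naive, tuned = run_demo()
    print(f"naive rounding  : max error = {naive:.6f}")
    print(f"quantized search: max error = {tuned:.6f}")
```

Because the naive coefficient pair is kept as the starting candidate of the search, the tuned error in this sketch can never exceed the naive one on the sampled grid. Giving the slope fewer fractional bits than the intercept mirrors the idea of assigning a separate bit width to each multiplier input, although the specific widths here are arbitrary.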