Floating-point operations are widely used in critical embedded systems, in domains such as physics simulation, aerospace, nuclear simulation, image and digital signal processing, automatic control, optimal control, and finance. However, floating-point division is much slower than floating-point multiplication. To address this problem, many existing designs reduce the required number of iterations by exploiting large Look-Up Table (LUT) resources to approximate the mantissa of the quotient. In this paper, we propose a novel prediction algorithm that obtains an accurate quotient by predicting certain bits of the dividend and divisor, which reduces the required LUT resources. The final quotient is obtained by accumulating all of the partial quotients predicted by our algorithm. Experimental results show that only 3 to 5 iterations are required to obtain the final quotient of a floating-point division. In addition, our design occupies 0.84% to 3.28% (1732 to 6798 LUTs) and 5.04% to 10.08% (1916 to 3832 ALUTs) of the logic resources when ported to Xilinx Virtex-5 and Altera Stratix-III FPGAs, respectively. Furthermore, our design allows users to track remainders and to set customized remainder thresholds to suit a specific application.
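The iterative scheme summarized above, in which each iteration predicts a group of quotient bits and the partial quotients are accumulated while the remainder is tracked, can be sketched in software. This is only an illustrative model under assumed parameters (8 predicted bits per step, 4 steps), with exact chunk-wise long division standing in for the paper's hardware bit-prediction step:

```python
def chunked_div(n, d, chunk_bits=8, iters=4):
    """Accumulate partial quotients over several iterations.

    n, d : integer mantissas with 0 <= n < d (normalized fraction).
    Returns (q, r) where q approximates n/d scaled by
    2**(chunk_bits * iters) and r is the trackable remainder,
    i.e. q * d + r == n << (chunk_bits * iters).
    Note: q_i here is computed exactly; the paper's design instead
    predicts it from a few leading bits of n and d.
    """
    assert 0 <= n < d
    q, r = 0, n
    for _ in range(iters):
        r <<= chunk_bits            # bring in the next quotient digits
        q_i = r // d                # predicted partial quotient (< 2**chunk_bits)
        q = (q << chunk_bits) | q_i  # accumulate predicted quotients
        r -= q_i * d                # remainder stays in [0, d)
    return q, r

# Example: 1/3 to 32 fractional bits in 4 iterations of 8 bits each.
q, r = chunked_div(1, 3)
```

Because the remainder `r` is carried across iterations, a user can compare it against a chosen threshold after each step and stop early once the quotient is accurate enough for the target application, which mirrors the customizable-threshold feature described above.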