Convolutional neural networks (CNNs) are widely used for vision-based hand gesture recognition. Several existing CNN architectures for gesture classification achieve high accuracy, but their memory footprint and computational cost make them difficult to deploy on low-power embedded devices. To address this issue, we present an efficient quantized CNN framework for real-world hand gesture recognition. The proposed quantized CNN is designed and implemented on an FPGA using the FINN-based pipelined streaming architecture. In addition, hardware-level optimizations are applied to reduce resource usage and speed up memory access. Experimental results show that the developed recognition system achieves an average accuracy of 92% on the numeral database of Indian Sign Language (ISL). Our optimized design attains an inference latency of 0.85 ms for real-time single-gesture prediction on the PYNQ Zynq UltraScale (ZU) FPGA while consuming only 3.63 W of power. Compared with previous designs, the proposed design achieves a better trade-off between hardware resource utilization and speed.
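As a rough illustration of what the reported figures imply, the sketch below derives an energy-per-inference estimate and a sequential throughput bound from the stated 0.85 ms latency and 3.63 W power. It assumes that the reported power is drawn for the full duration of one inference and that gestures are processed one at a time; neither assumption is stated in the abstract, and a pipelined streaming design may sustain a higher throughput. The function names are illustrative only.

```python
# Back-of-the-envelope figures derived from the numbers reported in the abstract.
# Assumptions (not from the paper): power is constant during an inference, and
# inferences are issued strictly one after another (no pipelining overlap).

LATENCY_S = 0.85e-3   # reported single-gesture inference latency, in seconds
POWER_W = 3.63        # reported power consumption, in watts


def energy_per_inference_mj(power_w: float, latency_s: float) -> float:
    """Energy for one inference in millijoules: E = P * t."""
    return power_w * latency_s * 1e3


def sequential_throughput(latency_s: float) -> float:
    """Inferences per second when each inference starts after the previous one ends."""
    return 1.0 / latency_s


if __name__ == "__main__":
    print(f"Energy per inference: {energy_per_inference_mj(POWER_W, LATENCY_S):.4f} mJ")
    print(f"Sequential throughput: {sequential_throughput(LATENCY_S):.1f} inferences/s")
```

Under these assumptions, one gesture prediction costs about 3.1 mJ, and the latency alone permits more than a thousand predictions per second.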