Ultra-high-definition television (UHDTV) imposes extremely high throughput requirements on video encoders based on the High Efficiency Video Coding (H.265/HEVC) and Advanced Video Coding (H.264/AVC) standards. Context-adaptive binary arithmetic coding (CABAC) is the entropy coding component of these standards. In very-large-scale integration (VLSI) implementations, CABAC is known to be difficult to pipeline and parallelize effectively because of the critical bin-to-bin data dependencies in its algorithm. This paper addresses the throughput requirement of CABAC encoding for UHDTV applications. The proposed optimizations, including prenormalization, hybrid path coverage, and lookahead rLPS, reduce the critical path delay of binary arithmetic encoding (BAE) by exploiting the incompleteness of data dependencies in rLPS updating. Meanwhile, the number of bins that BAE delivers per clock cycle is increased by the proposed bypass bin splitting technique. The context modeling and binarization components are also optimized. As a result, our CABAC encoder delivers an average of 4.37 bins per clock cycle. Its maximum clock frequency reaches 420 MHz when synthesized in a 90 nm process. The corresponding overall throughput is 1836 Mbin/s, which is 62.5% higher than that of the state-of-the-art architecture.
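
The relation between the reported figures can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the baseline value is derived from the stated 62.5% gain rather than quoted from any source. Multiplying the rounded average of 4.37 bins per cycle by 420 MHz gives about 1835.4 Mbin/s, which agrees with the reported 1836 Mbin/s if the average is slightly above 4.37 before rounding.

```python
def throughput_mbin_per_s(bins_per_cycle: float, clock_mhz: float) -> float:
    """Throughput in Mbin/s: bins per cycle times millions of cycles per second."""
    return bins_per_cycle * clock_mhz


def implied_baseline_mbin_per_s(throughput: float, gain_percent: float) -> float:
    """Baseline throughput implied by a relative gain (derived, not reported)."""
    return throughput / (1.0 + gain_percent / 100.0)


# Figures taken from the abstract.
estimated = throughput_mbin_per_s(4.37, 420.0)
baseline = implied_baseline_mbin_per_s(1836.0, 62.5)

print(f"Estimated throughput from rounded figures: {estimated:.1f} Mbin/s")
print(f"Implied prior-art throughput: {baseline:.1f} Mbin/s")
```

The second value is only what the stated percentage implies; it should not be read as a figure reported for any particular prior design.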