Abstract
The implementation of a 16×16 discrete cosine transform (DCT) chip using a concurrent architecture is presented. The chip contains 32 processing elements working in parallel and a random-access memory (RAM) that performs a 16×16 matrix transposition. The structure is highly regular and modular, and thus very efficient for VLSI implementation. The chip was designed for real-time processing of video data sampled at 14.3 MHz. It performs the equivalent of half a billion multiplications and accumulations per second. Fabricated in 2-μm double-metal CMOS technology, the chip contains approximately 73,000 transistors, which occupy a 7.2×7.0-mm² area. The 68-pad die measures 8.3×8.1 mm². It is fully functional and is the first working 16×16 DCT chip. The architecture and accuracy studies for finite-wordlength processing are presented. The circuit design and layout using the symbolic design tool MULGA are described in detail. Possible variations for multipurpose applications (variable transform sizes, forward and inverse transforms) are also discussed.
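The role of the transposition memory can be illustrated with a small software model. The sketch below is not the chip's circuit. It assumes the common row-column decomposition, in which a 2-D DCT is computed as a 1-D DCT over every row, a matrix transposition, and a second 1-D DCT pass. The orthonormal DCT-II scaling, the floating-point arithmetic, and all function names are assumptions made for illustration; the abstract does not specify them, and the chip itself uses finite-wordlength arithmetic.

```python
import math

N = 16  # transform size given in the abstract


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C with C[k][i] = a(k) * cos((2i+1) k pi / (2n)).

    The orthonormal scaling a(k) is an assumption, not taken from the abstract.
    """
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * i + 1) * k * math.pi / (2 * n))
                     for i in range(n)])
    return rows


def dct_rows(block, c):
    """1-D DCT of every row, written as direct multiply-accumulate sums."""
    return [[sum(c[k][i] * row[i] for i in range(len(row)))
             for k in range(len(c))]
            for row in block]


def transpose(m):
    """Software stand-in for the transposition RAM."""
    return [list(col) for col in zip(*m)]


def dct2d(block, c=None):
    """Row-column 2-D DCT: row pass, transpose, row pass again.

    The final transpose only restores the conventional (vertical, horizontal)
    index order, so the result equals C * X * C^T.
    """
    if c is None:
        c = dct_matrix(len(block))
    stage1 = dct_rows(block, c)              # X * C^T
    stage2 = dct_rows(transpose(stage1), c)  # C * X^T * C^T
    return transpose(stage2)                 # C * X * C^T


def macs_per_second(sample_rate_hz, n=N):
    """MAC rate if each sample needs n MACs per pass in direct matrix-vector form."""
    return sample_rate_hz * 2 * n


if __name__ == "__main__":
    flat = [[1.0] * N for _ in range(N)]
    coeffs = dct2d(flat)
    # A constant block should put all of its energy in the DC coefficient.
    print(round(coeffs[0][0], 6), round(abs(coeffs[3][5]), 6))
    print(macs_per_second(14.3e6))
```

Under the direct matrix-vector assumption, each sample takes part in 16 multiply-accumulates per pass, or 32 over both passes. At a 14.3-MHz sample rate that gives about 458 million multiply-accumulates per second, which is consistent with the "half a billion" figure quoted in the abstract.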