As inexpensive imaging chips and wireless telecommunications are incorporated into an increasing array of portable products, the need for high-efficiency, high-throughput embedded processing will become an important challenge in computer architecture. Videocentric applications, such as wireless videoconferencing, real-time video enhancement and analysis, and new, immersive modes of distance education, will exceed the computational capabilities of current microprocessor and digital signal processor (DSP) architectures. A new class of embedded computers, portable video supercomputers, will combine supercomputer performance with the energy efficiency required for deployment in portable systems. We examine one candidate portable video supercomputer, a low-memory, monolithically integrated SIMD architecture (SIMPil) that exploits the substantial data parallelism that exists in a suite of implemented video processing applications. The processing element microarchitecture is optimized using a novel technique that combines application simulation and technology modeling to provide a desired combination of performance, area, and energy consumption. Analysis results show that, for MPEG encoding, a SIMPil array implemented in 100 nm CMOS provides 100x greater performance and 10x higher energy efficiency than today's DSPs implemented in 150 nm CMOS. This is accomplished using execution parallelism and a carefully selected processing element design. This research demonstrates that appropriately designed SIMD arrays, implemented monolithically in today's technology, can provide high performance and high efficiency for embedded video processing.