Abstract
Multimedia applications such as digital image compression, digital video compression, and digital audio processing are increasingly executed on embedded devices, so the processing cores in these devices must offer both high performance and programmability. In general, multimedia applications and machine learning algorithms consist of repeated arithmetic or logic operations and table-lookup coding operations. To speed up both kinds of operation, the CAMX (Content Addressable Memory-based massive-parallel SIMD matriX core) has been proposed as an accelerator for a CPU core. The CAMX has several processing elements for highly parallel processing and two CAM modules for fast table-lookup processing. In this paper, we implement the self-organizing map algorithm on the CAMX and compare it with a Raspberry Pi. The CAMX is about 2.6 times faster than an Arm processor with NEON at the same frequency. In addition, the CAMX can reduce its operating frequency to one-third when it only needs to match the processing speed of the Arm processor with NEON.
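For readers unfamiliar with the self-organizing map, the following is a minimal sketch of one training step in plain Python. It is not the CAMX implementation, which the abstract does not describe; the 1-D map layout, the Gaussian neighbourhood, and the names `best_matching_unit`, `som_step`, `lr`, and `radius` are assumptions made for illustration. It shows the two phases that make the algorithm a natural fit for a parallel core: a distance computation over all nodes, followed by an element-wise weight update.

```python
import math

def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    # Squared Euclidean distance for every node. The per-node distances are
    # independent, so a SIMD core could compute them in parallel.
    dists = [sum((w_i - x_i) ** 2 for w_i, x_i in zip(w, x)) for w in weights]
    return min(range(len(dists)), key=dists.__getitem__)

def som_step(weights, x, lr=0.5, radius=1.0):
    """One SOM update on a 1-D map: pull the BMU and its neighbours toward x."""
    bmu = best_matching_unit(weights, x)
    for j, w in enumerate(weights):
        # Gaussian neighbourhood over map-index distance (assumed form).
        h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
        for k in range(len(w)):
            w[k] += lr * h * (x[k] - w[k])
    return bmu

# Three nodes with 2-D weight vectors, updated in place by one input sample.
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
bmu = som_step(weights, [0.9, 1.0])
```

Here the third node is the best match for the input, so it moves furthest toward the sample, while its neighbours move by a smaller amount scaled by the neighbourhood function.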