Abstract

Per-core local (scratchpad) memories allow direct inter-core communication, with latency and energy advantages over coherent cache-based communication, especially as CMP architectures become more distributed. A multicore FPGA platform with cache-integrated network interfaces (NIs) is presented, appropriate for scalable multicores, that combine the best of two worlds –the flexibility of caches (using implicit communication) and the efficiency of scratchpad memories (using explicit communication): on-chip SRAM is configurable shared among caching, scratchpad, and virtualized NI functions. The proposed system has been implemented in a four-core FPGA. Special hardware primitives (counter, queues) are used for the the communication and synchronization of the cores that are most suitable in network processing applications. The paper presents the performance evaluation of the proposed system in the domain of network processing. Two representatives benchmarks are used, one for header processing and one for payload processing. The system is evaluated in terms of performance and the communication overhead is measured. Furthermore, two approaches for the communication of the processors are evaluated and compared, common queue and distributed queues.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.