Abstract

Over the past few years, fiber bandwidth has increased faster than both processor clock frequency and memory access speed. To address this bottleneck, network processor design has relied on parallelizing tasks across heterogeneous processing elements and on dedicated high-speed hardware blocks for acceleration. However, these solutions either reduce the flexibility of network processors or become progressively more difficult to implement because of issues such as memory bandwidth and processor interconnection. We therefore analyze the fundamental components of a network processor, the processing engines (PEs). By examining the workload undertaken by a modern network processor, we determine the processing complexity of network applications and provide a detailed instruction mix analysis. Through simulation, our analysis finds that although the average processing cost per packet allows us to estimate the workload of a particular application, the varying nature of network processor tasks requires two methods of determining whether a particular function can be sustained. For tasks that operate independently of the packet length, using the packet header only, we find that the maximum processing cost encountered by any packet is the important metric. For tasks that require access to the packet payload, we find that the average packet cost does not present an efficient method of estimating instruction budgets, since the vast majority of packets deviate from the mean packet length. At an architectural level, our analysis of instruction mix and traces finds that, in addition to increasing parallelization, processing engine performance can be improved in a number of areas. First, we find that floating-point and multiply units are underutilized within network applications, with more cost-effective solutions such as shared resources better suited to the network processor design space. Secondly, the byte-wise nature of the processing and the high number of program variables hint at the need for a large register base. Finally, conditional operations within a network processor are found to be relatively simple nested loops and if/else bit tests; however, the high proportion of conditional branches found through simulation highlights a possible future bottleneck within network processor research.
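
The distinction between average and maximum per-packet cost can be illustrated with a small sketch. This is not the authors' simulation method: the clock rate, link rate, per-packet instruction counts, packet-length mix, and the linear payload cost model below are all hypothetical values chosen for illustration, and the sketch assumes back-to-back packet arrival on a single engine with no buffering.

```python
from statistics import mean

# All constants below are hypothetical, chosen only to illustrate the idea.
CLOCK_HZ = 1_000_000_000   # assumed engine clock
IPC = 1                    # assumed instructions per cycle
LINK_BPS = 1_000_000_000   # assumed line rate
MIN_PACKET = 40            # assumed smallest packet, in bytes


def instruction_budget(packet_bytes):
    """Instructions one engine can issue while a packet of this size arrives."""
    return CLOCK_HZ * IPC * packet_bytes * 8 / LINK_BPS


def header_task_fits(costs, budget):
    """Header-only task: the costliest packet must fit the budget."""
    return max(costs) <= budget


def payload_cost(fixed, per_byte, packet_bytes):
    """Assumed linear cost model for a task that touches the payload."""
    return fixed + per_byte * packet_bytes


def payload_task_fits(lengths, fixed, per_byte):
    """Payload task: check each observed length against its own budget."""
    return all(
        payload_cost(fixed, per_byte, n) <= instruction_budget(n)
        for n in lengths
    )


# Header-only task: cost does not depend on length, so the budget is set
# by the smallest packet, which leaves the least time per packet.
header_costs = [120, 150, 140, 400]          # hypothetical instruction counts
min_budget = instruction_budget(MIN_PACKET)
print("header, average fits:", mean(header_costs) <= min_budget)
print("header, maximum fits:", header_task_fits(header_costs, min_budget))

# Payload task: a bimodal length mix, so few packets sit near the mean.
lengths = [40, 40, 40, 1500, 1500]           # hypothetical packet sizes
mean_len = mean(lengths)
print("payload, mean-length packet fits:",
      payload_cost(200, 4, mean_len) <= instruction_budget(mean_len))
print("payload, every packet fits:", payload_task_fits(lengths, 200, 4))
```

Under these assumed numbers, both average-based checks pass while the per-packet checks fail, which is the kind of mismatch the two-method analysis is meant to expose.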

