Abstract

Enough specialized neural network processor chips, and systems built on them, now exist to reveal the trends in their development and, more importantly, their place in the overall evolution of supercomputer architectures and technologies. Low-precision number formats such as FP8, INT8, and BF16, which are acceptable in neural network computing, make it possible both to reach per-chip performance of 2015 TFLOPS in FP8 and 1008 TFLOPS in BF16 and to reduce the energy consumed by each multiplication. The low bit width has drawn attention to rounding errors: a number of chips extend the set of rounding modes beyond the generally accepted standard and allow the rounding mode to be set in software. The case for specialized neural processor chips also rests on their use of elements of structural programming, in which the computing structure is configured in software for the algorithm being executed. Accordingly, in addition to reduced bit width and support for sparse neural networks, computing systems based on the SambaNova SN30 RDU, Graphcore Colossus MK2 IPU, Untether AI Boqueria, AWS Trainium1, and Tesla Dojo D1 offer, to some extent, structural programming of computations. The sparsity of the processed data has led designers to abandon cache memory in favor of large on-chip scratchpad memory with increased bandwidth for moving data between memory and the arithmetic logic units, as well as between memory and the on-chip and inter-chip communication fabric. The resulting memory hierarchy therefore differs from the traditional cache-based one.
Thus, specialization for neural network algorithms has produced massively parallel system architectures that process low-precision data formats and handle memory accesses with poor temporal and spatial locality.
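
To make the point about rounding modes concrete, the following is a minimal Python sketch of how a rounding mode changes the result of narrowing a 32-bit float to BF16 (the top 16 bits of the binary32 pattern). It is an illustration only and does not reproduce the behavior of any of the chips named above. The helper names `f32_bits`, `bits_f32`, and `to_bf16`, and the mode names `truncate`, `nearest_even`, and `stochastic`, are assumptions introduced here. NaN handling is omitted.

```python
import random
import struct


def f32_bits(x: float) -> int:
    """Return the IEEE 754 binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_f32(b: int) -> float:
    """Interpret a 32-bit unsigned integer as an IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]


def to_bf16(x: float, mode: str = "nearest_even", rng=random) -> float:
    """Round x to BF16 precision and return the result as a Python float.

    Hypothetical helper for illustration; NaN inputs are not handled.
    """
    bits = f32_bits(x)
    high = bits >> 16      # sign, 8 exponent bits, 7 mantissa bits kept by BF16
    low = bits & 0xFFFF    # the 16 mantissa bits that BF16 discards

    if mode == "truncate":
        pass  # drop the low bits (round toward zero)
    elif mode == "nearest_even":
        # Round up in magnitude when above the midpoint, or exactly at the
        # midpoint with an odd kept mantissa (ties go to even).
        if low > 0x8000 or (low == 0x8000 and (high & 1)):
            high += 1
    elif mode == "stochastic":
        # Round up in magnitude with probability low / 2**16, so the
        # rounding error averages out to zero over many operations.
        if rng.randrange(0x10000) < low:
            high += 1
    else:
        raise ValueError(f"unknown rounding mode: {mode}")

    return bits_f32(high << 16)


if __name__ == "__main__":
    x = 1.0 + 2.0 ** -8  # exactly halfway between two adjacent BF16 values
    print(to_bf16(x, "truncate"), to_bf16(x, "nearest_even"))
    rng = random.Random(0)
    samples = [to_bf16(x, "stochastic", rng) for _ in range(20000)]
    print(sum(samples) / len(samples))  # close to x on average
```

In this sketch the deterministic modes always map the halfway value to the same neighbor, while stochastic rounding picks either neighbor and preserves the value only on average, which is one reason a software-selectable rounding mode can matter for low-precision accumulation.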
