Neural Network Operators Research Articles

Convolution is an important operation in neural networks that has received significant attention from researchers in recent years thanks to its ability to handle complex tasks such as image processing and computer vision efficiently. In general, the convolution operation in neural networks takes two matrices as inputs, an image matrix representing an image and a kernel matrix defining the required image processing operation, and performs a series of multiplications and additions among the elements of the two matrices. Realizing a circuit structure for matrix–matrix convolution is straightforward: each multiplication is realized by a multiplier and each addition by an adder. However, the resulting circuits have large area, high power consumption and long delay because of the large number of multiplications and additions involved in matrix–matrix convolution. While existing approaches focus on accelerating this computationally intensive task, they often do not guarantee minimality of area, power and delay. We show that there exist design aspects through which the circuit structures for convolution operations can be realized with less area, power and delay. To do this, we take the kernel definitions into account during the design of the circuit structures, since the kernel matrices are often (pre)determined by the desired application. Motivated by this, we first explore the design space of the convolution operation by introducing an alternative design scheme for realizing the operation between two matrices, keeping image processing and neural network applications in mind. Experimental evaluations confirm the potential benefits of the proposed design scheme and demonstrate that reductions in area and power of approximately [Formula: see text] and in critical path delay of approximately [Formula: see text] can be achieved. In addition, FPGA implementations of the proposed scheme show reductions of approximately [Formula: see text] and [Formula: see text] in the number of LUTs and the number of pins, respectively. Compared to prior works, the proposed scheme allows higher parallelism with minimum LUT utilization.
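
To make the multiply-and-add cost described in this abstract concrete, the following is a minimal Python sketch and not the authors' design scheme. It compares a generic sliding-window convolution, which spends one multiplication per kernel element, with a version specialized to a kernel known in advance: zero coefficients are skipped and coefficients of +1 or -1 are handled without a multiplier. The function names, the Sobel-style kernel, the 4x4 test image and the simplification that negation costs nothing are all assumptions made for illustration.

```python
def conv2d_generic(image, kernel):
    """Sliding-window convolution (no kernel flip, as is usual in neural networks).

    Returns the output matrix plus the number of multiplications and additions.
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    out = [[0] * ow for _ in range(oh)]
    mults = adds = 0
    for r in range(oh):
        for c in range(ow):
            # One multiplier per kernel element, regardless of its value.
            products = [image[r + i][c + j] * kernel[i][j]
                        for i in range(kh) for j in range(kw)]
            mults += len(products)
            adds += len(products) - 1  # adder chain over the products
            out[r][c] = sum(products)
    return out, mults, adds


def conv2d_fixed_kernel(image, kernel):
    """Same result, but exploits a kernel that is known ahead of time.

    Assumption for this sketch: zero taps need no hardware, and +1/-1 taps
    need no multiplier (the sign is folded into the adder/subtractor).
    """
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    taps = [(i, j, k) for i, row in enumerate(kernel)
            for j, k in enumerate(row) if k != 0]
    out = [[0] * ow for _ in range(oh)]
    mults = adds = 0
    for r in range(oh):
        for c in range(ow):
            terms = []
            for i, j, k in taps:
                px = image[r + i][c + j]
                if k == 1:
                    terms.append(px)
                elif k == -1:
                    terms.append(-px)
                else:
                    terms.append(px * k)
                    mults += 1
            adds += max(len(terms) - 1, 0)
            out[r][c] = sum(terms)
    return out, mults, adds


# Hypothetical inputs: a 4x4 ramp image and a Sobel-style horizontal kernel.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
sobel_x = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]

generic_out, g_mults, g_adds = conv2d_generic(image, sobel_x)
fixed_out, f_mults, f_adds = conv2d_fixed_kernel(image, sobel_x)
print(generic_out == fixed_out, (g_mults, g_adds), (f_mults, f_adds))
```

For this kernel and image both versions produce the same 2x2 output, while the operation count drops from 36 multiplications and 32 additions to 8 multiplications and 20 additions. The actual savings reported in the abstract come from the authors' circuit-level scheme, which this sketch does not reproduce.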

Graph neural networks (GNNs), which extend traditional neural networks to process graph-structured data, have been widely used in many fields. GNN computation mainly consists of edge processing, which generates messages by combining edge and vertex features, and vertex processing, which updates the vertex features with the aggregated messages. In addition to nontrivial vector operations in edge processing and a huge number of random accesses and neural network operations in vertex processing, the graph topology of GNNs may also vary during the computation (i.e., dynamic GNNs). These characteristics pose significant challenges to existing architectures. In this article, we propose a novel accelerator named CAMBRICON-G for efficient processing of both dynamic and static GNNs. The key idea of CAMBRICON-G is to abstract the irregular computation of a broad range of GNN variants into the processing of a regularly tiled adjacent cuboid (which extends the traditional adjacency matrix of the graph by adding the dimension of vertex features). The intuition is that the adjacent cuboid facilitates the exploitation of both data locality and parallelism by offering multidimensional multilevel tiling opportunities (including spatial and temporal tiling). To perform the multidimensional spatial tiling, the CAMBRICON-G architecture mainly consists of the cuboid engine (CE) and a hybrid on-chip memory. The CE has multiple vertex processing units (VPUs) that work in a coordinated manner to efficiently process the sparse data and dynamically update the graph topology with dedicated instructions. The hybrid on-chip memory contains a topology-aware cache and multiple scratchpad memories to reduce off-chip memory access. To perform the multidimensional temporal tiling, an easy-to-use programming model is provided to flexibly explore different tiling options for large graphs. Experimental results show that, compared with an Nvidia P100 GPU, performance and energy efficiency are improved by $7.14\times$ and $20.18\times$, respectively, on various GNNs, which validates both the versatility and the energy efficiency of CAMBRICON-G.
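
The two phases named in this abstract, edge processing and vertex processing, can be illustrated with a small Python sketch. It is a generic message-passing layer written under stated assumptions and does not model the CAMBRICON-G hardware, its adjacent cuboid or its tiling. The function names, the weighted-sum message, the sum aggregation with a ReLU update and the toy graph are all hypothetical choices.

```python
def edge_processing(edges, features):
    """Generate one message per edge by scaling the source-vertex
    features with the edge weight (an assumed, very simple combiner)."""
    return [(dst, [w * x for x in features[src]]) for src, dst, w in edges]


def vertex_processing(messages, features):
    """Sum the incoming messages of each vertex, add the vertex's own
    features, and apply a ReLU as the update function."""
    dim = len(next(iter(features.values())))
    agg = {v: [0.0] * dim for v in features}
    for dst, msg in messages:
        # Scattered accumulation: destinations arrive in edge order,
        # which is the source of the random accesses mentioned above.
        agg[dst] = [a + m for a, m in zip(agg[dst], msg)]
    return {v: [max(0.0, f + a) for f, a in zip(features[v], agg[v])]
            for v in features}


def gnn_layer(edges, features):
    return vertex_processing(edge_processing(edges, features), features)


# Toy graph: edges are (source, destination, weight) triples.
features = {0: [1.0, -2.0], 1: [0.5, 0.5], 2: [-1.0, 3.0]}
edges = [(0, 1, 1.0), (2, 1, 0.5), (1, 0, 2.0)]

out_static = gnn_layer(edges, features)

# A dynamic GNN may change the topology between steps; here one edge is added.
edges.append((0, 2, 1.0))
out_dynamic = gnn_layer(edges, features)

print(out_static)
print(out_dynamic)
```

Adding the edge from vertex 0 to vertex 2 changes only the result for vertex 2, which shows why a topology update forces the aggregation for the affected vertices to be recomputed.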

Articles published on Neural Network Operators

Near-Data Processing in Memory Expander for DNN Acceleration on GPUs

Medical imaging deep learning with differential privacy

RISC-V Virtual Platform-Based Convolutional Neural Network Accelerator Implemented in SystemC

A pipelining strategy for accelerating convolution neural networks on ARM CPUs

Design Space Exploration of Matrix–Matrix Convolution Operation

Combining max-pooling and wavelet pooling strategies for semantic image segmentation

The art of molecular computing: Whence and whither.

Demand Response Package Model of Electric Vehicle Charging Station Based on LSTM Neural Network and Optimal Operation of Distribution Network

End-point Temperature Preset of Molten Steel in the Final Refining Unit Based on an Integration of Deep Neural Network and Multi-process Operation Simulation

Approximation by exponential sampling type neural network operators

Predicting mean ribosome load for 5'UTR of any length using deep learning.

Fractional type multivariate neural network operators

Enhancing threshold neural network via suprathreshold stochastic resonance for pattern classification

Ferroelectric Field Effect Transistors as a Synapse for Neuromorphic Application

Model of an Artificial Neural Network for Solving the Problem of Controlling a Genetic Algorithm Using the Mathematical Apparatus of the Theory of Petri Nets

CRBA: A Competitive Rate-Based Algorithm Based on Competitive Spiking Neural Networks.

Radio-Frequency Multiply-and-Accumulate Operations with Spintronic Synapses

Cambricon-G: A Polyvalent Energy-Efficient Accelerator for Dynamic Graph Neural Networks

Research on the Role of Influencing Factors on Hotel Customer Satisfaction Based on BP Neural Network and Text Mining

Automatic Modulation Recognition Based on a DCN-BiLSTM Network.
