MNIST Research Articles

The adoption of transformer networks has experienced a notable surge in various AI applications. However, the increased computational complexity, stemming primarily from the self-attention mechanism, parallels the manner in which convolution operations constrain the capabilities and speed of convolutional neural networks (CNNs). The self-attention algorithm, specifically the matrix-matrix multiplication (MatMul) operations, demands a substantial amount of memory and computational complexity, thereby restricting the overall performance of the transformer. This paper introduces an efficient hardware accelerator for the transformer network, leveraging memristor-based in-memory computing. The design targets the memory bottleneck associated with MatMul operations in the self-attention process, utilizing approximate analog computation and the highly parallel computations facilitated by the memristor crossbar architecture. Remarkably, this approach resulted in a reduction of approximately 10 times in the number of multiply-accumulate (MAC) operations in transformer networks, while maintaining 95.47% accuracy for the MNIST dataset, as validated by a comprehensive circuit simulator employing NeuroSim 3.0. Simulation outcomes indicate an area utilization of 6895.7 μm2\\documentclass[12pt]{minimal} \\usepackage{amsmath} \\usepackage{wasysym} \\usepackage{amsfonts} \\usepackage{amssymb} \\usepackage{amsbsy} \\usepackage{mathrsfs} \\usepackage{upgreek} \\setlength{\\oddsidemargin}{-69pt} \\begin{document}$$\\mu m^2$$\\end{document}, a latency of 15.52 seconds, an energy consumption of 3 mJ, and a leakage power of 59.55 μW\\documentclass[12pt]{minimal} \\usepackage{amsmath} \\usepackage{wasysym} \\usepackage{amsfonts} \\usepackage{amssymb} \\usepackage{amsbsy} \\usepackage{mathrsfs} \\usepackage{upgreek} \\setlength{\\oddsidemargin}{-69pt} \\begin{document}$$\\mu W$$\\end{document}. The methodology outlined in this paper represents a substantial stride towards a hardware-friendly transformer architecture for edge devices, poised to achieve real-time performance.

Read full abstract

Due to the distributed nature of federated learning, it is vulnerable to poisoning attacks during the training process. The model’s resistance to poisoning attacks can be improved using robust aggregation algorithms. Current research on federated learning to resist poisoning attacks is mainly based on two settings: No trust or Byzantine robustness. However, both settings are not close enough to reality in practical scenarios. In many practical applications, some participants in federated learning are trustworthy. For example, participants who have participated in the training of this model before and performed very well, or participants with strong compliance and credibility such as governments and some national agencies participate in the training. In existing research, these trusted participants still have to accept the judgment of the aggregation node, which generates unnecessary computation, increases overhead, and does not take advantage of a trusted environment. Since there is no attack behavior on the trusted client, its training results are used to classify the trustworthiness of other untrusted clients and identify attack nodes with higher accuracy. Therefore, this paper proposes a robust federated learning algorithm for partially trusted environments. The proposed scheme uses the experimental results of trusted clients to judge the behavior of untrustworthy clients by the cosine similarity and the Local Outlier Factor and further identify and detect malicious clients. Experiments are performed on MNIST and CIFAR datasets. Comparison with other six aggregation algorithms under 30% attack scenario. And compared with the other four aggregation algorithms under 70% attack conditions. Our algorithm is more accurate than almost all of the other aggregation algorithms. The paper is the first to conduct robust research on federated learning in a partially trusted environment, and the proposed algorithm can more effectively resist poisoning attacks.

Read full abstract

MNIST Research Articles

Articles published on MNIST

Efficient memristor accelerator for transformer self-attention functionality

Classification with electromagnetic waves

State-of-the-Art Results with the Fashion-MNIST Dataset

Fractional-order spike-timing-dependent gradient descent for multi-layer spiking neural networks

Out-of-Distribution Detection with Memory-Augmented Variational Autoencoder

A new perspective on optimizers: leveraging moreau-yosida approximation in gradient-based learning

All-optical reconfigurable optical neural network chip based on wavelength division multiplexing

Mixed precision quantization of silicon optical neural network chip

A complex neural network model by Hilbert Transform

Intelligent Handwritten Identification Using Novel Hybrid Convolutional Neural Networks – Long-short-term Memory Architecture

VFLGAN: Vertical Federated Learning-based Generative Adversarial Network for Vertically Partitioned Data Publication

Asymmetric Spatiotemporal Online Learning for Deep Spiking Neural Networks

A robust federated learning algorithm for partially trusted environments

Multi-adversarial autoencoders: Stable, faster and self-adaptive representation learning

SecureEI: Proactive intellectual property protection of AI models for edge intelligence

Supervised structure learning

A multi‐dimensional incentive mechanism based on age of update in hierarchical federated learning

Development of a conditional variational autoencoder for handwritten digit recognition.

Comparative analysis of CNN models for handwritten digit recognition

Reconfigurable Five-In-One Carbon Nanotube Optoelectronic Transistor for Intelligent Computing and Communication.

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

MNIST Research Articles

Articles published on MNIST

Efficient memristor accelerator for transformer self-attention functionality

Classification with electromagnetic waves

State-of-the-Art Results with the Fashion-MNIST Dataset

Fractional-order spike-timing-dependent gradient descent for multi-layer spiking neural networks

Out-of-Distribution Detection with Memory-Augmented Variational Autoencoder

A new perspective on optimizers: leveraging moreau-yosida approximation in gradient-based learning

All-optical reconfigurable optical neural network chip based on wavelength division multiplexing

Mixed precision quantization of silicon optical neural network chip

A complex neural network model by Hilbert Transform

Intelligent Handwritten Identification Using Novel Hybrid Convolutional Neural Networks – Long-short-term Memory Architecture

VFLGAN: Vertical Federated Learning-based Generative Adversarial Network for Vertically Partitioned Data Publication

Asymmetric Spatiotemporal Online Learning for Deep Spiking Neural Networks

A robust federated learning algorithm for partially trusted environments

Multi-adversarial autoencoders: Stable, faster and self-adaptive representation learning

SecureEI: Proactive intellectual property protection of AI models for edge intelligence

Supervised structure learning

A multi‐dimensional incentive mechanism based on age of update in hierarchical federated learning

Development of a conditional variational autoencoder for handwritten digit recognition.

Comparative analysis of CNN models for handwritten digit recognition

Reconfigurable Five-In-One Carbon Nanotube Optoelectronic Transistor for Intelligent Computing and Communication.