Abstract

In recent years, real-valued neural networks have demonstrated promising, and often striking, results across a broad range of domains. This has driven a surge of applications utilizing high-dimensional datasets. While many techniques exist to alleviate issues of high-dimensionality, they all induce a cost in terms of network size or computational runtime. This work examines the use of quaternions, a form of hypercomplex numbers, in neural networks. The constructed networks demonstrate the ability of quaternions to encode high-dimensional data in an efficient neural network structure, showing that hypercomplex neural networks reduce the number of total trainable parameters compared to their real-valued equivalents. Finally, this work introduces a novel training algorithm using a meta-heuristic approach that bypasses the need for analytic quaternion loss or activation functions. This algorithm allows for a broader range of activation functions over current quaternion networks and presents a proof-of-concept for future work.
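As a rough, illustrative sketch of the parameter-count argument (not code from the paper), the snippet below represents a single quaternion weight as a 4x4 real matrix via the Hamilton product: one quaternion weight (4 trainable real parameters) realizes the same 4-input/4-output mapping that a fully connected real-valued layer would express with 16 parameters. All names and numbers here are hypothetical.

```python
import numpy as np

def hamilton_matrix(q):
    """Real 4x4 matrix representing left multiplication by the quaternion
    q = (r, x, y, z) under the Hamilton product."""
    r, x, y, z = q
    return np.array([
        [r, -x, -y, -z],
        [x,  r, -z,  y],
        [y,  z,  r, -x],
        [z, -y,  x,  r],
    ])

# A single quaternion weight (4 real parameters) maps a quaternion input
# (4 real values) to a quaternion output -- the same 4-in/4-out mapping
# a dense real-valued layer would need a full 4x4 weight matrix
# (16 real parameters) to realize.
q_weight = np.array([0.5, -0.1, 0.3, 0.2])   # 4 trainable parameters
x = np.array([1.0, 2.0, -1.0, 0.5])          # one quaternion input
y = hamilton_matrix(q_weight) @ x            # quaternion output

print(y)
print("real-valued parameters:", 4 * 4)      # 16
print("quaternion parameters: ", 4)          # 4
```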

Highlights

  • A perceptron is composed of several threshold logic units (TLUs), each of which takes a weighted sum of input values and uses the resulting sum as the input to a non-linear activation function (see the sketch after these highlights)

  • Examples of custom architectures include convolutional neural networks (CNNs) for processing image data, recurrent neural networks (RNNs) for processing sequence data, and generative adversarial networks (GANs) which have been used in recent years to create deep fakes and very convincing counterfeit data [8]

  • The function approximation task served as a proof-of-concept for the Quaternion Multilayer Perceptron (QMLP)-genetic algorithm (GA)
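
The first highlight describes the basic TLU computation. A minimal NumPy sketch of that weighted-sum-plus-activation step is given below; the function name, the example numbers, and the choice of tanh are illustrative assumptions rather than the paper's implementation.

```python
import numpy as np

def tlu(x, w, b, activation=np.tanh):
    """One threshold logic unit: a weighted sum of the inputs followed by a
    non-linear activation (tanh is used here purely for illustration)."""
    return activation(np.dot(w, x) + b)

# Hypothetical numbers, just to show the computation.
x = np.array([0.2, -1.0, 0.7])   # input values
w = np.array([0.5, 0.3, -0.8])   # one weight per input
b = 0.1                          # bias term

print(tlu(x, w, b))
```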

Summary

Neural Networks and Multi-Layer Perceptrons

Statistical learning processes have received increasing attention in recent years with the proliferation of large datasets, ever-increasing computing power, and simplified data exploration tools. Although each threshold logic unit (TLU) computes only a linear combination of the inputs based on the network weights, the non-linear activation function applied to that sum allows the perceptron to approximate a wide range of non-linear functions by adjusting the weights of each input. Cybenko [2] and Hornik et al. [3], working contemporaneously, independently showed that a network with a single hidden layer and sigmoidal activation functions can approximate any non-linear function to an arbitrary degree of accuracy. This network structure is called the multilayer perceptron (MLP), and it forms the most basic deep neural network (DNN). A representation of an MLP is shown in Figure 1, and [4] provides an overview of MLPs and other common neural network structures.
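
A minimal sketch of such a single-hidden-layer network is shown below, assuming sigmoidal hidden units and a linear output layer; the shapes and random weights are purely illustrative and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a single-hidden-layer MLP: the hidden layer applies
    sigmoidal activations to affine combinations of the inputs, and the
    output layer takes a linear combination of the hidden activations."""
    h = sigmoid(W1 @ x + b1)   # hidden layer (non-linear)
    return W2 @ h + b2         # output layer (linear readout)

# Hypothetical shapes: 2 inputs, 5 hidden units, 1 output.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(5, 2)), rng.normal(size=5)
W2, b2 = rng.normal(size=(1, 5)), rng.normal(size=1)

print(mlp_forward(np.array([0.3, -0.7]), W1, b1, W2, b2))
```

Training such a network amounts to adjusting W1, b1, W2, and b2 to minimize a loss over the data, typically via backpropagation, which is discussed in the following section.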

The Backpropagation Algorithm
Shortfalls
The Quaternions
Quaternion Algebra
Quaternionic Matrices
A Note on Quaternion Calculus and Quaternionic Analysis
Quaternion Neural Networks
Metaheuristic Optimization Techniques
Methodology
Test Functions
The Ackley Function
The Lorenz Attractor Chaotic System
Function Approximation
Chaotic Time Series Prediction
Quaternion Genetic Algorithm
Evaluation and Analysis Strategy
Results
Function Approximation Results
Time Series Prediction Results
Discussion