Abstract

Artificial Neural Networks (ANNs) have been applied to a wide range of complex datasets owing to their flexible mathematical architecture. In general, this flexibility grows with the number of connections and variables. However, over-parameterization of the ANN equations and the presence of redundant input variables usually result in poor test performance. This paper proposes a superstructure-based mixed-integer nonlinear programming (MINLP) method for the optimal structural design of multilayer perceptron (MLP) ANNs, covering neuron number selection, pruning, and input selection. In addition, the method uses statistical measures such as the parameter covariance matrix to increase test performance while permitting reduced training performance. The suggested approach was applied to the classification of two public hyperspectral datasets, namely Indian Pines and Pavia University, with 10% and 50% sampling ratios. The test results revealed promising performance compared to standard fully connected neural networks in terms of the estimated overall and individual class accuracies. With the proposed superstructural optimization, the fully connected networks were pruned by over 60% of their total connections, yielding a 4% accuracy increase for the 10% sampling ratio and a 1% accuracy decrease for the 50% sampling ratio. Moreover, over 20% of the spectral bands in the Indian Pines data and 30% in the Pavia University data were found statistically insignificant, and they were thus removed from the MLP networks. As a result, the proposed method proved effective in optimizing the architectural design with high generalization capability, particularly for smaller numbers of samples. Analysis of the eliminated spectral bands revealed that the algorithm mostly removed bands adjacent to the pre-eliminated noisy bands, as well as highly correlated bands carrying similar information.
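The superstructure idea described above can be illustrated with a minimal NumPy sketch: each candidate connection of a fully connected layer is paired with a binary existence variable, so setting that variable to zero prunes the connection. The mask variable `y`, the layer sizes, and the name `pruning_ratio` are illustrative assumptions, not the paper's notation; in the actual method these binaries are decision variables of the MINLP solver rather than random draws.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical superstructure for one MLP layer: every candidate connection
# (i, j) gets a binary existence variable y[i, j]; the effective weight is
# w[i, j] * y[i, j], so y[i, j] = 0 removes that connection from the network.
n_in, n_hidden = 8, 4
w = rng.normal(size=(n_in, n_hidden))          # continuous weight variables
y = rng.integers(0, 2, size=(n_in, n_hidden))  # binary connection variables

w_effective = w * y  # pruned weight matrix actually used by the layer

# Fraction of connections removed relative to the fully connected layer
pruning_ratio = 1.0 - y.mean()
print(f"connections kept: {y.sum()} of {y.size}")
print(f"pruning ratio: {pruning_ratio:.2f}")
```

In the paper's setting, the solver chooses `y` to trade training fit against the statistical significance of each parameter, which is how pruning ratios above 60% were reached.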

Highlights

  • Since the introduction of the perceptron by Rosenblatt in 1958 [1], numerous studies in almost all scientific fields have been conducted to apply neural network models and test their performance

  • In order to show the position of the eliminated inputs, mean spectral signatures of 15 land use/land cover (LULC) classes in the Indian Pines dataset were extracted from the ground reference, and the 53 bands eliminated for the 50% sampling ratio were marked on the figure with vertical lines. As a result, a considerable number of connections were removed from the network

  • This study investigates the optimal training of multi-layer perceptrons through formulating and solving a mixed-integer non-linear optimization problem, delivering a significant reduction in the number of network connections, neurons, and input variables


Summary

Introduction

Since the introduction of the perceptron by Rosenblatt in 1958 [1], numerous studies in almost all scientific fields have been conducted to apply neural network models and test their performance. The motivation of this study is to remove some interconnections or hidden-layer neurons to improve generalization, and to reduce the dimension of the input layer by eliminating the least effective and correlated spectral bands, thereby achieving improved performance. This is important for the processing of hyperspectral images, which comprise many correlated and sometimes irrelevant spectral bands for the problem under consideration. To the best of the authors' knowledge, this paper is the first application of such an optimal and robust ANN algorithm to the classification of remotely sensed imagery. In addition to this novel application concept, extra linking constraints are added to the formulation that force the optimization algorithm not to iterate over the continuous variables when certain binary variables are equal to zero, which in turn significantly decreases the computational load of the resulting mixed-integer ANN problems.
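The linking constraints mentioned above are typically written in big-M form: a weight is bounded by -M·y ≤ w ≤ M·y, so whenever its binary variable y is zero the weight is forced to zero and the solver can skip it. The sketch below only checks this feasibility condition; the bound `M`, the function name, and the sample matrices are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

# Hypothetical big-M linking constraint: -M * y_ij <= w_ij <= M * y_ij.
# When y_ij = 0, the weight w_ij is forced to 0, so the continuous variable
# drops out of the search space for that connection.
M = 10.0  # assumed bound on the weight magnitude

def linking_feasible(w, y, M=M):
    """Return True if (w, y) satisfies -M*y <= w <= M*y elementwise."""
    return bool(np.all(np.abs(w) <= M * y))

w = np.array([[0.0, 3.2], [-1.5, 0.0]])
y = np.array([[0, 1], [1, 0]])
print(linking_feasible(w, y))  # feasible: weights are zero wherever y == 0
```

A point violating the constraint, such as a nonzero weight on a removed connection, fails the same check, which is what lets the optimizer fix those weights at zero instead of iterating over them.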

Test Sites and Datasets
The Indian Pines Dataset
Findings
Conclusions
