Abstract

Deep neural networks have achieved high performance in image classification, image generation, voice recognition, natural language processing, etc.; however, they still confront several open challenges, such as the incremental learning problem, overfitting, hyperparameter optimization, and a lack of flexibility and multitasking. In this paper, we focus on the incremental learning problem, which concerns machine learning methodologies that continuously train an existing model with additional knowledge. To the best of our knowledge, the simplest and most direct solution to this challenge is to retrain the entire neural network after adding the new labels to the output layer. Alternatively, transfer learning can be applied, but only if the domain of the new labels is related to the domain of the labels on which the network has already been trained. In this paper, we propose a novel network architecture, namely the Brick Assembly Network (BAN), which allows a new label to be assembled into (or dismantled from) a trained neural network without retraining the entire network. In BAN, we train each label individually with a sub-network (i.e., a simple neural network) and then assemble the converged sub-networks, each trained for a single label, into a full neural network. For each label trained in a sub-network of BAN, we introduce a new loss function that minimizes the loss of the network using data from only one class. Applying one loss function per class label is unique and differs from standard neural network architectures (e.g., AlexNet, ResNet, InceptionV3), which use the values of a loss function computed over multiple labels to minimize the error of the network. The difference between the loss functions of previous approaches and the one we introduce is that we compute the loss values from the node values of the penultimate layer (which we name the characteristic layer) instead of the output layer, where the loss is computed between true labels and predicted labels. Experimental results on several benchmark datasets show that BAN has a strong capability of adding (and removing) labels to (and from) a trained network compared with a standard neural network and previous work.
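To make the assembly idea concrete, below is a minimal sketch (not the paper's exact formulation) of a BAN-style sub-network in PyTorch: each class's "brick" is trained on that class's data alone, with a loss computed on the characteristic (penultimate) layer rather than on output labels. The fixed all-ones target vector, the layer sizes, and the nearest-target scoring rule at assembly time are illustrative assumptions.

```python
# Illustrative sketch of a BAN-style per-class sub-network (PyTorch).
# The characteristic-layer target and the assembly/scoring rule are
# assumptions for illustration; the paper's exact loss is not shown here.
import torch
import torch.nn as nn

class SubNetwork(nn.Module):
    """One 'brick': trained on data from a single class only."""
    def __init__(self, in_dim=784, char_dim=32):
        super().__init__()
        self.features = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),
            nn.Linear(128, char_dim),          # characteristic layer
        )
        # Fixed per-class target for the characteristic layer (assumed).
        self.register_buffer("target", torch.ones(char_dim))

    def forward(self, x):
        return self.features(x)

    def loss(self, x):
        # Loss uses characteristic-layer activations, not output labels:
        # pull this class's activations toward the target vector.
        return ((self.forward(x) - self.target) ** 2).mean()

def train_brick(brick, loader, epochs=5, lr=1e-3):
    opt = torch.optim.Adam(brick.parameters(), lr=lr)
    for _ in range(epochs):
        for x in loader:                # loader yields ONE class's data
            opt.zero_grad()
            brick.loss(x).backward()
            opt.step()
    return brick

def assemble_predict(bricks, x):
    # Assemble converged bricks into a full network: the predicted class
    # is the brick whose characteristic layer best matches its target.
    errors = [((b(x) - b.target) ** 2).mean(dim=1) for b in bricks]
    return torch.stack(errors, dim=1).argmin(dim=1)

# Under this sketch, adding a new label means training one new brick and
# appending it to the list; removing a label means deleting its brick.
# No other brick is retrained.
```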

Highlights

  • Deep neural networks [1] have played an important role in many areas of the artificial intelligence field, such as image classification and object detection [2,3,4,5], image generation [6,7,8,9], speech recognition [10,11,12], text generation [13,14], etc.

  • The incremental learning problem is worth exploring because most neural network systems have a poor capability of adding new labels to their output layer after they have converged.

  • Roy et al. [17] have proposed a hierarchical deep convolutional neural network (TreeCNN) for solving the incremental learning problem by growing a trained network structure as new labels are added to the network.


Summary

Introduction

Deep neural networks [1] have played an important role in many areas of the artificial intelligence field, such as image classification and object detection [2,3,4,5], image generation [6,7,8,9], speech recognition [10,11,12], text generation [13,14], etc. The first solution to the incremental learning problem is to retrain the entire network after adding the new label to the output layer, which is time-consuming. To apply transfer learning to the image classification problem instead, we retain the convolutional layers of the neural network and retrain only its fully connected layers. Although this second solution is more efficient than the first, it has the restriction that the new label must come from a domain similar to that of the labels already trained in the network. To address these problems (i.e., the time-consuming nature of the retraining method and the domain restriction of the transfer learning method), we propose a novel network architecture, namely the brick assembly network (BAN). We release the implementation of our network architecture (our scripts are available at https://github.com/canboy123/ban).
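For comparison, here is a hedged sketch of the transfer-learning baseline described above: freeze the convolutional layers and retrain only the fully connected head. The ResNet-18 backbone and the single-layer head are assumptions for illustration, not choices taken from the paper.

```python
# Sketch of the transfer-learning baseline: keep (freeze) the
# convolutional layers, retrain only the fully connected layers.
# The ResNet-18 backbone is an illustrative assumption.
import torch.nn as nn
from torchvision import models

def make_transfer_model(num_classes):
    model = models.resnet18(weights="IMAGENET1K_V1")
    for p in model.parameters():        # freeze convolutional features
        p.requires_grad = False
    # Replace the fully connected head; only it will be trained.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

# Note the restriction discussed above: this baseline works well only
# when the new labels come from a domain similar to the pretrained one.
```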

Related Work
Preliminaries
Brick Assembly Network
Pseudo-Code of the Brick Assembly Network
Parametric Characteristic Layer
Experiment Settings
Experiment Results and Discussion
Single Dataset
Multiple Datasets
Summary
Conclusions