Abstract

Deep learning has been successfully used in various applications, including image classification, natural language processing, and game theory. At the heart of deep learning is the use of deep neural networks (deep nets for short) with specific structures to build the estimator. The depth and structure of deep nets are two crucial factors driving the development of deep learning. In this paper, we propose a novel tree structure to equip deep nets, to compensate for the capacity drawback of deep fully connected neural networks (DFCN) and to enhance the approximation ability of deep convolutional neural networks (DCNN). Based on an empirical risk minimization algorithm, we derive fast learning rates for deep nets.
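As a rough illustration of what a tree structure over a deep net can look like, the following minimal NumPy sketch wires each hidden unit to exactly two units of the layer below, so layer widths halve down a binary tree over the input coordinates. This is an illustrative sketch under our own assumptions, not the paper's exact construction; the names init_tree_net and forward, and the binary pairing of adjacent coordinates, are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def init_tree_net(d, rng):
    """Build weights for a binary-tree-structured net on inputs of
    dimension d (assumed a power of 2): layer k has d / 2**k units,
    and each unit is wired to exactly 2 units of the layer below, so
    the total parameter count is O(d) rather than the O(d^2) per
    layer of a fully connected net."""
    layers = []
    width = d
    while width > 1:
        # each of the width//2 units owns a 2-weight filter plus a bias
        W = rng.standard_normal((width // 2, 2))
        b = rng.standard_normal(width // 2)
        layers.append((W, b))
        width //= 2
    return layers

def forward(layers, x):
    """Evaluate the tree net: pair up adjacent activations and apply
    the unit that owns that pair, like merging the leaves of a tree."""
    h = x
    for W, b in layers:
        pairs = h.reshape(-1, 2)                  # (width//2, 2)
        h = relu(np.sum(W * pairs, axis=1) + b)   # (width//2,)
    return h[0]

rng = np.random.default_rng(0)
net = init_tree_net(d=8, rng=rng)
print(forward(net, rng.standard_normal(8)))  # scalar output
```

The point of the sparse, tree-shaped wiring is the capacity argument made in the paper: far fewer free parameters than DFCN of the same depth, which is what shrinks the covering number.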

Highlights

  • Deep learning [1], a learning strategy based on deep neural networks, has recently made significant breakthroughs in overcoming bottlenecks of classical learning schemes, such as support vector machines, random forests, and boosting algorithms, by demonstrating remarkable success in research areas such as computer vision [2], speech recognition [3], and game theory [4].

  • The necessity of depth has been rigorously verified from the viewpoints of approximation theory and representation theory, by showing the advantages of deep nets in localized approximation [6], sparse approximation in the frequency domain [7, 8], sparse approximation in the spatial domain [9], manifold learning [10, 11], capturing hierarchical structures [12, 13], realizing piecewise smoothness [14], universality with a bounded number of parameters [15, 16], and preserving rotation invariance [17].

  • We propose an appropriate structure to equip deep nets, combining the smaller variance offered by deep convolutional neural networks (DCNN) with the smaller bias, i.e., stronger approximation ability, of deep fully connected neural networks (DFCN).

Summary

INTRODUCTION

Deep learning [1], a learning strategy based on deep neural networks (deep nets), has recently made significant breakthroughs in overcoming bottlenecks of classical learning schemes, such as support vector machines, random forests, and boosting algorithms, by demonstrating remarkable success in research areas such as computer vision [2], speech recognition [3], and game theory [4]. In designing learning algorithms, DFCN offers strong approximation ability (small bias) but involves a large number of free parameters and may result in large variance. Equipping deep nets with an appropriate structure that reduces the number of parameters of DFCN while enhancing the approximation ability of DCNN therefore requires a desirable balance of bias and variance in the learning process. Deep nets with tree structures, revealed by our study, possess three theoretical advantages: (i) their capacity, as measured by the covering number, is much smaller than that of DFCN; (ii) their approximation capability is comparable with that of DFCN; and (iii) a fast learning rate is achieved by applying an empirical risk minimization (ERM) algorithm, sketched generically below.
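To make the bias-variance trade-off concrete, here is a generic least-squares ERM formulation together with the standard error split. The notation (the hypothesis class \(\mathcal{H}\), sample size \(m\), regression function \(f_\rho\), covering number \(\mathcal{N}\)) is our own, and the bound is a generic sketch of this style of analysis, not the paper's theorem.

```latex
% Least-squares ERM over a hypothesis class H of tree-structured
% deep nets, given a sample D = {(x_i, y_i)}_{i=1}^m:
f_D = \arg\min_{f \in \mathcal{H}} \frac{1}{m}
      \sum_{i=1}^{m} \bigl( f(x_i) - y_i \bigr)^2 .

% Generic bias-variance split: the approximation error (bias)
% shrinks as H grows, while the estimation error (variance) grows
% with the capacity of H, measured here by the covering number
% N(H, eps). Shrinking the covering number without hurting the
% approximation error is what the tree structure is claimed to do.
\mathbb{E}\,\| f_D - f_\rho \|^2 \;\lesssim\;
  \inf_{f \in \mathcal{H}} \| f - f_\rho \|^2
  \;+\; \frac{\log \mathcal{N}(\mathcal{H}, \varepsilon)}{m} .
```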

DEEP NETS WITH TREE STRUCTURES
ADVANTAGES OF DEEP NETS WITH TREE STRUCTURES
GENERALIZATION ERROR ESTIMATES FOR DEEP NETS
PROOFS OF MAIN RESULTS
CONCLUSION