Abstract

As an indispensable component of convolutional neural networks (CNNs), the loss function drives the parameter updates of CNN models during training. In general, different loss functions lead a CNN to learn different feature representations, and these representations can be treated as different knowledge learned from objects. In this paper, we introduce a novel training framework, Loss Transferring (LT), to improve the generalization ability of CNNs. LT consists of multiple training phases, each of which uses a different loss function. Under this framework, CNN models can combine different knowledge of objects by transferring the knowledge learned via one loss function to another. LT has two components: a loss function set and a training strategy. To build an appropriate loss function set, we establish two basic guides. Following these guides, we design a new loss function, the Near Classifier Hyper-Plane (N-CHP) loss, applied at the last layer of CNN models (the layer before the softmax operation); it drives the learned features of objects in the same category to have the minimum intra-class distance and to lie near the classifier hyper-plane. Based on the two loss function sets $\{\text{MSE, softmax}\}$ and $\{\text{N-CHP, softmax}\}$, we set up two specific training methods, $LT_{\text{MSE, softmax}}$ and $LT_{\text{N-CHP, softmax}}$, which can be applied universally to different CNN models at low additional computational cost. In addition, two training strategies, multi-phase strategy 1 and multi-phase strategy 2, are proposed to improve the training efficiency of LT.
Extensive experiments on shallow, moderate, and deep models with four benchmark datasets (MNIST, SVHN, CIFAR-10, and CIFAR-100) demonstrate that CNN models achieve clear performance improvements when trained with $LT_{\text{MSE, softmax}}$ and $LT_{\text{N-CHP, softmax}}$, which verifies the effectiveness of LT and of the two proposed basic guides.
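
The multi-phase idea behind $LT_{\text{MSE, softmax}}$ can be illustrated with a minimal sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: a tiny linear classifier stands in for a CNN, the MSE loss is applied to the pre-softmax outputs against one-hot targets, the MSE phase runs before the softmax phase, and the helper names (`loss_transfer`, `train_phase`, `make_toy_data`), epoch counts, and learning rate are all hypothetical. The only point it shows is that the parameters learned under one loss function are carried over as the starting point for training under the next.

```python
import math
import random

def logits(W, b, x):
    # Pre-softmax outputs of a linear model (a stand-in for a CNN's last layer).
    return [sum(w * v for w, v in zip(row, x)) + bk for row, bk in zip(W, b)]

def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def grad_mse(z, y):
    # Gradient w.r.t. the logits of the mean squared error to a one-hot target.
    # Assumption: MSE is applied to pre-softmax outputs.
    n = len(z)
    return [2.0 * (zk - (1.0 if k == y else 0.0)) / n for k, zk in enumerate(z)]

def grad_softmax_ce(z, y):
    # Gradient w.r.t. the logits of softmax cross-entropy: p - one_hot(y).
    return [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(softmax(z))]

def train_phase(W, b, data, grad_fn, epochs, lr):
    # One training phase: plain SGD under a single loss function, in place.
    for _ in range(epochs):
        for x, y in data:
            g = grad_fn(logits(W, b, x), y)
            for k, gk in enumerate(g):
                for j, xj in enumerate(x):
                    W[k][j] -= lr * gk * xj
                b[k] -= lr * gk

def loss_transfer(data, phases, n_classes, n_features, lr=0.05):
    # Run the phases in order; W and b are NOT reset between phases,
    # so what was learned under one loss is transferred to the next.
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for grad_fn, epochs in phases:
        train_phase(W, b, data, grad_fn, epochs, lr)
    return W, b

def accuracy(W, b, data):
    hits = 0
    for x, y in data:
        z = logits(W, b, x)
        hits += int(z.index(max(z)) == y)
    return hits / len(data)

def make_toy_data(n=100, seed=0):
    # Hypothetical toy data: two well-separated 2-D Gaussian blobs.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.randrange(2)
        cx = -2.0 if y == 0 else 2.0
        data.append(([cx + rng.gauss(0, 0.5), rng.gauss(0, 0.5)], y))
    return data

if __name__ == "__main__":
    data = make_toy_data()
    # Phase 1: MSE; phase 2: softmax cross-entropy (assumed ordering).
    W, b = loss_transfer(data, [(grad_mse, 5), (grad_softmax_ce, 5)], 2, 2)
    print("training accuracy:", accuracy(W, b, data))
```

In a real setting each phase would train a full CNN with a deep learning framework, and the phase schedule would follow one of the two multi-phase strategies, whose details are not given in the abstract and are therefore not modelled here.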
