Abstract

The Visual Geometry Group-16 (VGG16) network architecture, as part of the development of convolutional neural networks, has been popular among researchers for solving classification tasks. In this paper, we investigated the number of layers to find a configuration with better performance. We also proposed two pooling functions, Qmax and Qavg, inspired by existing research on mixed pooling functions. To assess the advantages of our method, we conducted several test scenarios. The first compared several modified network configurations, with VGG16 as the baseline, using both our pooling techniques and existing pooling functions. From the results of this first scenario, we selected the network that adapted best to our pooling techniques and then tested it on the Cifar10, Cifar100, TinyImageNet, and Street View House Numbers (SVHN) datasets as benchmarks, including comparisons with several existing methods. The experimental results showed that Net-E had the highest performance, with 93.90% accuracy for Cifar10, 71.17% for Cifar100, and 52.84% for TinyImageNet. However, accuracy was low on the SVHN dataset. In addition, in comparison tests of several optimization algorithms using the Qavg pooling function, the SGD optimization algorithm achieved the best accuracy, with 89.76% for Cifar10 and 89.06% for Cifar100.
