In the field of convolutional neural networks (CNNs), many impressive architectures have been published in recent years. These can be roughly divided into two groups: large-scale models and lightweight models. Large-scale models are characterized by many trainable weights and complex network structures, which make them highly effective in various computer vision tasks, and they have become essential components of many modern visual recognition systems. Lightweight CNNs are designed to maintain high performance with limited memory and computational resources. They are highly efficient in terms of inference time and resource utilization, which makes them particularly suitable for mobile and edge computing devices. This work focuses on some prominent models based on the ImageNet database and explores the reasons for the success of their frameworks. By analyzing these models, a trend can be identified in the development of CNN models: reasonably increasing the scale of the model and utilizing suitable frameworks can improve both accuracy and efficiency.