Abstract

Object detection and object captioning have become important tasks in various applications, especially in the domains of medical science, social networking, and surveillance. They have become part of an emerging field in this decade due to the abundant and diversified data available on the Internet. Convolutional neural network (CNN) is a major breakthrough in computer vision as it has special design architecture to detect complex features in data. It is a special type of neural network that has shown exemplary performance in image classification and segmentation competitions, too. Essentially, it is a deep learning algorithm that inputs an image, assigns weights and biases to various parts of the image, and is thereby able to perform image detection and classification. CNN reduces the image matrix with the use of filters and applies activation functions (ReLU, tanh, Softmax). Further, it applies a combination of convolutional layers along with pooling and strides to reduce image size and flatten the input for a fully connected network. CNN has the capability to capture spatial and temporal dependencies with the application of various filters. Abundant availability of data from various sources and improvement in hardware has accelerated research in CNN. It has been observed that the use of different optimization functions, the activation function, along with different architectural designs has generated a lot of scope in innovation and new ideas in CNN. Basically, we aim to introduce how CNNs work, an application using Python, and a systematic review of CNNs along with its latest research trends.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call