Abstract
Vision is the main sense through which human beings perceive and understand the objective world. Statistical data indicate that more than 60% of the external information humans obtain comes through the visual system, which makes vision the most important human sense and essential for gathering the information needed for survival. With the rapid growth of computer technology, image processing, pattern recognition, and related disciplines have been widely applied. Traditional image processing algorithms, however, have limitations when dealing with complex images. To address these limitations, scholars have proposed various new methods, most of them based on statistical models or artificial neural networks. Although these methods meet the requirements of modern computer vision systems for feature extraction algorithms with high accuracy, high speed, and low complexity, they still have many shortcomings. For example, many researchers have used different methods for feature extraction and segmentation to obtain better segmentation results. The scale-invariant feature transform (SIFT) is a local feature descriptor used in image processing; it is scale invariant and can detect key points in an image. Sparse coding is an unsupervised learning method that finds a set of overcomplete basis vectors to represent sample data more efficiently. Combining SIFT and sparse coding, this article proposes an image feature extraction algorithm based on visual information. The results showed that, under otherwise identical conditions, the X algorithm extracted features for different targets within 0.5 s and matched them within 1 s, with a correct matching rate above 90%. The Y algorithm extracted features for different targets within 2 s and matched them within 3 s, with a correct matching rate between 80% and 90%. The recognition performance of the X algorithm was therefore better than that of the Y algorithm, which indicates a positive relationship between visual information and the image feature extraction algorithm.
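To make the sparse coding step concrete, the following is a minimal sketch of how a single local descriptor might be encoded over an overcomplete dictionary. It is not the article's implementation: the abstract does not name a solver, so greedy matching pursuit is swapped in here as an assumption, and the function names (`normalize`, `matching_pursuit`), the toy 2-D descriptor, and the three-atom dictionary are all hypothetical. A standard SIFT descriptor has 128 dimensions, and in practice the dictionary would be learned from many descriptors rather than written by hand.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def matching_pursuit(descriptor, dictionary, n_iter=2):
    """Greedy sparse code of `descriptor` over unit-norm `dictionary` atoms.

    Returns one coefficient per atom. At most `n_iter` greedy steps are run,
    so at most `n_iter` coefficients are nonzero.
    """
    residual = list(descriptor)
    code = [0.0] * len(dictionary)
    for _ in range(n_iter):
        # Pick the atom most correlated with what is still unexplained.
        scores = [dot(residual, atom) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) < 1e-12:
            break  # residual is numerically zero; nothing left to explain
        code[best] += scores[best]
        # Subtract the chosen atom's contribution from the residual.
        residual = [r - scores[best] * a
                    for r, a in zip(residual, dictionary[best])]
    return code

# Hypothetical toy data: a 2-D "descriptor" and an overcomplete dictionary
# (3 atoms for 2 dimensions).
dictionary = [normalize(a) for a in ([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])]
descriptor = [2.0, 2.0]
code = matching_pursuit(descriptor, dictionary, n_iter=2)
```

In this toy case the descriptor lies along the third atom, so the code has a single nonzero coefficient. That is the sense in which an overcomplete basis can represent a sample more efficiently than the two axis-aligned atoms alone.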