The bag-of-visual-words model has been widely used in image retrieval and visual recognition. The most time-consuming step in building this representation is visual word generation, i.e., assigning local features to visual words in a high-dimensional space. Structures built on multi-branch trees and forests have recently been introduced to reduce this cost, but they cannot perform well without a large amount of backtracking. In this study, we show that the costly visual word generation process can be greatly accelerated, with little loss of accuracy, by exploiting the spatial correlation of local features. Since certain visual words frequently co-occur within particular structures, we can build a co-occurrence table for each visual word over a large data set. By associating each visual word with a probability according to its co-occurrence table, we can assign a probabilistic weight to each node of an index structure (e.g., a KD-tree or a K-means tree) and redirect the search path toward its global optimum within a small number of backtracks. We evaluate the proposed scheme on the Oxford data set against the fast library for approximate nearest neighbours (FLANN) and randomised KD-trees; extensive experimental results demonstrate its effectiveness and efficiency.

We also present FeatureMatch, a generalised framework for approximate nearest-neighbour field (ANNF) computation between a source and a target image. The proposed approach can compute ANNF maps between any pair of images, including unrelated ones. This generalisation is achieved through appropriate spatial-range transforms: global colour adaptation is applied to the source image as the range transform. Image patches from the two images are approximated by low-dimensional features, which are combined with a KD-tree for ANNF estimation. The resulting ANNF map is further refined using image coherency and spatial transforms. The proposed generalisation lets the ANNF framework handle a broader range of vision applications than was previously possible.
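To make the co-occurrence-weighted backtracking idea concrete, here is a minimal sketch in Python. Everything in it is illustrative rather than the paper's implementation: the hand-rolled KD-tree, the `prior` dict mapping visual-word indices to co-occurrence probabilities, and the choice to divide each postponed branch's plane distance by the best prior beneath it are all assumptions.

```python
import heapq
import numpy as np

class Node:
    def __init__(self, words, dim=0, split=0.0, left=None, right=None):
        self.words = words            # visual-word indices under this node
        self.dim, self.split = dim, split
        self.left, self.right = left, right

def build(centers, words, depth=0):
    """KD-tree over vocabulary centres; each leaf holds one word index."""
    if len(words) == 1:
        return Node(words)
    dim = depth % centers.shape[1]
    order = sorted(words, key=lambda i: centers[i, dim])
    mid = len(order) // 2
    return Node(words, dim, centers[order[mid], dim],
                build(centers, order[:mid], depth + 1),
                build(centers, order[mid:], depth + 1))

def assign(root, centers, feature, prior, max_backtracks=8):
    """Best-bin-first search; a postponed branch's priority is its distance
    to the splitting plane divided by the best co-occurrence prior beneath
    it, so probable words are revisited first and few backtracks suffice."""
    heap, tick = [(0.0, 0, root)], 1
    best, best_d, backtracks = None, np.inf, 0
    while heap and backtracks <= max_backtracks:
        _, _, node = heapq.heappop(heap)
        while node.left is not None:                  # descend to a leaf
            d = feature[node.dim] - node.split
            near, far = (node.left, node.right) if d < 0 else (node.right, node.left)
            w = max(prior.get(i, 1e-6) for i in far.words)
            heapq.heappush(heap, (abs(d) / w, tick, far))
            tick += 1
            node = near
        d = float(np.linalg.norm(centers[node.words[0]] - feature))
        if d < best_d:
            best, best_d = node.words[0], d
        backtracks += 1
    return best

# Toy usage: 64 visual words in 16-D; words 3, 17 and 42 get high priors
# because (hypothetically) they co-occur with words already assigned.
rng = np.random.default_rng(0)
centers = rng.standard_normal((64, 16))
tree = build(centers, list(range(64)))
prior = {3: 0.5, 17: 0.3, 42: 0.2}
word = assign(tree, centers, rng.standard_normal(16), prior)
```

Dividing by the prior means branches leading to words that frequently co-occur with those already assigned in the image are popped from the queue earlier, which is one simple way to bias a limited backtracking budget toward the likely global optimum.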
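Similarly, a rough sketch of a FeatureMatch-style ANNF computation, assuming greyscale images stored as 2-D NumPy arrays; the mean/std matching used here as the range transform and the four cheap patch statistics used as low-dimensional features are stand-ins for the richer colour adaptation and features described above.

```python
import numpy as np
from scipy.spatial import cKDTree

def patch_features(img, p=8):
    """Describe every p x p patch by four cheap statistics (mean, std and
    average x/y gradient magnitudes) instead of its raw pixels."""
    H, W = img.shape
    feats, pos = [], []
    for y in range(H - p + 1):
        for x in range(W - p + 1):
            patch = img[y:y + p, x:x + p].astype(np.float64)
            gx = np.abs(np.diff(patch, axis=1)).mean()
            gy = np.abs(np.diff(patch, axis=0)).mean()
            feats.append((patch.mean(), patch.std(), gx, gy))
            pos.append((y, x))
    return np.asarray(feats), np.asarray(pos)

def annf(source, target, p=8):
    """For each source patch, return the (y, x) position of its approximate
    nearest target patch, found with a KD-tree in the reduced feature space."""
    # Range transform: a crude global colour adaptation via mean/std matching.
    src = (source.astype(np.float64) - source.mean()) / (source.std() + 1e-8)
    src = src * target.std() + target.mean()
    f_src, pos_src = patch_features(src, p)
    f_tgt, pos_tgt = patch_features(target.astype(np.float64), p)
    tree = cKDTree(f_tgt)              # index target patches once
    _, idx = tree.query(f_src, k=1)    # one approximate match per source patch
    return pos_src, pos_tgt[idx]
```

The low-dimensional features are what keep the KD-tree query cheap: each source patch is matched in the reduced space, and the resulting map can then be refined with coherency and spatial transforms as described above.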