Content-based image retrieval (CBIR) using sparse visual features is a challenging research problem: images must be categorized into semantically meaningful classes, and various robust feature representation methods have been proposed to bridge the semantic gap. In this article, we propose a novel robust image representation for CBIR based on the complementary integration of visual words from speeded-up robust features (SURF) and co-occurrence histograms of oriented gradients (CoHOG). SURF is a local feature descriptor, whereas CoHOG is a global one. Local features perform better when images from different semantic classes have visually similar content, while global features perform better for large-scale image retrieval. To ensure retrieval accuracy, the proposed method builds two smaller dictionaries, one of SURF visual words and one of CoHOG visual words, which are then assimilated into a single larger dictionary. In a CBIR system, a smaller dictionary yields better sensitivity, while a larger dictionary yields better specificity. A larger dictionary, however, also introduces an overfitting problem that degrades CBIR performance; we address this by combining the proposed method with linear discriminant analysis (LDA) and relevance feedback. The comparative performance analysis of the proposed method is carried out against competitor CBIR methods, namely bag-of-visual-words (BoVW) feature integration of the SURF and CoHOG descriptors and the single-feature SURF–BoVW and CoHOG–BoVW methods. Quantitative and qualitative analysis on four standard image databases demonstrates the robust performance of the proposed method compared with these competitors and other recent CBIR methods.
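The two-dictionary integration described above can be sketched as follows. This is an illustrative toy example, not the paper's implementation: random arrays stand in for SURF (assumed 64-D local) and CoHOG (assumed single global) descriptors, a minimal NumPy k-means builds each small dictionary, and the per-image BoVW histograms over the two dictionaries are concatenated to form the representation over the merged, larger dictionary. The dictionary sizes and dimensionalities are assumptions for illustration only.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Minimal k-means: returns k cluster centers (the 'visual words')."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign each descriptor to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers

def bovw_histogram(descriptors, dictionary):
    """Quantize descriptors against a visual-word dictionary; count hits per word."""
    dists = np.linalg.norm(descriptors[:, None, :] - dictionary[None, :, :], axis=2)
    return np.bincount(dists.argmin(axis=1), minlength=len(dictionary))

rng = np.random.default_rng(42)
# stand-ins for training descriptors pooled over a database (hypothetical sizes)
train_local = rng.normal(size=(500, 64))    # SURF-like local descriptors (64-D)
train_global = rng.normal(size=(500, 30))   # CoHOG-like global descriptors

# two smaller dictionaries, one per descriptor type
dict_local = kmeans(train_local, k=32)
dict_global = kmeans(train_global, k=32)

# per-image representation: concatenated histograms = one larger merged dictionary
img_local = rng.normal(size=(80, 64))       # 80 local descriptors from one image
img_global = rng.normal(size=(1, 30))       # 1 global descriptor from the same image
representation = np.concatenate([
    bovw_histogram(img_local, dict_local),
    bovw_histogram(img_global, dict_global),
])
```

The final `representation` has one bin per word of the merged dictionary (32 + 32 = 64 here); the local half sums to the number of local descriptors detected in the image.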