Search Results Clustering (SRC) is a well-known problem in information retrieval: the web snippets returned for a given query are clustered according to some similarity or dissimilarity measure. In this study, we pose SRC as a multi-view clustering problem and solve it from an optimization point of view. Several views based on syntactic and semantic similarity measures are considered while performing the clustering. In contrast to existing algorithms, our framework incorporates three new views that capture semantics, based on word mover's distance, textual entailment, and the universal sentence encoder. Different quality measures computed on the clusters generated by the different views are optimized simultaneously using a multi-objective binary differential evolution (MBDE) framework. MBDE maintains a set of solutions, and each solution is composed of two parts corresponding to different views. An agreement index that checks the accordance between the partitionings of different views is also optimized to obtain a consensus partitioning. The proposed approach detects the number of clusters for any query automatically. Experiments are performed on three benchmark multi-view datasets corresponding to web search results and are evaluated using the well-known F-measure. The results show that our approach outperforms state-of-the-art techniques.
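To make the idea of agreement between the partitionings of two views concrete, the following is a minimal sketch. It assumes a Rand-index-style pairwise agreement, which is not necessarily the agreement index used in this work. The function name `pairwise_agreement` and the toy label lists are hypothetical.

```python
from itertools import combinations


def pairwise_agreement(labels_a, labels_b):
    """Fraction of snippet pairs on which two partitionings agree.

    Assumption: a Rand-index-style measure used purely for illustration;
    the agreement index in the paper may be defined differently.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitionings must label the same snippets")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # zero or one snippet: trivially in agreement
    agree = 0
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]  # together in view A?
        same_b = labels_b[i] == labels_b[j]  # together in view B?
        if same_a == same_b:
            agree += 1
    return agree / len(pairs)


# Hypothetical cluster labels for four snippets under two views.
view_syntactic = [0, 0, 1, 1]
view_semantic = [1, 1, 0, 2]
score = pairwise_agreement(view_syntactic, view_semantic)
```

Under this assumed measure, a score of 1.0 means the two views induce the same grouping of snippets regardless of how the cluster labels are named, so maximizing it pushes the views toward a consensus partitioning.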