Abstract

The internet is one of the fastest-growing technologies, allowing users to search for information, communicate and collaborate, network socially, promote businesses, and much more. There are over 1 billion sites on the World Wide Web<sup>1</sup>. It is therefore important to understand the user's query and return the top results that are most closely related to it. In Search Result Clustering, results returned by one or more search engines are clustered into meaningful groups. In this study, the Search Result Clustering problem is modelled as a multi-view clustering problem and solved using an optimization framework. Experiments have been performed on various combinations of four views: word mover's distance, BERT, TF-IDF, and Universal Sentence Encoder. A multi-objective binary differential evolution (MBDE) approach is used to simultaneously optimise multiple quality measures computed on the clusters produced by the different views. We consider a set of candidate solutions, where each solution encodes the partitioning information obtained using the different views. An agreement index that checks the consistency of the partitionings across views is also optimised. To obtain a single ensemble partitioning from the multiple views, a consensus partitioning technique and generative modelling are explored. The proposed model can automatically detect the number of clusters for any query. On three widely used datasets, our approach outperforms state-of-the-art techniques by 3.33% in F1-score.
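
To make the idea of an agreement index across views more concrete, the following is a minimal sketch. It assumes a pair-counting measure (the Rand index) averaged over all pairs of views. The abstract does not define the index actually used, so the function names `pair_agreement` and `mean_view_agreement`, the toy labels, and the choice of measure are illustrative assumptions rather than the authors' formulation.

```python
from itertools import combinations


def pair_agreement(labels_a, labels_b):
    """Rand index: fraction of item pairs that both partitionings treat the
    same way (both put the pair together, or both keep it apart).
    Assumed stand-in for the agreement index; not the paper's definition."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitionings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # fewer than two items: trivially consistent
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def mean_view_agreement(view_labels):
    """Average pairwise agreement over every pair of views."""
    view_pairs = list(combinations(view_labels, 2))
    if not view_pairs:
        return 1.0  # a single view always agrees with itself
    return sum(pair_agreement(a, b) for a, b in view_pairs) / len(view_pairs)


# Hypothetical cluster labels for four search results under three views.
tfidf_view = [0, 0, 1, 1]
bert_view = [1, 1, 0, 0]  # same grouping as tfidf_view, different label names
use_view = [0, 0, 0, 1]

score = mean_view_agreement([tfidf_view, bert_view, use_view])
```

Because the measure only compares which pairs of results are grouped together, it is insensitive to how clusters are numbered in each view. In an optimisation setting such as the one described above, a score like this could serve as one objective alongside per-view cluster quality measures.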
