FEED2SEARCH: a framework for hybrid-molecule based semantic search

Nathalie Charbel, Christian Sallaberry, Sebastien Laborie, Richard Chbeir

doi:10.1080/03081079.2023.2195173

Abstract

Adopting semantic technologies has proven beneficial for representing data more effectively and for enabling reasoning over it. However, there are still unresolved issues, such as the shift from heterogeneous documents to semantic data models and the representation of search results. Thus, in this paper, we introduce a novel FramEwork for hybrid molEcule-baseD SEmantic SEARCH, entitled FEED2SEARCH, which facilitates Information Retrieval over a heterogeneous document corpus. We first propose a semantic representation of the corpus, which automatically generates a semantic graph covering both structural and domain-specific aspects. Then, we propose a query processing pipeline based on a novel data structure for query answers, extracted from this graph, which embeds core information together with structure-based and domain-specific context. This provides users with interpretable search results, helping them understand relevant information and track cross-document dependencies. A set of experiments conducted using real-world construction projects from the Architecture, Engineering and Construction (AEC) industry shows promising results, which motivates us to further investigate the effectiveness of our proposal in other domains.

Full Text