Abstract
Autonomous, distributed repositories of digital documents are maintained and managed independently, in accordance with each organization's business needs. Documents that contain the same information may be represented differently in different repositories, which makes the desired information hard to retrieve. The information explosion calls for efficient techniques to find the relevant information in the haystack of online digital documents, whether their structures are homogeneous or heterogeneous. Keyword-based information retrieval techniques help improve the recall of user query results but have low precision. To improve precision, we adopt an ontology-based semantic technique for retrieving information from digital documents, and we maintain a dynamic, evolving domain ontology to accommodate the retrieved information. For search, we use a thematic similarity approach to enhance the precision of the results. We propose a comprehensive architecture for semantic information retrieval and search. Plain text is read semantically, and the extracted metadata is stored for later use in answering user queries. A triple-centric technique is used both to maintain source metadata (in case of a system crash) and to probe user queries to capture the context of their keywords. The precision and recall results of the semantic information retrieval and annotation technique are very promising. Semantic search using the thematic similarity approach achieves better precision and recall than previous keyword-based search techniques.
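To make the triple-centric idea concrete, the following is a minimal sketch, not the system described in this paper. It assumes metadata is stored as (subject, predicate, object) triples, that query keywords are expanded into themes by looking them up in those triples, and that thematic similarity can be approximated by Jaccard overlap between query themes and document themes. The predicate names (`hasTopic`, `mayMean`, `relatedTo`), the sample data, and all function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical metadata triples (subject, predicate, object); illustrative data only.
TRIPLES = [
    ("doc1", "hasTopic", "jaguar_animal"),
    ("doc1", "hasTopic", "wildlife"),
    ("doc2", "hasTopic", "jaguar_car"),
    ("doc2", "hasTopic", "automobile"),
    ("jaguar", "mayMean", "jaguar_animal"),
    ("jaguar", "mayMean", "jaguar_car"),
    ("habitat", "relatedTo", "wildlife"),
    ("engine", "relatedTo", "automobile"),
]


def build_index(triples):
    """Index triples by (subject, predicate) so objects can be looked up quickly."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].add(obj)
    return index


def query_themes(keywords, index):
    """Expand query keywords into candidate themes using the stored triples."""
    themes = set()
    for keyword in keywords:
        themes |= index.get((keyword, "mayMean"), set())
        themes |= index.get((keyword, "relatedTo"), set())
    return themes


def jaccard(a, b):
    """Jaccard overlap, used here as an assumed stand-in for thematic similarity."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank(keywords, triples):
    """Return document ids ordered by descending overlap with the query themes."""
    index = build_index(triples)
    themes = query_themes(keywords, index)
    docs = {s for s, p, _ in triples if p == "hasTopic"}
    scored = [(jaccard(themes, index[(d, "hasTopic")]), d) for d in docs]
    # Sort by score (highest first), breaking ties by document id.
    return [d for _, d in sorted(scored, key=lambda item: (-item[0], item[1]))]
```

In this sketch, calling `rank(["jaguar", "habitat"], TRIPLES)` places the wildlife document ahead of the automobile one, because the second keyword supplies the context that a plain keyword match on "jaguar" lacks.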