Abstract

In the big data era, data and information processing is a common concern across diverse fields. To make this processing both efficient and intelligent, it is necessary to discover, define, and build the potential links among heterogeneous data. Focusing on this issue, this paper proposes a knowledge-driven method for calculating the semantic similarity between words in a bilingual (English-Chinese) setting. The method is built on the knowledge base “HowNet”, which defines and maintains the “atom taxonomy tree” and the “semantic dictionary”, a knowledge network that describes the relationships between word concepts and the attributes of those concepts. Compared with other knowledge bases, HowNet pays more attention to concept-based connections between words. In addition, the proposed method analyzes concepts more completely and is more convenient to compute. The non-relational database MongoDB is employed to improve efficiency and to make full use of the rich knowledge maintained in HowNet. Taking into account both the structure of HowNet and the characteristics of MongoDB, a set of equations is defined to calculate the semantic similarity.
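
The abstract does not state the equations themselves, so the following is only a minimal sketch of how a taxonomy-tree similarity might be computed, not the paper's method. It assumes a distance-based form, sim = alpha / (d + alpha), where d is the path length between two nodes in the tree. The tree entries, the function names, and the value alpha = 1.6 are all illustrative assumptions, and a plain dictionary stands in for the parent links that a MongoDB collection could hold.

```python
# Toy taxonomy tree as child -> parent links.
# These entries are hypothetical and are NOT real HowNet data.
PARENT = {
    "entity": None,
    "thing": "entity",
    "animate": "thing",
    "human": "animate",
    "animal": "animate",
    "artifact": "thing",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def tree_distance(a, b):
    """Number of edges on the path between a and b through their lowest common ancestor."""
    steps_from_a = {n: i for i, n in enumerate(path_to_root(a))}
    for steps_from_b, n in enumerate(path_to_root(b)):
        if n in steps_from_a:
            return steps_from_a[n] + steps_from_b
    raise ValueError("nodes share no common ancestor")


def node_similarity(a, b, alpha=1.6):
    """Assumed distance-based similarity: alpha / (distance + alpha)."""
    return alpha / (tree_distance(a, b) + alpha)


if __name__ == "__main__":
    for pair in [("human", "human"), ("human", "animal"), ("human", "artifact")]:
        print(pair, round(node_similarity(*pair), 3))
```

Under this assumed form, identical nodes score 1 and the score decays as the nodes move further apart in the tree. A word-level similarity would then have to combine such node-level scores across the concepts of each word, which this sketch does not attempt.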
