Abstract

ObjectiveTo propose a new vector-based relatedness metric that derives word vectors from the intrinsic structure of biomedical ontologies, without consulting external resources such as large-scale biomedical corpora. Materials and MethodsSNOMED CT on the mapping layer of UMLS was used as a testbed ontology. Vectors were created for every concept at the end of all semantic relations—attribute–value relations and descendants as well as is_a relation—of the defining concept. The cosine similarity between the averages of those vectors with respect to each defining concept was computed to produce a final semantic relatedness. ResultsTwo benchmark sets that include a total of 62 biomedical term pairs were used for evaluation. Spearman’s rank coefficient of the current method was 0.655, 0.744, and 0.742 with the relatedness rated by physicians, coders, and medical experts, respectively. The proposed method was comparable to a word-embedding method and outperformed path-based, information content-based, and another multiple relation-based relatedness metrics. DiscussionThe current study demonstrated that the addition of attribute relations to the is_a hierarchy of SNOMED CT better conforms to the human sense of relatedness than models based on taxonomic relations. The current approach also showed that it is robust to the design inconsistency of ontologies. ConclusionUnlike the previous vector-based approach, the current study exploited the intrinsic semantic structure of an ontology, precluding the need for external textual resources to obtain context information of defining terms. Future research is recommended to prove the validity of the current method with other biomedical ontologies.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call