Semantic Similarity Measures Research Articles

Purpose: The purpose of this paper is to merge ontologies in a way that removes redundancy and improves storage efficiency. The number of ontologies developed in recent decades is very high. These ontologies make the needed information easy to obtain, but the presence of many comparable yet varied ontologies leads to rework and to the problem of merging data. An assessment of the existing ontologies reveals superfluous information, hence the need for ontology merging. Existing ontology merging methods focus only on highly relevant classes and instances and simply drop somewhat relevant ones, even though these may also be useful or relevant to the given domain. In this paper, we propose a new method, hybrid semantic similarity measure (HSSM)-based ontology merging, which uses formal concept analysis (FCA) and a semantic similarity measure.

Design/methodology/approach: The HSSM categorizes classes and instances by relevancy into three groups: highly relevant, moderately relevant and least relevant. To achieve high efficiency in merging, HSSM performs both an FCA part and a semantic similarity part.

Findings: The experimental results showed that the HSSM produced better results than existing algorithms in terms of similarity distance and time. An inconsistency check can also be done for the dissimilar classes and instances within an ontology. The output ontology contains the highly relevant and moderately relevant classes and instances as well as a few least relevant ones, which eventually leads to an exhaustive ontology for the particular domain.

Practical implications: In this paper, an HSSM method is proposed and used to merge academic social network ontologies; it is observed to be an extremely powerful methodology compared with earlier studies. The HSSM approach can be applied to various domain ontologies and may offer researchers a new perspective.

Originality/value: To the authors' knowledge, the HSSM has not been applied to ontology merging in any earlier study.
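
The three-way relevancy split described in the abstract can be pictured with a small sketch. This is only an illustration, not the authors' HSSM: the abstract gives no threshold values or scoring formula, so the cut-offs `HIGH_CUTOFF` and `MODERATE_CUTOFF`, the function `categorize`, and the example scores are all hypothetical.

```python
from typing import Dict, List

# Hypothetical cut-offs: the abstract does not state any threshold values.
HIGH_CUTOFF = 0.8
MODERATE_CUTOFF = 0.5


def categorize(scores: Dict[str, float]) -> Dict[str, List[str]]:
    """Group class/instance names into three relevancy buckets by score."""
    buckets: Dict[str, List[str]] = {"high": [], "moderate": [], "least": []}
    for name, score in scores.items():
        if score >= HIGH_CUTOFF:
            buckets["high"].append(name)
        elif score >= MODERATE_CUTOFF:
            buckets["moderate"].append(name)
        else:
            buckets["least"].append(name)
    return buckets


# Made-up similarity scores for three ontology classes.
scores = {"Author": 0.93, "Researcher": 0.64, "Venue": 0.21}
buckets = categorize(scores)
print(buckets)
```

A merger that keeps only the "high" bucket corresponds to the existing methods the abstract criticizes; retaining the "moderate" bucket and part of the "least" bucket corresponds to the behaviour the abstract attributes to HSSM.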

Background: Biological knowledge, and therefore the Gene Ontology (GO) annotation sets, for human genes is incomplete. Recent studies have reported that biases in available GO annotations result in biased estimates of functional similarities of genes, but it is still unclear what the effect of incompleteness itself may be, even in the absence of bias. Pairwise gene similarities are used in a number of contexts, including gene “functional similarity” clustering and the related problem of functional ontology structure inference, but it is not known how different similarity measures or clustering methods perform on this task, or how the clusters are affected by annotation completeness.

Results: We developed representations of both “complete” and “incomplete” GO annotation datasets based on experimentally supported annotations from the GO database, specifically designed to model the incompleteness of human gene annotations, and computed semantic similarities for each set using a variety of published measures. We then assessed the clusters derived from these measures using two clustering methods: hierarchical clustering and the CliXO algorithm. We find that the CliXO algorithm, combined with appropriate measures, performs better than hierarchical clustering in reconstructing GO both when the data are complete and when they are incomplete. Some measures, particularly those that create a pairwise gene similarity by averaging over all pairwise annotation similarities, had consistently poor performance, while a few measures, such as Lin best-matched average and Relevance maximum, were generally among the best performers across a broad range of annotation completeness and types of GO classes. Finally, we show that for semantic similarity-based clustering, the multicellular organism process branch of the GO biological process ontology is more challenging to represent than the cellular process branch.

Conclusions: We assessed the effects of annotation completeness on the distribution of pairwise gene semantic similarity scores, and the subsequent effects on the clusters derived from these scores. Our results suggest combinations of semantic similarity measures, gene-level scoring methods and clustering methods that perform best for functional gene clustering using annotation sets of varying completeness. Overall, our results underscore the importance of increasing the completeness of GO annotations for supporting computational analyses of gene function.
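
The two gene-level scoring styles contrasted in the abstract, averaging over all pairwise annotation similarities versus a best-matched average, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's code: the term names `t1`, `t2`, `t3` and the similarity table `TOY_SIM` are made up and stand in for a real term-level measure such as Lin, and the best-matched average shown is the common variant that averages the mean of row maxima with the mean of column maxima.

```python
from itertools import product
from typing import Callable, Sequence

# Hypothetical term-term similarities standing in for a real measure such as Lin.
TOY_SIM = {frozenset(("t2", "t3")): 0.5}


def toy_sim(x: str, y: str) -> float:
    """Toy term similarity: 1.0 for identical terms, table lookup otherwise."""
    if x == y:
        return 1.0
    return TOY_SIM.get(frozenset((x, y)), 0.0)


def all_pairs_average(a: Sequence[str], b: Sequence[str],
                      sim: Callable[[str, str], float]) -> float:
    """Mean similarity over every (term of a, term of b) pair."""
    return sum(sim(x, y) for x, y in product(a, b)) / (len(a) * len(b))


def best_match_average(a: Sequence[str], b: Sequence[str],
                       sim: Callable[[str, str], float]) -> float:
    """Average of the mean best match for a's terms and for b's terms."""
    row_best = [max(sim(x, y) for y in b) for x in a]
    col_best = [max(sim(x, y) for x in a) for y in b]
    return (sum(row_best) / len(row_best) + sum(col_best) / len(col_best)) / 2


gene_a = ["t1", "t2"]  # made-up annotation sets
gene_b = ["t1", "t3"]

print(all_pairs_average(gene_a, gene_b, toy_sim))   # 0.375
print(best_match_average(gene_a, gene_b, toy_sim))  # 0.75
```

One property visible in this toy setup: comparing `gene_a` with itself gives an all-pairs average of 0.5 but a best-matched average of 1.0, because the all-pairs mean is pulled down by every pair of unrelated terms. This may be one intuition for the poor performance the abstract reports for all-pairs averaging, although the abstract itself does not give a reason.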

Articles published on Semantic Similarity Measures

A reproducible survey on word embeddings and ontology-based methods for word similarity: Linear combinations outperform the state of the art

Combining and learning word embedding with WordNet for semantic relatedness and similarity measurement

A word-embedding-based approach for accurate identification of corresponding activities

A novel method for merging academic social network ontologies using formal concept analysis and hybrid semantic similarity measure

Effective spatio-temporal semantic trajectory generation for similar pattern group identification

A New Hybrid Improved Method for Measuring Concept Semantic Similarity in WordNet

Crime base: Towards building a knowledge base for crime entities and their relationships from online news papers

Leveraging synonymy and polysemy to improve semantic similarity assessments based on intrinsic information content

Toward semantic similarity measure between concepts in an ontology

Semantic similarity aggregators for very short textual expressions: a case study on landmarks and points of interest

Story co-segmentation of Chinese broadcast news using weakly-supervised semantic similarity

Multi-domain semantic similarity in biomedical research

English semantic feature production norms: An extended database of 4436 concepts.

Automatic design of semantic similarity controllers based on fuzzy logics

In silico markers: an evolutionary and statistical approach to select informative genes of human breast cancer subtypes.

A review on semantic similarity measures for ontology

Predicting disease-related phenotypes using an integrated phenotype similarity measurement based on HPO

GO functional similarity clustering depends on similarity measure, clustering method, and annotation completeness

Semantic textual similarity between sentences using bilingual word semantics

Semantic similarity measures for formal concept analysis using linked data and WordNet
