Abstract

Background: Knowledge graphs can represent the contents of biomedical literature and databases as subject-predicate-object triples, thereby enabling comprehensive analyses that identify e.g. relationships between diseases. Some diseases are often diagnosed in patients in specific temporal sequences, which are referred to as disease trajectories. Here, we determine whether a sequence of two diseases forms a trajectory by leveraging the predicate information from paths between (disease) proteins in a knowledge graph. Furthermore, we determine the added value of directional information of predicates for this task. To do so, we create four feature sets, based on two methods for representing indirect paths, and both with and without directional information of predicates (i.e., which protein is considered subject and which object). The added value of the directional information of predicates is quantified by comparing the classification performance of the feature sets that include or exclude it.

Results: Our method achieved a maximum area under the ROC curve of 89.8% and 74.5% when evaluated with two different reference sets. Use of directional information of predicates significantly improved performance by 6.5 and 2.0 percentage points respectively.

Conclusions: Our work demonstrates that predicates between proteins can be used to identify disease trajectories. Using the directional information of predicates significantly improved performance over not using this information.

Highlights

  • Knowledge graphs can represent the contents of biomedical literature and databases as subject-predicate-object triples, thereby enabling comprehensive analyses that identify e.g. relationships between diseases

  • We have recently shown that analyses performed on protein knowledge graphs benefit from predicate information [13]

  • Extracted paths: in total, 6859 distinct disease proteins were assigned to the diseases in both reference sets, three of which could not be mapped to the Euretos Knowledge Platform (EKP)

Summary

Introduction

Knowledge graphs can represent the contents of biomedical literature and databases as subject-predicate-object triples, thereby enabling comprehensive analyses that identify e.g. relationships between diseases. We determine whether a sequence of two diseases forms a trajectory by leveraging the predicate information from paths between (disease) proteins in a knowledge graph. We create four feature sets, based on two methods for representing indirect paths, each with and without directional information of predicates (i.e., which protein is considered the subject and which the object). Much research has been performed with knowledge graphs that consist only of proteins, commonly referred to as protein-protein interaction networks. We have recently shown that analyses performed on protein knowledge graphs benefit from predicate information [13].

Vlietstra et al. Journal of Biomedical Semantics (2020) 11:9
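The difference between feature sets with and without directional information of predicates can be illustrated with a minimal sketch. It is not the authors' implementation: the toy triples, the predicate names, the function `predicate_features`, and the `A->B` / `B->A` labels are all hypothetical, and the sketch covers only direct paths between the proteins of two diseases, not the two representations of indirect paths mentioned above.

```python
from collections import Counter

# Hypothetical toy triples (subject, predicate, object); not data from the paper.
TRIPLES = [
    ("P1", "activates", "P2"),
    ("P2", "activates", "P1"),
    ("P3", "inhibits", "P1"),
]


def predicate_features(triples, proteins_a, proteins_b, directional):
    """Count predicates on direct edges between two sets of disease proteins.

    With directional=True, each predicate is tagged with whether its subject
    is a protein of disease A ("A->B") or of disease B ("B->A").
    With directional=False, both orientations collapse into one feature.
    """
    counts = Counter()
    for subj, pred, obj in triples:
        if subj in proteins_a and obj in proteins_b:
            key = f"{pred}|A->B" if directional else pred
        elif subj in proteins_b and obj in proteins_a:
            key = f"{pred}|B->A" if directional else pred
        else:
            continue  # edge does not connect the two diseases
        counts[key] += 1
    return counts


# Assumed protein assignments for two hypothetical diseases A and B.
disease_a = {"P1"}
disease_b = {"P2", "P3"}

with_direction = predicate_features(TRIPLES, disease_a, disease_b, directional=True)
without_direction = predicate_features(TRIPLES, disease_a, disease_b, directional=False)
print(dict(with_direction))
print(dict(without_direction))
```

In this toy example the two `activates` edges become separate features when direction is kept and a single feature with count 2 when it is dropped, which is the kind of information whose added value the comparison of feature sets is meant to quantify.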

