Abstract

Deep phenotyping is the study of genotype-phenotype associations and the history of human illness through the analysis of phenotypic abnormalities. Investigating the association between phenotype and genotype is therefore important, and machine learning approaches are well suited to predicting associations between abnormal human phenotypes and genes. A novel machine learning framework is proposed to predict links between Human Phenotype Ontology (HPO) terms and genes. Human phenotype-gene associations are parsed from Orphanet's annotations. The node2vec algorithm generates embeddings for the nodes (HPO terms and genes): it samples nodes from the graph using random walks and learns features from the sampled nodes. Supervised classifiers then use these embeddings to predict links between nodes. The gradient boosting decision tree model (LightGBM) performed best, achieving an AUROC of 0.904, an AUCPR of 0.784, and a weighted F1 score of 0.87. Among the classifiers evaluated, LightGBM identified links between human phenotypes and genes most accurately.
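The random-walk sampling step mentioned in the abstract can be illustrated with a minimal sketch. It covers only node2vec-style biased walk sampling on a toy phenotype-gene graph; it does not include the embedding training or the LightGBM classifier, and it is not the authors' implementation. The node names (`HPO_a`, `GENE_x`, and so on), the function names, and the parameter values are hypothetical and chosen purely for illustration.

```python
import random

def build_graph(edges):
    """Build undirected adjacency sets from (phenotype, gene) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def biased_walk(adj, start, length, p, q, rng):
    """One second-order random walk in the style of node2vec.

    p is the return parameter and q the in-out parameter: a step back to
    the previous node has weight 1/p, a step to a node adjacent to the
    previous node has weight 1, and any other step has weight 1/q.
    """
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])  # sorted for reproducibility
        if not nbrs:
            break
        if len(walk) == 1:
            # No previous node yet, so the first step is uniform.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)
            elif nxt in adj[prev]:
                weights.append(1.0)
            else:
                weights.append(1.0 / q)
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

def sample_walks(adj, num_walks, length, p=1.0, q=1.0, seed=0):
    """Start num_walks walks from every node, in shuffled order."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    walks = []
    for _ in range(num_walks):
        rng.shuffle(nodes)
        for node in nodes:
            walks.append(biased_walk(adj, node, length, p, q, rng))
    return walks

# Hypothetical toy associations (placeholders, not real Orphanet data).
edges = [
    ("HPO_a", "GENE_x"),
    ("HPO_a", "GENE_y"),
    ("HPO_b", "GENE_y"),
    ("HPO_c", "GENE_z"),
    ("HPO_b", "GENE_z"),
]
adj = build_graph(edges)
walks = sample_walks(adj, num_walks=2, length=5, p=1.0, q=0.5, seed=1)
```

In a full pipeline, walks like these would typically be passed to a skip-gram model to learn one vector per node, and the vectors of a phenotype-gene pair would be combined into a feature vector for a supervised link classifier.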

