Abstract

The field of biomedicine sees new discoveries every day, and these are well documented in the biomedical literature. Diabetes is one of the most common chronic diseases, affecting people around the globe. It is known to be a hereditary disease caused by gene mutations. Biological experiments such as Genome-Wide Association Studies (GWAS) have been carried out to identify the gene mutations responsible for a disease. However, they do not provide information about how strongly these genes are associated with other genes. In this work, to enable knowledge extraction, we identified entities of interest by designing a domain-specific Named Entity Recognition (NER) system, sought the most suitable NER model for the biomedical domain, and then proceeded to identify similarity among genes. We determined the strength of the gene-gene interactions that cause diabetes and found new potential causal genes for different types of diabetes by means of matrix construction, and we measured the degree of similarity among genes with a newly proposed Deviation from Mean similarity-finding algorithm. To find a suitable NER model, we used three classifiers: Conditional Random Field (CRF), Support Vector Machine (SVM), and Naive Bayes. We trained the models with supervised machine learning, measured the performance of each model with the F-measure, and achieved higher accuracy in diabetes prediction.
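
The F-measure used to compare the NER classifiers can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function name `f_measure`, the token-level scoring, and the `GENE`/`O` labels are assumptions made only for this example. It computes precision and recall for one entity label and combines them into the standard F-beta score (F1 when `beta` is 1).

```python
def f_measure(gold, predicted, positive_label, beta=1.0):
    """Token-level F-measure for one label (hypothetical helper, for illustration)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    # True positives: the label was predicted and is correct.
    tp = sum(1 for g, p in pairs if g == positive_label and p == positive_label)
    # False positives: the label was predicted but the gold tag differs.
    fp = sum(1 for g, p in pairs if g != positive_label and p == positive_label)
    # False negatives: the gold tag is the label but it was not predicted.
    fn = sum(1 for g, p in pairs if g == positive_label and p != positive_label)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0

    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up tags: "GENE" marks a gene mention, "O" marks any other token.
gold_tags = ["GENE", "O", "GENE", "O", "GENE"]
predicted_tags = ["GENE", "GENE", "O", "O", "GENE"]
score = f_measure(gold_tags, predicted_tags, positive_label="GENE")
```

Scoring each classifier's output against the same gold annotations in this way gives one comparable number per model; published NER evaluations often score whole entity spans instead of individual tokens, which this sketch does not attempt.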
