Information Extraction of Compound-Protein Interaction from Scientific Paper using Machine Learning

Aulia Afriza, Muhammas Rheza Muztahid, Wisnu Ananta Kusuma, Annisa Annisa

doi:10.18517/ijaseit.12.2.13748

Open Access

https://doi.org/10.18517/ijaseit.12.2.13748

Abstract

Drug-target interaction (DTI) identification is an important step in drug discovery that aims to find compounds useful in treatment. DTI findings are mostly recorded in databases and in the scientific literature. Retrieving DTI information from the literature requires a method such as information extraction. The abstracts of research papers contain many compound sentences. This study uses regular expressions to identify compound sentences, text mining for information extraction, and Bernoulli naive Bayes for classification. The dataset is a collection of 3,000 abstracts, split into 29,363 sentences. Sentences parsed by the regular expression are matched using pattern matching and then passed through text pre-processing. The pre-processed sentences are used as the training dataset. We use 10-fold cross-validation to evaluate the model. The best average accuracy was 0.72 using naive Bayes without regular expressions for compound sentences and 0.76 using naive Bayes with regular expressions for single sentences. Furthermore, by applying feature selection to the compound-sentence data, we obtained an accuracy of 0.731 for the model without regular expressions and 0.7644 for the model with regular expressions.
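
The pipeline described in the abstract (regular-expression handling of compound sentences, binary text features, Bernoulli naive Bayes, k-fold cross-validation) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the conjunction pattern `COMPOUND_RE`, the helper names, and the toy sentences in the usage note are all assumptions made for illustration, and feature selection is left out.

```python
import math
import re

# Hypothetical pattern: split at a semicolon or a coordinating conjunction.
# The paper's actual regular expressions are not given in the abstract.
COMPOUND_RE = re.compile(r"\s*;\s*|,?\s+(?:and|but|whereas|while)\s+", re.IGNORECASE)

def split_compound(sentence):
    """Split a sentence into clauses at the assumed conjunction pattern."""
    parts = [p.strip(" .") for p in COMPOUND_RE.split(sentence)]
    return [p for p in parts if p]

def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class BernoulliNB:
    """Bernoulli naive Bayes over word-presence features with Laplace smoothing."""

    def fit(self, docs, labels, alpha=1.0):
        self.vocab = sorted(set().union(*docs))
        self.classes = sorted(set(labels))
        self.log_prior, self.p = {}, {}
        for c in self.classes:
            class_docs = [d for d, y in zip(docs, labels) if y == c]
            self.log_prior[c] = math.log(len(class_docs) / len(labels))
            # Smoothed probability that word w is present in a document of class c.
            self.p[c] = {
                w: (sum(w in d for d in class_docs) + alpha) / (len(class_docs) + 2 * alpha)
                for w in self.vocab
            }
        return self

    def predict(self, doc):
        def score(c):
            # Bernoulli model: absent vocabulary words also contribute evidence.
            s = self.log_prior[c]
            for w in self.vocab:
                pw = self.p[c][w]
                s += math.log(pw if w in doc else 1.0 - pw)
            return s
        return max(self.classes, key=score)

def cross_val_accuracy(docs, labels, k=10):
    """Mean accuracy over k folds; fold i holds out items i, i+k, i+2k, ..."""
    accs = []
    for i in range(k):
        test_idx = set(range(i, len(docs), k))
        train = [(d, y) for j, (d, y) in enumerate(zip(docs, labels)) if j not in test_idx]
        model = BernoulliNB().fit([d for d, _ in train], [y for _, y in train])
        correct = sum(model.predict(docs[j]) == labels[j] for j in test_idx)
        accs.append(correct / len(test_idx))
    return sum(accs) / k
```

As a usage example, `split_compound("Aspirin inhibits COX-1, and ibuprofen binds COX-2.")` yields two clauses that can each be passed to `tokenize` and then classified. The pattern is deliberately naive: it also splits at conjunctions that join nouns rather than clauses, so a real system would need more careful rules. `cross_val_accuracy` assumes at least `k` labelled examples and that every training fold contains each class.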


Journal: International Journal on Advanced Science, Engineering and Information Technology
Publication Date: Apr 7, 2022
Citations: 1
License type: cc-by-sa

Similar Papers

Identification of drug-target interactions via multiple information integration
Yijie Ding ... Fei Guo
Information Sciences | VOL. 418-419
12 Aug 2017

P.523 Use of psychoactive drugs in paediatric population in Catalonia
M Fradera ... X Goldberg
European Neuropsychopharmacology | VOL. 29
01 Dec 2019

Ensemble Learning Models for Drug Target Interaction Prediction
Fahmida Minna K ... Maya Mohan
09 May 2022

A Drug-Side Effect Context-Sensitive Network approach for drug target prediction.
Mengshi Zhou ... Yang Chen
Bioinformatics (Oxford, England) | VOL. 35
14 Nov 2018
