Abstract

The objective of this project is to apply a binary classification model based on the Logistic Regression algorithm to the analysis of scientific articles. First, information was collected in a spreadsheet from various scientific repositories, including the Institute of Electrical and Electronics Engineers (IEEE) and the American Society of Civil Engineers (ASCE), among others. The data were then filtered manually to remove entries that could not be accessed in the repositories, along with other problems that could affect the development of the project. The remaining records were labeled in separate columns with values of 0 and 1, which produced the variables needed for a binary classification that suited the requirements of the project. After this stage, Python was selected as the programming language, and Google Colab was used so that the team could work together more efficiently. Various preprocessing techniques were applied to refine the information and prepare it for later use. Once these changes were made, the variables to be used in the analysis were selected and the requirements for a sound analysis were established. The results indicate that the model was adequate for working with this information and produced very satisfactory results.
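The labeling and classification steps described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not say which library or variables were used, so this version implements logistic regression with plain gradient descent in standard Python, and the two 0/1 feature columns, the toy rows, and the function names are all hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_samples = len(X)
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias)
            err = p - yi  # derivative of the log-loss w.r.t. the linear score
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias, threshold=0.5):
    """Return a 0/1 label for each row using the given probability threshold."""
    return [
        1 if sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) >= threshold else 0
        for xi in X
    ]


# Hypothetical toy data (an assumption, not the authors' dataset):
# each row is one article described by two made-up 0/1 indicator columns,
# and y is the 0/1 target label assigned during manual labeling.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0, 0], [1, 1]]
y = [0, 0, 1, 1, 0, 1]

weights, bias = train_logistic_regression(X, y)
predictions = predict(X, weights, bias)
accuracy = sum(int(p == t) for p, t in zip(predictions, y)) / len(y)
print("predictions:", predictions)
print("training accuracy:", accuracy)
```

In a shared notebook a library implementation would be the more likely choice than a hand-written loop; the explicit version is shown only because it makes the 0/1 labels, the sigmoid, and the decision threshold visible in one place.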
