Abstract

Universitas Kristen Satya Wacana (UKSW) needs to identify the research competence of its faculty members at both the study program and the University level. Meeting this need requires a fast, automated analysis of research output and publications. Research articles are scattered across many publisher systems and journals, which may be reputable or non-reputable, accredited or unaccredited. We wrote a program that quickly and efficiently retrieves publication titles recorded in Google Scholar using a machine learning algorithm. The results are displayed as a word cloud, so that dominant and frequent words stand out in the visualization. To determine which scientific terms to display, we used a modified version of the word cloud Python module and an unmodified Term Frequency - Inverse Document Frequency (TF-IDF) library. The algorithm was tested on the publication titles of our study program at UKSW, and the results were confirmed directly. The system can produce a word cloud visualization for an individual faculty member, for the faculty members of a study program, or for the University as a whole. We have not yet differentiated publication sources by whether they are reputable or not, which might affect the accuracy of the competence identification.
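The term-weighting step described above can be illustrated with a minimal sketch. It is not the authors' code: it assumes a smoothed TF-IDF formula, a simple lowercase tokenizer, and a few hypothetical sample titles in place of titles retrieved from Google Scholar. The function names `tokenize` and `tfidf_weights` are illustrative only.

```python
import math
import re
from collections import Counter

# Hypothetical sample titles; real input would be titles retrieved from Google Scholar.
titles = [
    "Word cloud visualization of research titles",
    "Machine learning for text classification",
    "Text classification of research publications",
]

def tokenize(title):
    """Lowercase a title and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+", title.lower())

def tfidf_weights(titles):
    """Sum a smoothed TF-IDF score per term over all titles."""
    docs = [tokenize(t) for t in titles]
    n_docs = len(docs)
    # Document frequency: number of titles that contain each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    weights = Counter()
    for doc in docs:
        tf = Counter(doc)
        for term, count in tf.items():
            # Smoothed inverse document frequency (an assumption of this sketch).
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1
            weights[term] += (count / len(doc)) * idf
    return dict(weights)

weights = tfidf_weights(titles)
# Terms with the largest summed weight would be drawn largest in the cloud.
top_terms = sorted(weights, key=weights.get, reverse=True)[:5]
```

A dictionary of this shape could then be handed to a word cloud renderer; for example, the `wordcloud` Python package accepts term weights through `WordCloud.generate_from_frequencies`. Stop-word removal is omitted here for brevity.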

Highlights

  • Efforts to collect data on research results at a university often run into difficulties because the data are not documented in an integrated manner

  • In future research, titles of articles in higher-reputation journals could be weighted more heavily than those in lower-category journals, so that the word cloud gives visual dominance to titles from reputable sources. This study demonstrated how to build a word cloud of Universitas Kristen Satya Wacana (UKSW) lecturer research data from the lecturers' article titles documented on Google Scholar

  • This was done because efforts to collect data on UKSW research results often ran into difficulties, as the data were not documented in an integrated manner
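The reputation weighting proposed for future work could look roughly like the sketch below. It is only an illustration under stated assumptions: the category names follow the abstract, but the numeric weights in `REPUTATION_WEIGHT`, the helper `weighted_term_counts`, and the sample records are all hypothetical and do not come from the study.

```python
import re
from collections import Counter

# Hypothetical weights per source category; the actual values would need to be chosen.
REPUTATION_WEIGHT = {
    "reputable": 2.0,
    "accredited": 1.5,
    "unaccredited": 1.0,
}

def weighted_term_counts(records):
    """Count title terms, scaling each occurrence by its source's weight.

    `records` is a list of (title, category) pairs; unknown categories
    fall back to a weight of 1.0.
    """
    counts = Counter()
    for title, category in records:
        weight = REPUTATION_WEIGHT.get(category, 1.0)
        for term in re.findall(r"[a-z]+", title.lower()):
            counts[term] += weight
    return dict(counts)

# Hypothetical sample records.
records = [
    ("Deep learning", "reputable"),
    ("Learning analytics", "unaccredited"),
]
counts = weighted_term_counts(records)
```

With weights like these, a term that appears in a reputable source contributes more to its final size in the cloud than the same term from a lower-category source.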

Summary

Introduction

Efforts to collect data on research results at a university often run into difficulties because the data are not documented in an integrated manner. Several classification techniques have been compared for their accuracy on text data, but their results are still not easy to read visually. For this reason, this study used a word cloud to present the results in a more readable form; other authors have taken the same approach when analyzing text with Latent Dirichlet Allocation [2]. The word cloud algorithm has also been studied for collecting spam and non-spam emails [9], where incoming email is classified as spam or not spam. Such a working approach is called 'learning by reminder'. Such a system can correctly predict labels for unseen emails.
