Abstract

E-prints UMSIDA is the repository of student and lecturer publications at the Muhammadiyah University of Sidoarjo (UMSIDA). The collection is still unorganized, and search matches only keywords in document titles. As the culture of writing and research grows, the number of documents available as literature keeps increasing. Documents in E-prints are currently grouped by subjects that the repository manager defines and that the admin who uploads each document assigns. Documents can instead be grouped automatically by their contents using an Information Retrieval (IR) approach. Each document is first tokenised to obtain tokens, and the tokens are then stemmed to obtain the stem of each word. The stems are indexed and weighted to produce sentence indexes. The index results, stored in the database, become the document variables that serve as the features of each document. Finally, the indexes of all documents are grouped by an automatic clustering technique, the K-Means Genetic Algorithm.
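
The pipeline described above (tokenisation, stemming, weighting, clustering) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the abstract does not name the weighting scheme, so TF-IDF is used here; `stem` is a toy English suffix stripper standing in for a real stemmer; and the clustering step is plain K-Means with a fixed number of clusters and a deterministic start, without the genetic algorithm component named in the abstract. All function names and sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(token):
    """Toy suffix stripper (assumption: stands in for a real stemmer)."""
    for suffix in ("ing", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tfidf_vectors(docs):
    """Return the vocabulary and one TF-IDF vector per document."""
    stemmed = [[stem(t) for t in tokenize(d)] for d in docs]
    vocab = sorted({t for doc in stemmed for t in doc})
    n_docs = len(docs)
    # Document frequency: in how many documents each stem occurs.
    df = Counter(t for doc in stemmed for t in set(doc))
    vectors = []
    for doc in stemmed:
        tf = Counter(doc)
        length = max(len(doc), 1)  # guard against empty documents
        vectors.append(
            [(tf[t] / length) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Plain K-Means; the first k vectors are the initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each document to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its assigned documents.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


if __name__ == "__main__":
    # Hypothetical sample titles, not taken from the repository.
    docs = [
        "clustering documents with genetic algorithms",
        "teaching methods for primary school students",
        "genetic algorithms for clustering text documents",
        "primary school teaching and student learning",
    ]
    _, vectors = tfidf_vectors(docs)
    print(kmeans(vectors, k=2))
```

In this sketch the TF-IDF vectors play the role of the stored document index, and the cluster labels are the automatic grouping. A genetic algorithm could plausibly be layered on top, for example to search for good initial centroids or a suitable number of clusters, but the abstract does not specify how the two are combined.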

