Abstract

Today, consumers face an abundance of information on the internet, which makes it hard for them to find the information they need. One reasonable solution is information filtering. Some researchers have implemented recommender systems as a filtering mechanism to improve the customer experience in social media and e-commerce. This research focuses on combining two recommender system methods, demographic filtering and content-based filtering, a combination commonly called hybrid filtering. Product items are collected by crawling data from the three largest e-commerce platforms in Indonesia (Shopee, Tokopedia, and Bukalapak). The experiment is implemented as a web application, built with the Flask framework, that generates recommended products. This research employs the IMDb weighted rating formula to produce the lists of best-scoring products, and TF-IDF with cosine similarity to measure the similarity between products and produce related items.

Highlights

  • Today, consumers face an abundance of information on the internet, which makes it hard for them to find the information they need

  • Term Frequency (TF)-Inverse Document Frequency (IDF) carries out a weighting process for each item to obtain the important values or keywords in a set of words [20]

  • Internet Movie Database (IMDb), a data and information provider company linked to films, television shows, video games, etc., including cast data, biographies, and production crews [17]

  • As stated earlier, tf is the frequency of terms (x) in a document (y), and N is the total number of existing documents
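
The IMDb weighted rating named in the abstract is commonly written as WR = (v / (v + m)) R + (m / (v + m)) C, where R is an item's mean rating, v is its number of ratings, m is a minimum-ratings threshold, and C is the mean rating over all items. The following is a minimal Python sketch of that formula for demographic filtering. The dictionary keys ("rating", "votes"), the helper name top_products, and the way m is chosen are illustrative assumptions, not details taken from the paper.

```python
def weighted_rating(R, v, m, C):
    """IMDb-style weighted rating.

    R: mean rating of the item, v: number of ratings for the item,
    m: minimum-ratings threshold (assumed > 0), C: mean rating over all items.
    Items with few ratings are pulled toward the global mean C.
    """
    return (v / (v + m)) * R + (m / (v + m)) * C


def top_products(products, m):
    """Rank products by weighted rating (hypothetical helper).

    products: list of dicts with assumed keys "rating" and "votes".
    Products with fewer than m ratings are left out of the list.
    """
    if not products:
        return []
    C = sum(p["rating"] for p in products) / len(products)
    qualified = [p for p in products if p["votes"] >= m]
    return sorted(
        qualified,
        key=lambda p: weighted_rating(p["rating"], p["votes"], m, C),
        reverse=True,
    )
```

Because the score depends only on aggregate ratings and not on any user profile, the same ranked list is shown to every visitor, which is why this kind of ranking is usually used for a general "best products" list.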


Summary

Research Method

The first step of this study is collecting product data using the crawling method. The next step is data pre-processing, which has two phases: cleaning the data and unifying the data types, followed by text processing such as tokenizing, case folding, and removing punctuation and stopwords. This research then develops the recommender system model and evaluates the web application to ensure that the items it recommends match those produced by the model.

In the preprocessing stage, the results of the CSV file extraction are cleaned to keep the data consistent and to avoid conflicts during the modeling process. In this study, data cleaning was carried out before, or at the same time as, the data were imported into the database. The cleaning process includes updating the format or data type of fields in the dataset and filling missing values with a default value.
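
The two pre-processing phases described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the field names, the default values, and the very small stopword list are hypothetical, and the paper does not say which tokenizer or stopword list was used.

```python
import string

# Tiny illustrative stopword list (an assumption); a real run would use a
# fuller English and/or Indonesian list.
STOPWORDS = {"the", "a", "an", "and", "of", "for", "with", "in",
             "dan", "yang", "untuk", "dengan", "di"}

# Maps every ASCII punctuation character to a space.
_PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_record(row, defaults):
    """Phase 1: fill missing fields with a default and unify data types.

    row: dict read from the crawled CSV (values are usually strings).
    defaults: dict of field -> default value; each default's type is also
    the type the field is cast to (hypothetical convention).
    """
    cleaned = {}
    for key, default in defaults.items():
        value = row.get(key)
        if value is None or value == "":
            cleaned[key] = default
        else:
            cleaned[key] = type(default)(value)
    return cleaned


def preprocess(text):
    """Phase 2: case folding, punctuation removal, tokenizing, stopword removal."""
    text = text.lower()                   # case folding
    text = text.translate(_PUNCT_TABLE)   # punctuation -> spaces
    tokens = text.split()                 # whitespace tokenizing
    return [t for t in tokens if t not in STOPWORDS]
```

For example, preprocess("Sepatu Running, Pria & Wanita untuk Olahraga!") yields the tokens sepatu, running, pria, wanita, olahraga.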

Data Crawling
Demographic filtering
Cosine Similarity
TF-IDF Weighting
Recommender system model in the web
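
As a rough illustration of the content-based part (TF-IDF weighting followed by cosine similarity), the sketch below uses only the Python standard library. It assumes the plain weighting tf × log(N / df), with tf the frequency of a term in a document, N the total number of documents, and df the number of documents containing the term. The paper's exact variant (log base, smoothing, normalization) is not given here, and libraries such as scikit-learn default to a smoothed form, so scores may differ. All function names are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from a list of token lists.

    Weight of term t in a document = tf * log(N / df), where N is the
    number of documents and df the number of documents containing t.
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(index, vectors, k=5):
    """Return up to k (item_index, score) pairs most similar to vectors[index]."""
    scores = [
        (j, cosine_similarity(vectors[index], v))
        for j, v in enumerate(vectors)
        if j != index
    ]
    return sorted(scores, key=lambda s: s[1], reverse=True)[:k]
```

Given tokenized product names, most_similar(i, tfidf_vectors(docs)) returns the indices of the products whose descriptions share the most distinctive terms with product i, which is the basis for a "related items" list.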