Building a content-based recommendation engine model using Adamic Adar Measure; A Netflix case study

Abdullah Havolli, Lorik Fetahu, Arianit Maraj

doi:10.1109/meco55406.2022.9797139

Abstract

The volume of multimedia data on the Internet is constantly increasing. Nowadays, the most challenging task for Internet users is to extract and use these data smoothly and efficiently. Recommendation systems support this by filtering large volumes of data and extracting the relevant information. This paper aims to build a content-based recommendation model using the Adamic Adar measure. A content-based recommendation model works on the similarity of the movies' content, and the focus of this paper is on the text-based features on which the model filters. Text vectorization with TfidfVectorizer is performed to detect similarities between movies. Cosine similarity from the scikit-learn package is then used to calculate a numeric value representing the similarity between two movies. The Adamic Adar measure is used to compute the closeness of nodes based on their shared neighbors. Instead of relying on the similarity of movies in a single field, a graph representation is used to capture the similarity of movies across many fields. Our recommendation model analyzes the content of the movies (actors, directors, countries, and categories) to find other movies with similar content. The model ranks similar movies according to their similarity scores and recommends the most relevant movies to the user.
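
The graph step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy catalogue, the attribute labels, and the helper names `build_graph`, `adamic_adar`, and `recommend` are all hypothetical, and the TF-IDF and cosine similarity step is left out. The sketch assumes a graph in which each movie is linked to its attributes (actors, directors, countries, categories) and scores two movies by summing 1 / log(degree) over the attribute nodes they share, so that rare shared attributes count more than common ones.

```python
import math
from collections import defaultdict

# Hypothetical toy catalogue; titles and attributes are invented for illustration.
movies = {
    "Movie A": {"actor:Ann", "director:Dee", "country:US", "category:Drama"},
    "Movie B": {"actor:Ann", "director:Dee", "country:US", "category:Comedy"},
    "Movie C": {"actor:Bob", "director:Eli", "country:US", "category:Drama"},
    "Movie D": {"actor:Cy", "director:Fay", "country:FR", "category:Comedy"},
}


def build_graph(catalogue):
    """Build an undirected movie-attribute graph as an adjacency map."""
    graph = defaultdict(set)
    for title, attributes in catalogue.items():
        for attribute in attributes:
            graph[title].add(attribute)
            graph[attribute].add(title)
    return graph


def adamic_adar(graph, x, y):
    """Sum 1 / log(degree) over the neighbors shared by nodes x and y.

    A shared neighbor is linked to both x and y, so its degree is at
    least 2 and the logarithm is strictly positive.
    """
    shared = graph[x] & graph[y]
    return sum(1.0 / math.log(len(graph[u])) for u in shared)


def recommend(graph, catalogue, title, top_n=3):
    """Rank the other movies by Adamic Adar score, dropping zero scores."""
    scores = {
        other: adamic_adar(graph, title, other)
        for other in catalogue
        if other != title
    }
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(other, score) for other, score in ranked if score > 0][:top_n]


graph = build_graph(movies)
recommendations = recommend(graph, movies, "Movie A")
for other, score in recommendations:
    print(f"{other}: {score:.3f}")
```

In this toy data "Movie B" ranks above "Movie C" for "Movie A": it shares the actor and the director, each linked to only two movies, whereas the country shared with "Movie C" is linked to three movies and therefore contributes less.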

Full Text