Abstract
In this article, we considered recommender models based on matrix factorization, which demonstrate excellent performance in collaborative filtering. The standard matrix factorization approach in MLlib deals with explicit ratings. To work with implicit data, we used the trainImplicit method. To simulate the processing of real-time data streams, we used the Spark Streaming library, which is responsible for receiving data from the input source and converting the raw data into a discretized stream (DStream) consisting of Spark RDDs. The rank parameter determines the number of hidden features in the low-rank approximation matrices. As a rule, a larger number of factors gives a better model, but with a large number of users or items it directly increases the memory usage of the computing system and the amount of data required for training. The rank chosen for our problem was therefore a compromise.
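The memory side of the rank trade-off can be illustrated with a rough calculation. The sketch below is not taken from the article: the function name, the dataset sizes, and the assumption of 8 bytes per stored value (64-bit floats) are all hypothetical, and it counts only the raw entries of the user and item factor matrices, ignoring any JVM or Spark overhead.

```python
# Rough size of the two factor matrices produced by a rank-k factorization.
# The function name, dataset sizes, and 8-byte value size are illustrative
# assumptions, not figures from the article.


def factor_memory_bytes(num_users: int, num_items: int, rank: int,
                        bytes_per_value: int = 8) -> int:
    """Return the bytes needed to hold the user and item factor matrices.

    The user matrix has num_users x rank entries and the item matrix has
    num_items x rank entries, so the total grows linearly with rank.
    """
    if min(num_users, num_items, rank) < 0 or bytes_per_value <= 0:
        raise ValueError("sizes must be non-negative and value size positive")
    return (num_users + num_items) * rank * bytes_per_value


if __name__ == "__main__":
    users, items = 1_000_000, 100_000  # hypothetical dataset size
    for rank in (10, 50, 200):
        mib = factor_memory_bytes(users, items, rank) / 2**20
        print(f"rank={rank:>3}: about {mib:,.0f} MiB of raw factor data")
```

Because the count is linear in the rank, doubling the rank doubles the factor storage, which is one reason a moderate rank can be a reasonable compromise for large user or item sets.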