Abstract
Movies are being produced at an exponentially increasing rate, and because the investment in each production runs into millions of dollars, it has become pertinent to ascertain the likelihood of a movie's success. A number of data mining-based methods, ranging from Support Vector Machine (SVM) to logistic regression, have been proposed with varying levels of success, with SVM showing the most promising results. This paper carries out a comparative analysis of the performance of the Gradient Boosting and SVM algorithms in optimizing the prediction of movie success. The study developed a framework for the research methodology; the dataset used contained 33 movie attributes and 838 entries. The dataset was cleaned with six attributes, and features were identified and selected from the dataset using four methods, including Analysis of Variance (ANOVA), Lasso Regularization, and a combination of Lasso Regularization and Random Forest (RF). Models were formulated using SVM and the Gradient Boosting algorithm, and the performance of the resulting predictive models was evaluated using accuracy, precision and recall. The results show that the accuracy of the Gradient Boosting algorithm is around 100%, while SVM-Linear reaches 86%, SVM-Poly 88%, SVM-RBF 88% and SVM-Sigmoid 72%. The study concludes that the Gradient Boosting algorithm is more robust in predicting movie success. It also recommends that comparisons be made with other machine learning techniques.
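The three evaluation measures named above can be illustrated with a minimal sketch. It assumes a binary labelling (1 for a successful movie, 0 otherwise), which the abstract does not specify; the function names and the toy labels are hypothetical and are not taken from the study's data or code.

```python
# Illustrative sketch only: accuracy, precision and recall for a binary
# "success / not success" classifier. Labels and data are assumptions.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for the given positive class label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that are correct."""
    total = tp + fp + fn + tn
    return (tp + tn) / total if total else 0.0


def precision(tp, fp):
    """Share of predicted successes that were real successes."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Share of real successes that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up labels for eight hypothetical movies (1 = success, 0 = not).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
prec = precision(tp, fp)
rec = recall(tp, fn)
print(f"accuracy={acc:.2f} precision={prec:.2f} recall={rec:.2f}")
```

Reporting precision and recall alongside accuracy matters when the classes are unbalanced, since a model can score high accuracy while missing most of the minority class.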