As the information age develops, data become increasingly diverse in scale and form, so effective means of processing them are needed. For large-scale data mining problems, a clustering-based kernel matrix inner product filtering method is introduced to decompose the original quadratic programming problem into multiple sub-problems, which supports parallel training. On this basis, a Spark-based parallel support vector machine with multiple submodels is proposed. Open-source tools such as OpenCV are introduced to extract image features from large-scale video data. Finally, the designed parallel support vector machine algorithm is applied to face and facial expression recognition in video. Experiments confirmed that the research method achieved a maximum acceleration ratio of 2090 times when processing the Covtype dataset. The research model achieved an accuracy of over 99 %. In the experiment at the maximum data scale, the research model improved prediction accuracy by 21 percentage points at an acceptable additional time cost of only about 4 min. Task-parallel processing made fuller use of cluster performance, increasing by approximately 3.5 m/s2 from 30 to 150 cores. The research model had the highest recognition accuracy for facial expressions, further demonstrating the effectiveness and superiority of the method. The research method improves the efficiency of big data analysis and mining and is of great significance for the parallel analysis of video data.
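The decomposition idea in the abstract (cluster the data, train one submodel per cluster in parallel, route each query to the submodel for its cluster) can be illustrated with a minimal sketch. This is not the authors' kernel matrix inner product filtering method or their Spark implementation. It assumes plain k-means for the partition, a linear SVM trained by subgradient descent on the hinge loss for each sub-problem, and a thread pool standing in for Spark tasks. All function names (`kmeans`, `train_linear_svm`, `fit_submodels`, `predict`) and the toy data are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import random


def dist2(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def nearest(centroids, x):
    """Index of the centroid closest to x."""
    return min(range(len(centroids)), key=lambda j: dist2(centroids[j], x))


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point seeding."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        # An empty group keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids


def train_linear_svm(samples, lam=0.01, eta=0.1, epochs=50, seed=0):
    """Subgradient descent on lam/2*|w|^2 + hinge loss; returns weights w.

    No bias term: samples are assumed to be centred on their cluster centroid.
    Assumes the cluster received at least one sample.
    """
    rng = random.Random(seed)
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, y in rng.sample(samples, len(samples)):  # shuffled pass
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            w = [wi * (1.0 - eta * lam) for wi in w]    # L2 shrinkage
            if margin < 1.0:                            # hinge subgradient step
                w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def fit_submodels(data, k):
    """Partition (x, y) pairs by cluster, then train one submodel per cluster."""
    centroids = kmeans([x for x, _ in data], k)
    subsets = [[] for _ in range(k)]
    for x, y in data:
        j = nearest(centroids, x)
        centred = tuple(a - c for a, c in zip(x, centroids[j]))
        subsets[j].append((centred, y))
    # Sub-problems are independent, so they can be trained concurrently.
    # Threads are a stand-in here for tasks on a Spark cluster.
    with ThreadPoolExecutor() as pool:
        weights = list(pool.map(train_linear_svm, subsets))
    return centroids, weights


def predict(centroids, weights, x):
    """Route x to its nearest cluster and apply that cluster's submodel."""
    j = nearest(centroids, x)
    score = sum(w * (a - c) for w, a, c in zip(weights[j], x, centroids[j]))
    return 1 if score >= 0 else -1


# Toy data: two well-separated groups whose labels follow different rules.
toy = []
for a in (-3, -2, -1, 1, 2, 3):
    for b in (-0.5, 0.0, 0.5):
        label = 1 if a > 0 else -1
        toy.append(((float(a), b), label))         # near (0, 0): label follows x0
        toy.append(((20.0 + b, 20.0 + a), label))  # near (20, 20): label follows x1
centroids, weights = fit_submodels(toy, k=2)
```

In the toy data no single linear boundary fits both groups, while each per-cluster submodel only has to solve a small, easy sub-problem. That is the motivation for decomposing a large quadratic program before training in parallel.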