Based on the parallel K-means algorithm, this article studies the detection of marketing nodes on the Internet. It designs a new Internet marketing node detector and a location summary network based on an FCN (Fully Convolutional Network) that preprocesses the input nodes, and verifies their performance on the data sets. To address the shortage of data sets for Internet marketing nodes, Internet data sets are artificially generated and used to train the detector. First, the multiclass K-means algorithm is reduced to the two categories suited to Internet marketing node detection: marketing nodes and background. Second, the weights in the K-means algorithm are mostly applicable only to target detection tasks; therefore, for Internet marketing node detection, the K-means algorithm is used to regress on the training set and calculate 5 weights. In the simulation experiment, a weight calculation formula is used to compute the weight of each feature term. The basic idea is that a feature word that appears often in the current document but rarely in other nodes is assigned a higher weight. This article also improves on several shortcomings of the k-means clustering algorithm. Standardizing the data that participate in the clustering transforms them from an irregular distribution into a cluster-like distribution, which facilitates the clustering process. Density is introduced to determine the initial cluster centers, and a purity metric is introduced to determine an appropriate density radius for each cluster center, so as to reduce the support vector machine training samples as effectively as possible.
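The abstract does not give the weight formula or the exact density procedure, so the following is only a minimal Python sketch of the three ideas it names, under stated assumptions: the feature-term weight is taken to be textbook TF-IDF, standardization is taken to be a per-feature z-score, and the density of a point is taken to be the number of points within a fixed radius of it. The function names `tf_idf`, `standardize`, and `density_initial_centers` are hypothetical, and the purity-based choice of the radius is not reproduced here; `radius` is simply passed in.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: textbook TF-IDF, not necessarily the article's formula.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


def standardize(points):
    """Z-score each feature column so all features share a comparable scale."""
    columns = list(zip(*points))
    means = [sum(col) / len(col) for col in columns]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [
        math.sqrt(sum((v - m) ** 2 for v in col) / len(col)) or 1.0
        for col, m in zip(columns, means)
    ]
    return [
        tuple((v - m) / s for v, m, s in zip(p, means, stds))
        for p in points
    ]


def density_initial_centers(points, radius, k):
    """Pick up to k dense points that lie more than `radius` apart.

    Assumption: density = number of points within `radius` (the point itself included).
    """
    density = [
        sum(1 for q in points if math.dist(p, q) <= radius)
        for p in points
    ]
    # Visit points from densest to sparsest; ties keep their input order.
    order = sorted(range(len(points)), key=lambda i: -density[i])
    centers = []
    for i in order:
        if all(math.dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers
```

With k = 2, the selected centers could seed a two-category K-means run (marketing nodes versus background) in place of random initialization. In such a sketch a term that is frequent in one document and absent from the others receives a larger weight than a term shared across documents, which matches the basic idea stated above.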