Abstract

In recent years, weighted data has become increasingly common in many applications. Anomalies in such data reduce the accuracy of data-driven operations, so detecting them is necessary to improve data quality. However, existing anomaly detection methods for weighted data consider either Weighted Frequent Itemsets (WFIs) or Weighted Rare Itemsets (WRIs), but not both, which makes their detection accuracy heavily dependent on the preset minimum weighted support (min_wsup) value. To address this issue, we propose ADWD, an anomaly detection method for weighted data based on feature association analysis. By considering both WFIs and WRIs, ADWD detects anomalies accurately under different min_wsup values. ADWD first deletes infrequent 1-itemsets while constructing the Weighted Frequent Itemset-based Tree (WFI-Tree), which reduces the time spent searching for extensible itemsets. It then defines three deviation metrics that account for the possible influencing factors and uses them to calculate each transaction's abnormal score. Finally, the transactions with the top-ranked abnormal scores are judged to be anomalies. Extensive experiments on three datasets show that ADWD detects anomalies in weighted data more accurately and in less time than existing methods, and that it scales well.
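
The abstract does not give a formula for weighted support, so the following is only a minimal sketch under one common convention, which is an assumption here and not necessarily the definition ADWD uses: a transaction's weight is the mean weight of its items, and the weighted support of an itemset is the total weight of the transactions containing it divided by the total weight of all transactions. The function names and the toy data are hypothetical. The sketch shows how a min_wsup threshold splits 1-itemsets into weighted frequent and weighted rare ones, and how that split shifts as min_wsup changes.

```python
def transaction_weight(transaction, item_weights):
    # Assumed convention: mean weight of the items in the transaction.
    return sum(item_weights[i] for i in transaction) / len(transaction)


def weighted_support(itemset, transactions, item_weights):
    # Weight of transactions containing `itemset` over the weight of all transactions.
    total = sum(transaction_weight(t, item_weights) for t in transactions)
    covered = sum(
        transaction_weight(t, item_weights) for t in transactions if itemset <= t
    )
    return covered / total


def split_1_itemsets(transactions, item_weights, min_wsup):
    # Partition single items into weighted frequent and weighted rare ones.
    frequent, rare = [], []
    for item in sorted(item_weights):
        ws = weighted_support(frozenset([item]), transactions, item_weights)
        (frequent if ws >= min_wsup else rare).append(item)
    return frequent, rare


# Hypothetical toy data: item weights and four transactions.
item_weights = {"a": 0.9, "b": 0.6, "c": 0.3, "d": 0.2}
transactions = [frozenset("ab"), frozenset("abc"), frozenset("ab"), frozenset("cd")]

for min_wsup in (0.3, 0.5):
    frequent, rare = split_1_itemsets(transactions, item_weights, min_wsup)
    print(min_wsup, "frequent:", frequent, "rare:", rare)
```

On this toy data, item "c" is frequent at min_wsup = 0.3 but rare at min_wsup = 0.5. A method that looks at only one side of the threshold would see different evidence in each case, which is the sensitivity to min_wsup that the abstract describes.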
