Abstract

Big data analysis has become an essential tool in many fields, and a growing number of organizations rely on data analysis tools to formulate their strategies. This popularity also creates problems: an attacker can pollute a data set by adding seemingly negligible data points that degrade the final analysis results. In this paper, we therefore propose using an energy-based learning method to detect outliers within a data set. Specifically, we iteratively rule out bad data points from the data set according to specific selection rules. The experimental results are promising: our algorithm improves the accuracy of linear regression by more than 20% on average.
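The iterative pruning idea can be illustrated with a minimal sketch. The abstract does not specify the energy function or the selection rules, so everything below is an assumption made for illustration: the energy of a point is taken to be its squared residual under the current least-squares fit, and the hypothetical selection rule drops, one point per round, the highest-energy point whose energy exceeds `threshold_factor` times the median energy. The names `fit_line`, `energy`, `prune_outliers`, `threshold_factor`, and `max_rounds` are illustrative and do not come from the paper.

```python
def fit_line(points):
    """Ordinary least squares for y = a*x + b on a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


def energy(point, a, b):
    """Assumed energy: squared residual of the point under the current fit."""
    x, y = point
    return (y - (a * x + b)) ** 2


def prune_outliers(points, threshold_factor=25.0, max_rounds=50):
    """Refit, score, and drop at most one suspicious point per round.

    Hypothetical selection rule: a point is suspicious when its energy
    exceeds threshold_factor times the median energy of the kept points.
    """
    kept = list(points)
    for _ in range(max_rounds):
        if len(kept) <= 3:  # too few points left to judge outliers
            break
        a, b = fit_line(kept)
        scored = sorted(((energy(p, a, b), p) for p in kept), reverse=True)
        energies = sorted(e for e, _ in scored)
        median = max(energies[len(energies) // 2], 1e-12)  # avoid a zero threshold
        worst_energy, worst_point = scored[0]
        if worst_energy <= threshold_factor * median:
            break  # nothing looks suspicious any more
        kept.remove(worst_point)
    return kept, fit_line(kept)


# Usage: 20 points near y = 2x + 1 plus two injected (made-up) outliers.
clean = [(float(i), 2.0 * i + 1.0 + 0.1 * ((i * 7) % 5 - 2)) for i in range(20)]
outliers = [(2.5, 40.0), (17.5, -30.0)]
kept, (slope, intercept) = prune_outliers(clean + outliers)
print(len(kept), slope, intercept)
```

Dropping a single point per round and refitting matters here because a strong outlier distorts the fit enough to hide a weaker one; after the first removal, the refit exposes the second. A different energy function or selection rule would change which points are removed.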
