Abstract

High-utility itemset mining (HUIM), which takes both the significance and the quantity of items into account when mining high-utility itemsets (HUIs), has attracted increasing attention in recent years. Privacy breaches are a long-standing concern in data mining and arise in particular when organizations publish or share private data collections. To tackle this problem, many privacy-preserving data mining (PPDM) methods have been proposed. Because HUIM is highly practical, privacy-preserving utility mining (PPUM) has recently become a popular research direction within PPDM. The main goal of PPUM is to hide sensitive HUIs (SHUIs) so that no confidential information can be discovered in the resulting sanitized database. However, all previously proposed approaches suffer from introducing numerous side effects when they perturb the database. To alleviate this issue, this paper proposes a novel algorithm based on integer linear programming (ILP) that achieves a lower ratio of side effects in the hiding process while revealing no sensitive information in the sanitized database. We formulate the hiding process as a constraint satisfaction problem (CSP) that pursues both the protection of SHUIs and the minimization of side effects. The mapped problem is then solved with ILP techniques, and the solution indicates how the perturbation operation should be carried out. In addition, the algorithm adopts a relaxation procedure to provide an approximate solution to the CSP when an optimal solution does not exist. Extensive experiments on several real-world datasets compare the proposed method with other state-of-the-art algorithms. The comparative results demonstrate the superiority of the proposed algorithm with respect to running time and the ability to minimize side effects.
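
To make the hiding problem concrete, the following minimal Python sketch computes the utility of an itemset as quantity times per-unit significance and then searches for the smallest quantity reduction that pushes a sensitive itemset below a utility threshold. It is an illustration only and not the paper's algorithm: the toy database, the `unit_utility` values, the threshold, and both function names are hypothetical, and the exhaustive search is a stand-in for the ILP formulation, which the abstract does not spell out. The sketch also assumes that every quantity stays at least 1, so no item is removed from a transaction.

```python
from itertools import product

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 2, "b": 1},
    {"a": 1, "c": 3},
    {"a": 3, "b": 2, "c": 1},
]
# Hypothetical per-unit significance (e.g. profit) of each item.
unit_utility = {"a": 5, "b": 2, "c": 1}


def itemset_utility(itemset, db):
    """Sum quantity * unit utility over transactions that contain every item."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_utility[item] for item in itemset)
    return total


def hide_by_brute_force(sensitive, db, min_util):
    """Return (units_removed, sanitized_db) with the fewest removed units such
    that the sensitive itemset's utility drops below min_util, or None if no
    such reduction exists. Exhaustive search, used here instead of ILP."""
    # One decision variable per (supporting transaction, sensitive item).
    slots = [(i, item) for i, t in enumerate(db)
             if all(x in t for x in sensitive) for item in sensitive]
    # Each quantity q may be reduced by 0..q-1, so it never drops below 1.
    ranges = [range(db[i][item]) for i, item in slots]
    best = None
    for cuts in product(*ranges):
        sanitized = [dict(t) for t in db]  # copy; the input is left untouched
        for (i, item), cut in zip(slots, cuts):
            sanitized[i][item] -= cut
        if itemset_utility(sensitive, sanitized) < min_util:
            if best is None or sum(cuts) < best[0]:
                best = (sum(cuts), sanitized)
    return best


result = hide_by_brute_force(("a", "b"), transactions, min_util=25)
```

In this toy setting the itemset {a, b} starts with a utility of 31, and removing two units in total is enough to bring it below the threshold of 25. An ILP solver would replace the exhaustive loop with integer variables for the quantity reductions and linear constraints on the resulting utilities, which is what makes the approach workable on real datasets.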
