Abstract

High average-utility itemset mining analyzes a quantitative customer transaction database to identify high average-utility itemsets (HAUIs), that is, sets of items that have a high average utility (e.g. profit). Although HAUIs can reveal important information about customers’ habits, they can also expose sensitive information. To address this concern, the problem of hiding frequent HAUIs (FHAUIs) is studied, which consists of modifying a transaction database so that sensitive FHAUIs can no longer be discovered. An algorithm named H-FHAUI is designed, which relies on an extended border approach based on the support (occurrence frequency) and the average utility of itemsets. Rather than hiding every FHAUI directly, H-FHAUI uses a novel extended lower border named BdE-, based on weak upper bounds on the average utility, so that only a small number of FHAUIs need to be hidden explicitly; all remaining FHAUIs are then hidden as well. In addition, H-FHAUI uses a novel weight-based strategy named ICS to choose the items and transactions to modify, a novel TIU-VIU structure to quickly update weak upper bounds, and a strategy named DUSWUB to quickly hide FHAUIs while preserving the total utility of the database D as much as possible. Experimental results show that H-FHAUI outperforms a baseline in terms of runtime, memory usage, and quality of the sanitized database.
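To make the two measures concrete, the following is a minimal brute-force sketch, not the H-FHAUI algorithm. It assumes the commonly used definition in which the average utility of an itemset in a transaction is the summed utility of its items divided by the number of items in the itemset, accumulated over all transactions that contain it. The toy database, the unit profits, the thresholds `min_sup` and `min_au`, and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy data: each transaction maps an item to its purchased quantity.
transactions = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"a": 1, "b": 1, "c": 3},
]
# Hypothetical unit profit of each item.
unit_profit = {"a": 5, "b": 2, "c": 1}


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def average_utility(itemset, db, profit):
    """Sum, over the transactions containing the itemset, of the utility
    of its items divided by the itemset length (assumed definition)."""
    total = 0.0
    for t in db:
        if all(i in t for i in itemset):
            total += sum(t[i] * profit[i] for i in itemset) / len(itemset)
    return total


def mine_fhauis(db, profit, min_sup, min_au):
    """Enumerate all itemsets and keep those meeting both thresholds.
    Exponential in the number of items; for illustration only."""
    items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            s = support(itemset, db)
            au = average_utility(itemset, db, profit)
            if s >= min_sup and au >= min_au:
                result[itemset] = (s, au)
    return result


fhauis = mine_fhauis(transactions, unit_profit, min_sup=2, min_au=8)
print(fhauis)
```

Under these assumptions, hiding a sensitive itemset amounts to modifying the database (for example, lowering quantities or removing items) until its support or its average utility falls below the corresponding threshold.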
