A Novel Sanitization Approach for Privacy Preserving Utility Itemset Mining

R. R. Rajalaxmi, A. M. Natarajan

doi:10.5539/cis.v1n3p77

Abstract

Data mining plays a vital role in today’s information world and has been widely applied in business organizations. The current trend in business collaboration requires sharing data or mined results for mutual benefit. However, releasing data also raises the threat of revealing sensitive information. Data sanitization is the process of concealing the sensitive itemsets present in the source database through appropriate modifications and releasing the modified database. The problem of finding an optimal sanitization that minimizes the number of non-sensitive patterns lost is NP-hard. Recent data sanitization approaches hide sensitive itemsets by reducing their support, which considers only the presence or absence of itemsets. However, in real-world scenarios, transactions contain the purchased quantities of items along with their unit prices. Hence it is essential to consider the utility of itemsets in the source database. To address this, the utility mining model was introduced to find high utility itemsets. In this paper, we focus primarily on protecting privacy in utility mining. We consider the utility of the itemsets and propose a novel sanitization approach that makes minimal changes to the database and removes a minimum number of non-sensitive itemsets.

Highlights

Background and Related Work, 2.1 Frequent Pattern Mining: Let I = {i1, i2, i3, ..., in} be a set of items.
Since frequent itemset mining is a preliminary step in association rule mining, most research has addressed the privacy preservation of frequent itemsets with respect to association rule mining.
We propose a novel approach called Conflict based Utility Itemset Sanitization (CUIS) that strategically modifies the database to decrease the utility of the sensitive itemsets.

Summary

Frequent pattern Mining

Let D, the task-relevant data, be a set of database transactions where each transaction T is a set of items such that T ⊆ I. Each transaction is associated with an identifier, called TID. A set of items is referred to as an itemset, and a transaction T is said to contain an itemset A if and only if A ⊆ T. An itemset that contains k items is a k-itemset. The occurrence frequency, or support, of an itemset is the number of transactions that contain the itemset. If the relative support of an itemset (the fraction of transactions in D that contain it) satisfies a prespecified minimum support threshold, the itemset is a frequent itemset.
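The definitions above can be illustrated with a minimal Python sketch. The toy database `D`, the function names, and the 0.5 threshold are assumptions made for illustration only; they do not come from the paper.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical toy database: TID -> set of items in that transaction.
D: Dict[str, FrozenSet[str]] = {
    "T1": frozenset({"a", "b", "c"}),
    "T2": frozenset({"a", "c"}),
    "T3": frozenset({"b", "d"}),
    "T4": frozenset({"a", "b", "c", "d"}),
}


def support_count(itemset: Iterable[str],
                  transactions: Dict[str, FrozenSet[str]]) -> int:
    """Number of transactions T with itemset ⊆ T."""
    items = frozenset(itemset)
    return sum(1 for t in transactions.values() if items <= t)


def relative_support(itemset: Iterable[str],
                     transactions: Dict[str, FrozenSet[str]]) -> float:
    """Fraction of transactions that contain the itemset."""
    if not transactions:
        return 0.0
    return support_count(itemset, transactions) / len(transactions)


def is_frequent(itemset: Iterable[str],
                transactions: Dict[str, FrozenSet[str]],
                min_support: float) -> bool:
    """True if the relative support meets the minimum support threshold."""
    return relative_support(itemset, transactions) >= min_support


if __name__ == "__main__":
    print(support_count({"a", "c"}, D))     # contained in T1, T2, T4
    print(relative_support({"a", "c"}, D))
    print(is_frequent({"a", "d"}, D, 0.5))  # contained only in T4
```

Here `{"a", "c"}` is a 2-itemset contained in three of the four transactions, so it is frequent at a threshold of 0.5, whereas `{"a", "d"}` is not.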

Privacy preservation of frequent itemsets

Utility Mining
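The abstract notes that real transactions record purchased quantities together with unit prices. A minimal sketch of how an itemset's utility might then be computed is given below. It assumes the commonly used definition (quantity times unit price, summed over the itemset's items in every transaction that contains the whole itemset), which may differ in detail from the paper's own formulation. All names, prices, quantities, and the threshold are hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical unit prices (the external utility of each item).
UNIT_PRICE: Dict[str, int] = {"a": 5, "b": 2, "c": 1}

# Hypothetical database: TID -> {item: purchased quantity}.
QUANTITIES: Dict[str, Dict[str, int]] = {
    "T1": {"a": 1, "b": 2},
    "T2": {"a": 2, "c": 3},
    "T3": {"b": 4, "c": 1},
}


def itemset_utility(itemset: Iterable[str],
                    transactions: Dict[str, Dict[str, int]],
                    unit_price: Dict[str, int]) -> int:
    """Sum of quantity * unit price over the itemset's items,
    taken across every transaction that contains the whole itemset."""
    items = frozenset(itemset)
    total = 0
    for quantities in transactions.values():
        if all(item in quantities for item in items):
            total += sum(quantities[item] * unit_price[item] for item in items)
    return total


def is_high_utility(itemset: Iterable[str],
                    transactions: Dict[str, Dict[str, int]],
                    unit_price: Dict[str, int],
                    min_utility: int) -> bool:
    """True if the itemset's utility reaches the (assumed) utility threshold."""
    return itemset_utility(itemset, transactions, unit_price) >= min_utility


if __name__ == "__main__":
    print(itemset_utility({"a"}, QUANTITIES, UNIT_PRICE))
    print(itemset_utility({"a", "c"}, QUANTITIES, UNIT_PRICE))
    print(is_high_utility({"a", "b"}, QUANTITIES, UNIT_PRICE, 10))
```

Under this definition, a sanitization approach that hides a sensitive itemset has to bring its utility below the threshold, for instance by lowering quantities, rather than merely reducing how many transactions contain it.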

Problem Formulation

Conflict based Sanitization Approach

Find the difference of the sensitive itemset p as

Experimental Analysis

Conclusion


Journal: Computer and Information Science
Publication Date: Jul 18, 2008
Citations: 6
License type: cc-by

Similar Papers

An efficient algorithm to mine high average-utility itemsets
Jerry Chun-Wei Lin ... Miroslav Voznak
Advanced Engineering Informatics | VOL. 30
01 Apr 2016

CTU-Mine: An Efficient High Utility Itemset Mining Algorithm Using the Pattern Growth Approach
Alva Erwin ... N.R. Achuthan
01 Oct 2007

Data sanitization in association rule mining based on impact factor
Journal of Artificial Intelligence and Data Mining | VOL. 3
01 Jan 2015

Improved Market Basket Analysis with Utility Mining
Nishant Srivastava ... Kanika Gupta
SSRN Electronic Journal
09 May 2018
