Feature Selection for Classification under Anonymity Constraint

Baichuan Zhang, Noman Mohammed, Vachik S Dave, Mohammad Al Hasan

doi:10.5555/3121409.3121410

Abstract

Over the last decade, the proliferation of online platforms and their adoption by billions of users have greatly heightened the privacy risk to individual users. Security researchers have shown that sparse microdata describing a user's online activities, although anonymized, can still disclose the user's identity when cross-referenced with other data sources. To preserve user privacy, existing work proposes several methods (k-anonymity, l-diversity, differential privacy) for ensuring that a dataset carries only a small identity disclosure risk. However, most of these methods modify the data in isolation, without considering its utility in subsequent knowledge discovery tasks, which makes the resulting datasets less informative. In this work, we consider labeled data of the kind generally used for classification and propose two feature selection methods with two goals: first, the data restricted to the reduced feature set has a small disclosure risk; second, the utility of the data for a classification task is preserved. Experimental results on various real-world datasets show that the proposed methods are effective and useful in practice.
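
To make the two goals concrete, the following is a minimal sketch of feature selection under a k-anonymity constraint. It is not the authors' method: the abstract does not describe the two proposed algorithms, so this sketch assumes a simple greedy strategy, uses mutual information with the class label as a stand-in utility measure, and treats every selected feature as part of the quasi-identifier. The function names (`is_k_anonymous`, `mutual_information`, `greedy_select`) and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def is_k_anonymous(rows, features, k):
    """True if every row's projection onto `features` is shared by at least k rows."""
    groups = Counter(tuple(row[f] for f in features) for row in rows)
    return all(count >= k for count in groups.values())


def mutual_information(rows, labels, feature):
    """Empirical mutual information (in bits) between one feature and the label."""
    n = len(rows)
    joint = Counter((row[feature], y) for row, y in zip(rows, labels))
    fx = Counter(row[feature] for row in rows)
    fy = Counter(labels)
    return sum(
        (c / n) * log2(c * n / (fx[x] * fy[y])) for (x, y), c in joint.items()
    )


def greedy_select(rows, labels, k):
    """Assumed greedy heuristic: visit features from most to least informative
    and keep a feature only if the selected set stays k-anonymous."""
    ranked = sorted(
        range(len(rows[0])),
        key=lambda f: mutual_information(rows, labels, f),
        reverse=True,
    )
    selected = []
    for f in ranked:
        if is_k_anonymous(rows, selected + [f], k):
            selected.append(f)
    return selected


# Toy data (made up for illustration): 8 records, 3 binary features.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
f0 = [1, 1, 1, 1, 0, 0, 0, 0]  # matches the label exactly
f1 = [1, 1, 0, 0, 1, 1, 0, 0]  # uninformative, but splits groups evenly
f2 = [1, 1, 1, 0, 0, 0, 0, 0]  # informative, but isolates one record with f0
rows = list(zip(f0, f1, f2))

selected = greedy_select(rows, labels, k=2)
print(selected)
```

On this toy data with k = 2, the sketch keeps feature 0, rejects the more informative feature 2 because combining it with feature 0 leaves one record in a group of size 1, and then accepts feature 1. This illustrates the tension the abstract describes: the feature set that is best for classification is not necessarily the one with small disclosure risk.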


Journal: Transactions on Data Privacy
Publication Date: Apr 1, 2017
Citations: 3

Similar Papers

Informative Feature Clustering and Selection for Gene Expression Data
Yuqi Yang ... Zhihang Luo
IEEE Access | VOL. 7
01 Jan 2019

Research on Feature Selection and kNN Classification Method in Chinese Text Classification
Chao Xiao ... Ping Wu
01 Jan 2015

Differentiation of fat-poor angiomyolipoma from clear cell renal cell carcinoma in contrast-enhanced MDCT images using quantitative feature classification
Han Sang Lee ... Helen Hong
Medical Physics | VOL. 44
09 Jun 2017

A Tri-Objective Method for Bi-Objective Feature Selection in Classification
Ruwang Jiao ... Mengjie Zhang
Evolutionary Computation | VOL. 32
03 Sep 2024
