Abstract

<p>Traditional feature selection methods consider only high relevance between the selected features and the classes and low redundancy among features, ignoring the interrelations among features, which can weaken classification performance. This paper develops a dynamic relevance strategy to measure the dependency among features, in which the relevance of each candidate feature is updated dynamically whenever a new feature is selected. Protecting sensitive information has also become an important issue in feature selection. However, existing differentially private machine learning algorithms have seldom considered the impact of data correlation, which may cause more privacy leakage than expected. The paper therefore proposes a differentially private feature selection algorithm based on a dynamic relevance measure, named DPFSDR. First, as a correlation analysis technique, a weighted undirected graph model is constructed from the correlated degree, which reduces the dataset’s dimension and the correlated sensitivity. Second, as a feature selection criterion, the F-score with differential privacy is adopted to measure the importance of each feature. Finally, to evaluate the effectiveness of feature selection, a differentially private SVM combined with the dynamic relevance measure is used to choose features. Experimental results show that the proposed DPFSDR algorithm can effectively obtain the optimal feature subset and improve data utility while preserving data privacy.</p>
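To make the second step concrete, the following is a minimal sketch of scoring features with an F-score perturbed by Laplace noise. It is an illustration under stated assumptions, not the DPFSDR procedure itself: the abstract does not give the sensitivity bound, the privacy budget allocation, or the exact F-score variant, so the two-class Fisher-style F-score, the `sensitivity` argument, and the function names `f_score`, `laplace_noise`, and `private_f_scores` are all hypothetical choices made here.

```python
import random
from statistics import mean, variance


def f_score(pos, neg):
    """Two-class F-score of one feature: between-class spread over within-class spread."""
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1 denominator)
    return between / within if within > 0 else 0.0


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_f_scores(columns, labels, epsilon, sensitivity, rng=None):
    """Return one noisy F-score per feature column.

    ASSUMPTION: `sensitivity` is a caller-supplied bound on how much one
    record can change an F-score; the abstract does not specify it.
    """
    rng = rng or random.Random()
    scale = sensitivity / epsilon  # Laplace mechanism scale
    scores = []
    for col in columns:
        pos = [x for x, y in zip(col, labels) if y == 1]
        neg = [x for x, y in zip(col, labels) if y != 1]
        scores.append(f_score(pos, neg) + laplace_noise(scale, rng))
    return scores


# Usage: the first feature separates the classes, the second does not.
columns = [[1, 2, 3, 7, 8, 9], [1, 2, 3, 1, 2, 3]]
labels = [1, 1, 1, 0, 0, 0]
noisy = private_f_scores(columns, labels, epsilon=1.0, sensitivity=0.1,
                         rng=random.Random(0))
ranking = sorted(range(len(columns)), key=lambda i: noisy[i], reverse=True)
```

A smaller `epsilon` gives a larger noise scale, so the ranking of features becomes less reliable as the privacy guarantee tightens. A smaller sensitivity bound has the opposite effect, reducing the noise needed for a given budget.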

