Abstract

Feature selection has become a mandatory task in data mining, due to the overwhelming number of features in Big Data problems. To handle this high-dimensional data and avoid the well-known curse of dimensionality, we need to pre-select an optimal subset of features to reduce redundant computations. Federated learning is a machine learning technique that trains an algorithm across many decentralized edge devices, each holding its own local data, rather than on global data gathered on a centralized server. Its application is extending to fields such as self-driving cars, medicine and health, and Industry 4.0, where data privacy is compulsory. Feature selection through federated learning is a complicated task, since the suboptimal feature subsets computed by feature selection methods may differ across the heterogeneous datasets held by different nodes. In this paper, we propose a lossless federated version of the classic minimum redundancy maximum relevance (mRMR) feature selection algorithm, called federated mRMR (fed-mRMR), which, without losing any effectiveness of the original mRMR method, is applicable to federated learning approaches and capable of dealing with data that are not independent and identically distributed (non-IID data). The implementation can be found at: https://github.com/jorgehermo9/fed-mrmr
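To make the idea of a lossless federated mRMR concrete, the following is a minimal sketch and not the authors' implementation. It assumes discrete features and relies on one observation: mutual information depends only on joint value counts, and counts are additive across nodes, so summing per-node counts gives the same scores as pooling the raw rows. The greedy criterion used here is the common "relevance minus mean redundancy" form of mRMR; all function names (`joint_counts`, `mutual_information`, `mrmr`, `select`) and the toy data are hypothetical.

```python
from collections import Counter
from math import log


def joint_counts(rows, i, j):
    """Count co-occurrences of the values in columns i and j on one node."""
    return Counter((r[i], r[j]) for r in rows)


def mutual_information(counts):
    """Mutual information (in nats) from a Counter of (x, y) pair counts."""
    n = sum(counts.values())
    px, py = Counter(), Counter()
    for (x, y), c in counts.items():
        px[x] += c
        py[y] += c
    # p(x,y) * log(p(x,y) / (p(x) p(y))), written with raw counts.
    return sum(
        (c / n) * log(c * n / (px[x] * py[y]))
        for (x, y), c in counts.items()
    )


def mrmr(mi, n_features, target, k):
    """Greedy mRMR: pick the feature maximizing relevance - mean redundancy."""
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        def score(f):
            relevance = mi(f, target)
            redundancy = (
                sum(mi(f, s) for s in selected) / len(selected)
                if selected else 0.0
            )
            return relevance - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def select(nodes, n_features, target, k):
    """Run mRMR using only per-node counts, summed by a (hypothetical) server."""
    cache = {}

    def mi(a, b):
        key = (min(a, b), max(a, b))
        if key not in cache:
            total = Counter()
            for rows in nodes:  # each node shares counts, never raw rows
                total.update(joint_counts(rows, key[0], key[1]))
            cache[key] = mutual_information(total)
        return cache[key]

    return mrmr(mi, n_features, target, k)


# Toy non-IID split: node_a holds only label 0, node_b only label 1.
# Columns: feature 0, feature 1, feature 2, label (index 3).
node_a = [(0, 0, 0, 0), (0, 0, 1, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
node_b = [(1, 1, 0, 1), (1, 1, 1, 1), (1, 0, 1, 1), (0, 1, 0, 1)]

federated = select([node_a, node_b], n_features=3, target=3, k=2)
centralized = select([node_a + node_b], n_features=3, target=3, k=2)
```

In this sketch `federated` and `centralized` contain the same features in the same order, because the summed counts are identical to the counts of the pooled data, however the rows are split across nodes.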
