Abstract

Data mining that balances data protection with data utility can manage distributed data efficiently. This paper revisits the concepts and techniques of the privacy-preserving Random Decision Tree (RDT). In existing systems, cryptography-based techniques are effective at managing distributed information. Privacy-preserving RDT also handles distributed information efficiently, giving more accurate data mining while preserving privacy and reducing computation time. This paper covers this advance in privacy-preserving data mining, with an emphasis on the RDT approach. RDT offers better efficiency and data privacy than purely cryptographic techniques. RDT is used for various data mining tasks, such as classification, regression, ranking, and multiple classification. Privacy-preserving RDT combines randomization with cryptographic methods, providing data privacy for a number of decision tree-based learning tasks; this makes it an effective technique for privacy-preserving mining of distributed data. In horizontal partitioning of a dataset, parties gather information about different entities but hold data for all attributes. On the other hand, different organizations may gather different data about the same set of people. Thus, in vertically partitioned data, all parties gather data for the same collection of items, but for different attributes. In all of these cases, both the horizontal and the vertical partitioning of a dataset are somewhat inaccurate descriptions.
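The difference between the two partitioning schemes can be illustrated with a minimal sketch. The records, attribute names, and helper functions below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
# Hypothetical toy dataset: one record per entity, keyed by "id".
records = [
    {"id": 1, "age": 34, "income": 52000, "diagnosis": "A"},
    {"id": 2, "age": 51, "income": 61000, "diagnosis": "B"},
    {"id": 3, "age": 27, "income": 39000, "diagnosis": "A"},
    {"id": 4, "age": 45, "income": 75000, "diagnosis": "B"},
]

def horizontal_partition(records, n_parties):
    """Each party holds ALL attributes for a disjoint subset of entities."""
    return [records[i::n_parties] for i in range(n_parties)]

def vertical_partition(records, attribute_groups, key="id"):
    """Each party holds a SUBSET of attributes for every entity.

    The shared key lets the parties refer to the same entity.
    """
    return [
        [{key: r[key], **{a: r[a] for a in attrs}} for r in records]
        for attrs in attribute_groups
    ]

# Two parties, each seeing different entities but every attribute.
by_rows = horizontal_partition(records, 2)

# Two parties, each seeing every entity but different attributes.
by_columns = vertical_partition(records, [["age"], ["income", "diagnosis"]])
```

In the horizontal case each party could train on its own rows; in the vertical case no single party sees a complete record, which is why the parties need a shared key.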

Highlights

  • Data mining finds interesting patterns and insights in large databases

  • Information privacy is paramount for organizations that want to expand their business, since almost all of them must share data without compromising privacy

  • This paper looks at randomization and cryptographic methods applied to sensitive information


Summary

Introduction

Privacy-preserving data mining has two phases: information collection and information publishing. An RDT is defined by randomly selecting features to build the tree, without using any training information. RDT is a good fit for privacy-preserving distributed data mining for the following reasons. The random construction of the tree adds security, because anyone seeking prior information would have to discover the entire classification model and its cases. It also maintains privacy and accuracy while reducing computation time compared with existing algorithms. The approach uses the Iterative Dichotomiser 3 (ID3) and boosting algorithms within an RDT, together with a privacy-preserving algorithm. Classification can be defined as assigning records with similar features to the same class.
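The idea of fixing the tree structure before any training data is seen can be sketched as follows. This is a minimal, non-private illustration that assumes binary features and plain label counts at the leaves; the function names are hypothetical, and the cryptographic protection of the leaf counts that a privacy-preserving RDT would add is omitted.

```python
import random
from collections import Counter

def build_structure(features, depth, rng):
    """Build a random tree over binary features WITHOUT looking at training data."""
    if depth == 0 or not features:
        return {"counts": Counter()}          # leaf: label counts, filled in later
    feature = rng.choice(features)            # random choice, no information gain
    rest = [f for f in features if f != feature]
    return {
        "feature": feature,
        "children": {
            0: build_structure(rest, depth - 1, rng),
            1: build_structure(rest, depth - 1, rng),
        },
    }

def find_leaf(node, row):
    """Follow the row's feature values down to a leaf."""
    while "feature" in node:
        node = node["children"][row[node["feature"]]]
    return node

def update(tree, rows, labels):
    """Training only updates leaf statistics; the structure stays fixed."""
    for row, label in zip(rows, labels):
        find_leaf(tree, row)["counts"][label] += 1

def predict(trees, row):
    """Sum the per-tree class frequencies and return the top-scoring label."""
    scores = Counter()
    for tree in trees:
        counts = find_leaf(tree, row)["counts"]
        total = sum(counts.values())
        for label, count in counts.items():
            scores[label] += count / total
    return scores.most_common(1)[0][0] if scores else None

# Usage with hypothetical toy data: the label simply copies feature "a".
rng = random.Random(0)
features = ["a", "b"]
trees = [build_structure(features, 2, rng) for _ in range(5)]
rows = [{"a": a, "b": b} for a in (0, 1) for b in (0, 1)]
labels = [row["a"] for row in rows]
for tree in trees:
    update(tree, rows, labels)
print(predict(trees, {"a": 1, "b": 0}))
```

Because the structure is independent of the data, the only data-dependent part is the leaf counts, which is the part a privacy-preserving protocol would need to protect.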

Random Decision Trees Definition
Random Decision Trees Architecture
Privacy-Preserving Data Mining
Vertically Partition Data
Horizontally Partition Data
Privacy-preserving Random Decision Tree algorithm
Conclusion
