Secure training of decision trees with continuous attributes

Mark Abspoel, Daniel Escudero, Nikolaj Volgushev

doi:10.2478/popets-2021-0010

Abstract

We apply multiparty computation (MPC) techniques to show how, given a database that is secret-shared among multiple mutually distrustful parties, the parties may obliviously construct a decision tree based on the secret data. We consider data with continuous attributes (i.e., coming from a large domain), and develop a secure version of a learning algorithm similar to the C4.5 or CART algorithms. Previous MPC-based work focused only on decision tree learning with discrete attributes (De Hoogh et al. 2014). Our starting point is to apply an existing generic MPC protocol to a standard decision tree learning algorithm, which we then optimize in several ways. We exploit the fact that even if we allow the data to have continuous values, which a priori might require fixed- or floating-point representations, the output of the tree learning algorithm depends only on the relative ordering of the data. By obliviously sorting the data we reduce the number of comparisons needed per node to O(N log^2 N) from the naive O(N^2), where N is the number of training records in the dataset, thus making the algorithm feasible for larger datasets. Sorting does, however, introduce a problem when duplicate values occur in the dataset, which we overcome with a relatively cheap subprotocol. We show a procedure to convert a sorting network into a permutation network of smaller complexity, resulting in a round complexity of O(log N) per layer in the tree. We implement our algorithm in the MP-SPDZ framework and benchmark our implementation for both passive and active three-party computation using arithmetic modulo 2^64. We apply our implementation to a large-scale medical dataset of ≈ 290 000 rows using random forests, and thus demonstrate the practical feasibility of using MPC for privacy-preserving machine learning based on decision trees for large datasets.
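
To illustrate why the learning algorithm depends only on the relative ordering of the data, and why duplicate values need special care once the data is sorted, the following is a minimal plaintext sketch of choosing a split threshold for one continuous attribute. It is not the oblivious protocol from the paper: nothing here is secret-shared, and the function names, the binary labels, and the use of Gini impurity as the split criterion are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def gini(pos: int, total: int) -> float:
    """Gini impurity of a binary-labelled set with `pos` positives out of `total`."""
    if total == 0:
        return 0.0
    p = pos / total
    return 2.0 * p * (1.0 - p)


def best_split(values: List[float], labels: List[int]) -> Optional[Tuple[float, float]]:
    """Return (threshold, weighted_gini) for the best test `value <= threshold`.

    Hypothetical helper: plaintext only, binary labels in {0, 1}.
    Returns None when no valid split exists (e.g. all values are equal).
    """
    n = len(values)
    # Sort record indices once by attribute value; after this step only the
    # order of the records matters, not the magnitude of the values.
    order = sorted(range(n), key=values.__getitem__)
    total_pos = sum(labels)

    best: Optional[Tuple[float, float]] = None
    left_pos = 0
    # A single linear scan over the sorted records evaluates every candidate.
    for rank, i in enumerate(order[:-1]):
        left_pos += labels[i]
        next_value = values[order[rank + 1]]
        if values[i] == next_value:
            # Duplicate value: a threshold here would separate equal values,
            # so this position is not a valid split point.
            continue
        left_n = rank + 1
        right_n = n - left_n
        score = (
            left_n * gini(left_pos, left_n)
            + right_n * gini(total_pos - left_pos, right_n)
        ) / n
        if best is None or score < best[1]:
            best = (values[i], score)
    return best


if __name__ == "__main__":
    print(best_split([2.5, 1.0, 3.0, 1.0, 4.0], [0, 0, 1, 0, 1]))
```

In this sketch the sort is done once and each candidate threshold is then scored with running counts, instead of comparing every candidate against every record. The duplicate check stands in for the problem the abstract mentions: in the secure setting it cannot be a data-dependent branch, which is why a dedicated subprotocol is needed there.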

Journal: Proceedings on Privacy Enhancing Technologies
Publication Date: Nov 9, 2020
Citations: 31
License type: CC BY-NC-ND 3.0

Similar Papers

Attribute Selection Based on Constraint Gain and Depth Optimal for a Decision Tree
Huaining Sun ... Xuegang Hu
Entropy | VOL. 21
19 Feb 2019

Evolving Fuzzy Min–Max Neural Network Based Decision Trees for Data Stream Classification
Zahra Mirzamomen ... Mohammad Reza Kangavari
Neural Processing Letters | VOL. 45
08 Jun 2016

Research on attribute interval optimization method for segmentation based SVM and the Decision Tree Learning
Huanghua ... Zhang Dexian
01 Jun 2010

An enhanced knowledge representation for decision-tree based learning adaptive scheduling
Yeou-Ren Shiue ... Chao-Ton Su
International Journal of Computer Integrated Manufacturing | VOL. 16
01 Jan 2003
