Abstract

Tree-based models are widely adopted in many real-world scenarios. Recently, there has been growing interest in vertical federated tree-based model learning, which builds tree-based models from data held by multiple organizations without violating data privacy regulations. However, most existing work focuses on batch learning settings where all training samples are prepared at once; such methods cannot be applied to scenarios where local samples arrive in a streaming manner. Additionally, existing federated learning algorithms are vulnerable to inference attacks. In this paper, we present a novel solution that enables different organizations to jointly train a tree-based model in an incremental and privacy-preserving manner. Our solution is based on the Very Fast Decision Tree (VFDT) for incrementally building a tree model. Since the data statistics exchanged during training may implicitly disclose private information, we propose a protection mechanism based on order-preserving encryption. To further improve efficiency, we compress the exchanged statistics by means of regional counting, which not only maintains model accuracy but also enhances privacy. We conduct extensive experiments on various real-world datasets, and the results show the superiority of our solution in terms of both efficiency and privacy.
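For context, VFDT grows a tree incrementally by splitting a leaf only once enough samples have been observed that the best attribute's gain exceeds the runner-up's by more than the Hoeffding bound. The sketch below illustrates that generic split test; it is not the paper's implementation, and the function names and default parameters are illustrative.

```python
import math

def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: with probability 1 - delta, the observed mean of
    n samples lies within epsilon of the true mean, for a statistic whose
    values span value_range."""
    return math.sqrt((value_range ** 2) * math.log(1.0 / delta) / (2.0 * n))

def should_split(gain_best: float, gain_second: float,
                 value_range: float = 1.0, delta: float = 1e-6,
                 n: int = 1000) -> bool:
    """VFDT-style test: split a leaf when the best attribute's gain beats
    the second-best by more than the Hoeffding bound epsilon."""
    eps = hoeffding_bound(value_range, delta, n)
    return (gain_best - gain_second) > eps

# A clear winner after 1000 samples triggers a split; a near-tie does not.
print(should_split(0.30, 0.10))   # large gain gap -> split
print(should_split(0.30, 0.27))   # small gain gap -> keep observing
```

In a vertical federated setting, the per-attribute statistics feeding this gain computation are exactly what the parties must exchange, which is what motivates the paper's encryption and regional-counting mechanisms.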
