Abstract
This paper addresses the distributed online bandit linear regression problem with privacy protection, in which the training data are spread across a multi-agent network. Each node identifies a linear predictor to fit the training data and incurs a squared loss on each round. The goal is to minimize the regret, which measures the difference in accumulated loss between the online linear predictor and the optimal offline linear predictor. Moreover, a differential privacy strategy is adopted to prevent an adversary from inferring the parameter vector of any node. Two efficient differentially private distributed online regression algorithms are developed for the cases of one-point and two-point bandit feedback. Our analysis shows that the developed algorithms achieve ε-differential privacy, and we establish regret upper bounds of O(K^{3/4}) and O(√K) for one-point and two-point bandit feedback, respectively, where K is the time horizon. We also show that there is a tradeoff between the algorithms' privacy level and convergence. Finally, the performance of the proposed algorithms is validated by a numerical example.
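The abstract contrasts one-point and two-point bandit feedback. As a rough illustration (not the paper's actual algorithms, whose node-level details are not given here), the sketch below shows the standard spherical-smoothing gradient estimators for each feedback model and a gradient step with Laplace noise, a common route to ε-differential privacy; all function names, step sizes, and noise scales are illustrative assumptions.

```python
import numpy as np

def one_point_grad_est(loss, x, delta, rng):
    """One-point bandit feedback: a single loss query at a perturbed point
    yields an unbiased estimate of the gradient of a smoothed loss.
    Its high variance is what leads to weaker regret bounds."""
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)                # uniform direction on unit sphere
    return (d / delta) * loss(x + delta * u) * u

def two_point_grad_est(loss, x, delta, rng):
    """Two-point bandit feedback: two loss queries per round give a
    much lower-variance estimate, enabling sharper regret."""
    d = x.shape[0]
    u = rng.standard_normal(d)
    u /= np.linalg.norm(u)
    return (d / (2 * delta)) * (loss(x + delta * u) - loss(x - delta * u)) * u

def dp_update(x, grad, step, noise_scale, rng):
    """Gradient step with Laplace noise injected before the iterate is
    shared with neighbors -- a standard mechanism for differential
    privacy, at the cost of slower convergence (the privacy/regret
    tradeoff the abstract mentions)."""
    noisy_grad = grad + rng.laplace(scale=noise_scale, size=grad.shape)
    return x - step * noisy_grad
```

For a scalar quadratic loss the two-point estimator is exact: with f(x) = x², the finite difference over ±δu recovers the true gradient 2x regardless of the sampled direction, which illustrates why two queries per round are so much more informative than one.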