Abstract
In this article, we consider a variant of the classical DeGroot model of opinion propagation with random interactions, in which a prescribed subset of agents is amenable to a control parameter. The remaining agents are either stubborn or neither stubborn nor amenable to control. We map the problem to a shortest path problem in which the control parameter is coupled across the controlled nodes through a common resource constraint. Because of this coupling, the problem is not amenable to a pure dynamic programming approach, and the classical reinforcement learning schemes built on dynamic programming cannot be applied to maximize the long-run average influence. We therefore treat it as a parametric optimization problem rather than a control problem and use a nonclassical policy gradient scheme. We analyze the performance of this scheme theoretically and through numerical experiments. We also consider the case in which only certain interactions between agents are observed.