Abstract

In this article, we consider a variant of the classical DeGroot model of opinion propagation with random interactions, in which a prescribed subset of agents is amenable to a control parameter. The remaining agents are either stubborn or neither stubborn nor amenable to control. We map the problem to a shortest path problem in which the control parameter is coupled across controlled nodes through a common resource constraint. Hence, the problem is not amenable to a pure dynamic programming approach, and the classical reinforcement learning schemes based on dynamic programming cannot be applied to maximize the average influence in the long run. We instead view it as a parametric optimization problem rather than a control problem and use a nonclassical policy gradient scheme. We analyze the performance of this scheme theoretically and through numerical experiments. We also consider the situation in which only certain interactions between agents are observed.
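
For readers unfamiliar with the setting, the following is a minimal sketch of DeGroot-style opinion averaging with random pairwise interactions and stubborn agents. It is an illustration only and does not reproduce the article's model or its control scheme: the function name `degroot_gossip`, the pairwise update rule, the averaging weight of 0.5, and the three-agent line graph are all assumptions chosen for simplicity, and the controlled agents and resource constraint are omitted.

```python
import random


def degroot_gossip(opinions, neighbors, stubborn, steps, weight=0.5, seed=0):
    """Asynchronous DeGroot-style averaging (hypothetical helper, not from the article).

    At each step one non-stubborn agent is picked uniformly at random, listens
    to one uniformly chosen neighbor, and moves its opinion toward that
    neighbor's opinion. Stubborn agents never change their opinion.
    """
    rng = random.Random(seed)
    x = list(opinions)
    free = [i for i in range(len(x)) if i not in stubborn]
    for _ in range(steps):
        i = rng.choice(free)            # agent that updates
        j = rng.choice(neighbors[i])    # neighbor it listens to
        x[i] = (1 - weight) * x[i] + weight * x[j]
    return x


# Assumed toy network: line graph 0 - 1 - 2, agent 0 is stubborn with opinion 1.0.
neighbors = {0: [1], 1: [0, 2], 2: [1]}
final = degroot_gossip([1.0, 0.0, 0.0], neighbors, stubborn={0}, steps=2000)
print(final)
```

In this toy network the single stubborn agent eventually pulls every other agent toward its own opinion, which is the kind of long-run influence the abstract refers to.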
