Abstract
This article studies a social network in which influential personalities collaborate positively among themselves to learn an underlying truth over time, but may mislead their followers into believing false information. Most existing work that studies leader-follower relationships in a social network models the network as a graph and applies non-Bayesian learning to train the weakly connected agents to learn the truth. Although this approach is popular, it assumes that the truth, otherwise called the true state, is time-invariant. This assumption is impractical in social networks, where streams of information are released and updated every second, making the true state arbitrarily time-varying. This article therefore improves on existing work by introducing online reinforcement learning into the graph-theoretic framework. Specifically, a multi-armed bandit algorithm is proposed and used to train the weakly connected agents to converge to the most stable state over time. Weakly connected agents trained with the proposed algorithm converge 66% more slowly on average than strongly connected agents trained with the state-of-the-art algorithm, because weakly connected agents are harder to train. However, the convergence speed of these weakly connected agents can be improved by approximately 50% on average by fine-tuning the learning rate of the proposed algorithm. Finally, the sublinearity of the regret bound for the proposed algorithm is compared to that of the state-of-the-art algorithm for strongly connected networks.
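The proposed algorithm itself is not reproduced on this page, but the abstract's claims can be grounded in the standard template such methods build on. The sketch below is a minimal Exp3-style non-stochastic multi-armed bandit in Python, where each arm stands for a candidate true state and the learning rate eta is the kind of parameter whose fine-tuning the abstract credits with the roughly 50% speed-up. The function name, signature, and reward encoding are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def exp3(rewards, eta, gamma=0.05, seed=0):
    """Exp3 non-stochastic bandit sketch (Auer et al., 2002).

    rewards: (T, K) array of rewards in [0, 1]; rows are rounds,
             columns are candidate states ("arms"), whose payoffs
             may vary arbitrarily over time.
    eta:     learning rate; convergence speed is sensitive to it.
    gamma:   uniform-exploration mixing weight.
    """
    T, K = rewards.shape
    weights = np.ones(K)
    rng = np.random.default_rng(seed)
    choices = np.empty(T, dtype=int)
    for t in range(T):
        # Mix exponential weights with uniform exploration.
        probs = (1 - gamma) * weights / weights.sum() + gamma / K
        arm = rng.choice(K, p=probs)
        choices[t] = arm
        # Bandit feedback: only the chosen arm's reward is observed;
        # use an importance-weighted (unbiased) estimate of it.
        x_hat = rewards[t, arm] / probs[arm]
        weights[arm] *= np.exp(eta * x_hat)
    return choices
```

Sweeping eta over a grid and measuring how quickly the choices concentrate on one arm is the simplest way to reproduce the kind of learning-rate study the abstract describes.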
Highlights
The social network has grown over the years to become a platform for influential personalities to sell their beliefs to followers within their sphere of influence
It is important to show that the weakly connected agents can converge to the most stable state, albeit at a slower rate than the strongly connected agents
A non-stochastic multi-armed bandit algorithm is proposed, and it is shown by simulation that the beliefs of weakly connected agents can converge to the most stable state
Summary
The social network has grown over the years to become a platform for influential personalities to sell their beliefs to followers within their sphere of influence. Most existing literature that applies graph theory to study truth-learning in social networks assumes that the true state is time-invariant [11], [13], [14], [22]. Where conventional social learning methods fail, online learning has been shown to perform well in predicting a time-varying true state for strongly connected agents; [27] proposed an online learning approach that helps strongly connected agents predict the time-varying true state. Although the multi-armed bandit technique has proven effective for training strongly connected agents to learn the truth when the true state is arbitrarily time-varying [27], [30], it has yet to be applied to weakly connected agents. A preliminary study of this work can be found in [31].
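For readers unfamiliar with the weak connectivity at issue, the sketch below illustrates it with a DeGroot-style linear belief update, one common non-Bayesian rule (an assumption here; the works cited above may use different update rules). Two influencers form a strongly connected subnetwork, while two followers receive information from the influencers without sending any back, which is exactly how misled followers inherit the influencers' belief. The weights, network size, and initial beliefs are illustrative.

```python
import numpy as np

# Row-stochastic combination matrix over a weakly connected network:
# agents 0-1 are influencers who listen only to each other; agents
# 2-3 are followers who hear an influencer but are never heard back.
A = np.array([
    [0.5, 0.5, 0.0, 0.0],   # influencer 0 listens to 0 and 1
    [0.5, 0.5, 0.0, 0.0],   # influencer 1 listens to 0 and 1
    [0.4, 0.0, 0.6, 0.0],   # follower 2 listens to influencer 0
    [0.0, 0.4, 0.0, 0.6],   # follower 3 listens to influencer 1
])

beliefs = np.array([1.0, 0.0, 0.5, 0.5])  # belief in the true state
for _ in range(50):
    beliefs = A @ beliefs  # each agent averages its neighbors' beliefs

# The followers' limiting beliefs are pinned to the influencers'
# consensus (all entries approach 0.5 here), true or false.
print(beliefs)
```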