Abstract
Multiagent systems (such as recommendation systems, ride-sharing platforms, food-delivery systems, and data-routing centers) are areas of rapid technology development that require continual improvement to address inefficiency and the curse of dimensionality. In the paper “Dynamic Programming Principles for Mean-Field Controls with Learning,” we show that multiagent systems with mean-field approximation and learning can be recast as general forms of reinforcement learning problems, where the state variable is replaced by the probability distribution. This reformulation paves the way for developing efficient value-based and policy-based algorithms for mean-field controls with learning. It is also a first step toward the future theoretical development of learning problems with mean-field controls.