Abstract

In Reinforcement Learning (RL), an agent interacts with an environment through a sequence of decisions. The agent receives a reward from the environment according to its decisions and tries to maximize the cumulative reward. RL is used in domains such as production, autonomous driving, business management, education, games, healthcare, natural language processing, and robotics. RL methods process large volumes of data and demand considerable computational power. To speed up these applications, field-programmable gate arrays (FPGAs) are widely employed in the literature. This paper proposes an accelerator, built with high-level synthesis tools, for the Markov Decision Process (MDP) implementation in the public AI-Toolbox library, with the tiger-antelope problem as the use case. Our approach achieves a speedup of more than 7x over the original version.
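To make the kind of computation involved concrete, the following is a minimal sketch of value iteration, a standard way to solve an MDP. It is an illustration only: the abstract does not state which solver is accelerated, and the function name `value_iteration`, the nested-list layout of `T` and `R`, and the two-state toy problem are all assumptions made here, not part of AI-Toolbox or of the proposed design. The triple loop over states, actions, and successor states is the sort of regular, data-parallel work that hardware accelerators typically target.

```python
def value_iteration(T, R, gamma=0.9, tol=1e-6, max_iter=1000):
    """Solve a small MDP by value iteration (illustrative sketch).

    T[s][a][t] is the probability of moving from state s to state t
    under action a; R[s][a][t] is the reward for that transition.
    Returns the state values and a greedy policy.
    """
    n_states = len(T)
    n_actions = len(T[0])
    V = [0.0] * n_states
    policy = [0] * n_states
    for _ in range(max_iter):
        delta = 0.0
        V_new = [0.0] * n_states
        for s in range(n_states):
            # Expected return of each action from state s.
            q = [
                sum(T[s][a][t] * (R[s][a][t] + gamma * V[t])
                    for t in range(n_states))
                for a in range(n_actions)
            ]
            best = max(range(n_actions), key=lambda a: q[a])
            policy[s] = best
            V_new[s] = q[best]
            delta = max(delta, abs(V_new[s] - V[s]))
        V = V_new
        if delta < tol:  # stop once the values no longer change
            break
    return V, policy


# Hypothetical two-state toy problem (not the tiger-antelope problem):
# in state 0, action 0 stays put and action 1 moves to state 1 for a
# reward of 1; state 1 is absorbing with zero reward.
T = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [0.0, 1.0]],
]
R = [
    [[0.0, 0.0], [0.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]
V, policy = value_iteration(T, R)
```

In this toy problem the greedy policy takes action 1 in state 0, since an immediate reward of 1 beats the discounted value of waiting.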
