This paper explores the use of Lisp in a small, modern Reinforcement Learning (RL) project. Hy, a Lisp dialect, is used to incorporate traditional libraries and packages into up-to-date workflows. The project centers on NetHack as an RL domain. The NetHack Learning Environment (NLE) is chosen as the working environment, and the MiniHack sandbox framework, which provides a simple interface for editing and creating levels, is used to build custom environments and tasks for training and evaluating the agent. For the agent model, the project adopts Torchbeast’s PolyBeast, a PyTorch implementation of the IMPALA architecture. Hy is at the forefront of this project and is used wherever possible to accomplish these tasks.