Abstract

Data efficiency remains a major hurdle for deep reinforcement learning. We address this challenge by pretraining an encoder on unlabeled data and then finetuning it on a small amount of task-specific data. To encourage learning representations that capture diverse aspects of the underlying MDP, we employ a combination of latent dynamics modelling and unsupervised goal-conditioned RL. When limited to 100k steps of interaction on Atari games (equivalent to two hours of human experience), our approach significantly outperforms prior work combining offline representation pretraining with task-specific finetuning, and compares favourably with other pretraining methods that require orders of magnitude more data. Our approach shows particular promise when paired with larger models and more diverse, task-aligned observational data, approaching human-level performance and data efficiency on Atari in the best case.
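
The abstract gives no implementation details, but the sketch below illustrates how the two pretraining objectives it names (latent dynamics modelling and unsupervised goal-conditioned RL) might be combined into a single encoder-pretraining step. It is a minimal PyTorch sketch under assumptions not taken from the paper: the Encoder, LatentDynamics, and GoalQ modules, the hindsight goals drawn from within the batch, and the latent-distance reward are all illustrative.

```python
# Minimal sketch of encoder pretraining with (a) latent dynamics modelling and
# (b) an unsupervised goal-conditioned RL objective. Module sizes, names, the
# goal-relabelling scheme, and the reward are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps raw observations to latent representations."""
    def __init__(self, obs_dim=64, latent_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                 nn.Linear(128, latent_dim))
    def forward(self, obs):
        return self.net(obs)

class LatentDynamics(nn.Module):
    """Predicts the next latent state from the current latent and the action."""
    def __init__(self, latent_dim=32, n_actions=6):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + n_actions, 128), nn.ReLU(),
                                 nn.Linear(128, latent_dim))
    def forward(self, z, action_onehot):
        return self.net(torch.cat([z, action_onehot], dim=-1))

class GoalQ(nn.Module):
    """Goal-conditioned Q-head: scores actions for reaching a latent goal."""
    def __init__(self, latent_dim=32, n_actions=6):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2 * latent_dim, 128), nn.ReLU(),
                                 nn.Linear(128, n_actions))
    def forward(self, z, z_goal):
        return self.net(torch.cat([z, z_goal], dim=-1))

def pretrain_step(encoder, dynamics, goal_q, batch, optimizer, gamma=0.99):
    """One gradient step on an unlabeled (obs, action, next_obs) batch.

    Goals are relabelled in hindsight from other transitions in the batch, and
    the goal-conditioned reward is negative latent distance -- a common
    unsupervised choice, assumed here rather than taken from the paper.
    """
    obs, action_onehot, next_obs = batch
    z, z_next = encoder(obs), encoder(next_obs)

    # (a) Latent dynamics modelling: predict the next latent state.
    z_next_pred = dynamics(z, action_onehot)
    dynamics_loss = F.mse_loss(z_next_pred, z_next.detach())

    # (b) Unsupervised goal-conditioned RL: hindsight goals, distance reward.
    z_goal = z_next[torch.randperm(z_next.size(0))].detach()
    reward = -((z_next.detach() - z_goal) ** 2).sum(-1)      # zero at the goal
    q = (goal_q(z, z_goal) * action_onehot).sum(-1)          # Q of taken action
    with torch.no_grad():
        target = reward + gamma * goal_q(z_next, z_goal).max(dim=-1).values
    goal_loss = F.mse_loss(q, target)

    loss = dynamics_loss + goal_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

if __name__ == "__main__":
    # Tiny smoke test with random data standing in for unlabeled transitions.
    enc, dyn, gq = Encoder(), LatentDynamics(), GoalQ()
    opt = torch.optim.Adam([*enc.parameters(), *dyn.parameters(), *gq.parameters()], lr=3e-4)
    batch = (torch.randn(8, 64),
             F.one_hot(torch.randint(0, 6, (8,)), 6).float(),
             torch.randn(8, 64))
    print(pretrain_step(enc, dyn, gq, batch, opt))
```

In the setup described by the abstract, the pretrained encoder would then be finetuned, together with a task-specific head, on the limited 100k-step interaction budget.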
