In this letter, we propose a low-power deep reinforcement learning (DRL) SoC that supports CNN, learning-optimized RNN, and fully connected layers. Adaptive reuse of weights and inputs, together with data encoding/decoding techniques, reduces the power consumption and peak memory bandwidth of DRL processing by 31% and 41%, respectively. The 65-nm, 16-mm² chip achieves a peak energy efficiency of 2.16 TFLOPS/W at 0.73 V and a peak throughput of 204 GFLOPS at 1.1 V with 16-bit data.
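As a reading aid, the peak efficiency figure can be restated as energy per operation. The short sketch below is only a unit conversion of the 2.16 TFLOPS/W value quoted above; the function name `energy_per_flop_pj` is illustrative and is not taken from the letter. The efficiency and throughput peaks are reported at different supply voltages, so the sketch does not attempt to derive chip power from them.

```python
def energy_per_flop_pj(tflops_per_watt: float) -> float:
    """Convert an efficiency in TFLOPS/W to energy per FLOP in picojoules.

    1 TFLOPS/W = 1e12 FLOP per joule, so one FLOP costs 1e-12 J = 1 pJ.
    The conversion is therefore a plain reciprocal.
    """
    if tflops_per_watt <= 0:
        raise ValueError("efficiency must be positive")
    return 1.0 / tflops_per_watt


# Peak efficiency quoted in the abstract (measured at 0.73 V).
peak_efficiency = 2.16  # TFLOPS/W
print(f"{energy_per_flop_pj(peak_efficiency):.3f} pJ/FLOP")
```

At the quoted peak this works out to a little under half a picojoule per 16-bit floating-point operation.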