Artificial intelligence at the edge is a growing research field. In this paper, we propose a novel re-encoding scheme for reducing the size of the weights of deep neural networks (DNNs). The proposed scheme combines Booth encoding with power-of-two (PO2) quantization to enable very low-energy computation during neural network inference with minimal loss in classification accuracy. We demonstrate its advantages by evaluating a convolutional neural network (CNN) and a linear neural network on the proposed Extended Exact Multiplier and the proposed PO2 Multiplier. The proposed PO2 quantization and re-encoding method reduces the model size of the CNN by 30.77% and that of the linear neural network by 49.86%. Furthermore, our multipliers reduce the inference energy of the CNN by 50.6% and of the linear neural network by 90.1%. The PO2 Multiplier targets sensor-end computation of the linear neural network: it occupies 77.32% less area than an exact Booth multiplier and reduces the inference energy consumption of the linear neural network by 93.2% compared with the unmodified exact multiplier. The proposed scheme can improve inference energy consumption for most Booth multipliers with only minor modifications to the re-encoding signal arrangements. We also demonstrate that the proposed re-encoding scheme, paired with the proposed multipliers, outperforms all existing designs in resource utilization with minimal impact on the inference accuracy of the neural networks.
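To make the underlying idea concrete, the sketch below shows one common form of PO2 quantization: each weight is replaced by the nearest power of two (sign preserved), so that multiplications at inference time reduce to shifts. This is a minimal illustration under assumed conventions; the exponent bounds `min_exp`/`max_exp` and the rounding rule are illustrative choices, not the specific quantizer or Booth re-encoding of this paper.

```python
import numpy as np

def po2_quantize(w, min_exp=-7, max_exp=0):
    """Quantize each weight to a nearby power of two, keeping the sign.

    Rounds log2(|w|) to the nearest integer exponent and clips it to
    [min_exp, max_exp]; zero weights stay zero. The bounds here are
    illustrative, not taken from the paper.
    """
    w = np.asarray(w, dtype=np.float64)
    sign = np.sign(w)
    mag = np.abs(w)
    nonzero = mag > 0  # avoid log2(0)
    exp = np.zeros_like(mag)
    exp[nonzero] = np.clip(np.round(np.log2(mag[nonzero])), min_exp, max_exp)
    return np.where(nonzero, sign * np.exp2(exp), 0.0)

weights = np.array([0.3, -0.7, 0.05, 0.0])
print(po2_quantize(weights))  # → [ 0.25   -0.5     0.0625  0.    ]
```

Because every quantized weight is ±2^k, a hardware multiplier only needs a barrel shifter and a sign flip per weight, which is the source of the area and energy savings the abstract reports.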