Deep learning models implemented with memristors offer high scalability and energy efficiency, promising a compact and efficient computing architecture for resource-constrained edge computing applications. These technologies integrate data storage and computation in a highly parallel memristor crossbar array architecture. However, the significant variations arising from the inherent physical randomness of memristors cause a large performance degradation in deep learning models. Maintaining performance under these variations in turn incurs extensive energy costs and long transfer times during deployment. In this paper, for the first time, we propose a unified architecture that consists of a Bayesian-based training method and a lightweight transfer scheme. The proposed architecture tackles the robustness, energy, and time consumption issues caused by memristor variations. Our experimental results show that our architecture can double the speed and energy efficiency of deploying deep learning models.