Abstract Regional climate models (RCMs) are essential tools for simulating and studying regional climate variability and change. However, their high computational cost limits the production of comprehensive ensembles of regional climate projections covering multiple scenarios and driving global climate models (GCMs) across regions. RCM emulators based on deep learning models have recently been introduced as a cost-effective and promising alternative that requires only short RCM simulations to train the models. Evaluating their transferability to different periods, scenarios, and GCMs is therefore a pivotal and complex task in which the inherent biases of both GCMs and RCMs play a significant role. Here, we focus on this problem by considering the two emulation approaches introduced in the literature as perfect and imperfect, which we refer to here as perfect prognosis (PP) and model output statistics (MOS), respectively, following well-established downscaling terminology. In addition to standard evaluation techniques, we expand the analysis with methods from the field of explainable artificial intelligence (XAI) to assess the physical consistency of the empirical links learnt by the models. We find that both approaches are able to emulate certain climatological properties of RCMs for different periods and scenarios (soft transferability), but the consistency of the emulation functions differs between approaches. Whereas PP learns robust and physically meaningful patterns, MOS results are GCM dependent and lack physical consistency in some cases. Both approaches face problems when transferring the emulation function to other GCMs (hard transferability) because of GCM-dependent biases, which limits their applicability for building RCM ensembles. We conclude by giving prospects for future applications.
Significance Statement Regional climate model (RCM) emulators are an emerging, cost-effective approach for generating comprehensive ensembles of regional climate projections. Promising results have recently been obtained using deep learning models. However, their potential to capture regional climate dynamics and to emulate other periods, emission scenarios, or driving global climate models (GCMs) remains an open issue that affects their practical use. This study explores the potential of current emulation approaches, incorporating new explainable artificial intelligence (XAI) evaluation techniques to assess the reliability and transferability of the emulators. Our findings show that the global and regional model biases involved in each approach play a key role in transferability. Based on these results, we provide some prospects for potential applications of these models in challenging problems.