To improve communication stability, a growing number of wireless devices transmit multi-modal signals during operation. Here, 'modal' refers to the signal waveform or signal type. This poses challenges for traditional specific emitter identification (SEI) systems: signals of unknown modes require additional open-set mode identification, different modes require different radio frequency fingerprint (RFF) extractors and SEI classifiers, and it is hard to collect and label all signals. To address these issues, we propose an enhanced SEI system consisting of a universal RFF extractor, denoted as multiple synchrosqueezed wavelet transformation of energy unified (MSWTEu), and a new generative adversarial network for feature transferring (FTGAN). MSWTEu extracts uniform RFF features from signals of different modes, and FTGAN transfers the features of different modes to a recognized distribution in an unsupervised manner. We also propose a novel training strategy that achieves emitter identification across multi-modal signals using a single clustering method. To evaluate the system, we built a hybrid dataset consisting of multi-modal signals transmitted by various emitters, as well as a complete civil air traffic control radar beacon system (ATCRBS) dataset for airplanes. The experiments show that our enhanced SEI system can resolve the SEI problems that arise across signal modes. It directly achieves 86% accuracy in cross-modal emitter identification using an unsupervised classifier, and simultaneously obtains 99% accuracy in open-set recognition of the signal mode.