Astragalus is a widely used traditional Chinese medicine material, and samples from different origins, which differ in quality, price and other factors, are easily confused. This article describes a novel method for rapidly tracing and identifying the origin of Astragalus through the joint application of an electronic tongue (ET) and an electronic eye (EE) combined with a lightweight convolutional neural network (CNN)-transformer model. First, ET and EE systems were employed to measure the taste fingerprints and appearance images, respectively, of different Astragalus samples. Three spectral transform methods (the Markov transition field, short-time Fourier transform and recurrence plot) were used to convert the ET signals into 2D spectrograms. The resulting ET spectrograms were then fused with the EE images to obtain multimodal information. A lightweight hybrid model, termed GETNet, was designed to perform pattern recognition on the fused Astragalus information. The proposed model employs an improved transformer module and an improved Ghost bottleneck as its backbone network, combining the complementary strengths of CNN and transformer architectures for local and global feature representation. The Ghost bottleneck was further optimized using a channel attention technique, which improved the model's feature extraction. The experiments indicate that the proposed data fusion strategy based on ET and EE devices achieves higher recognition accuracy than either sensing device used alone. The proposed method achieved high precision (99.1%) and recall (99.1%), providing a novel approach for rapidly identifying the origin of Astragalus, and it holds promise for application to other types of Chinese herbal medicines. © 2024 Society of Chemical Industry.
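
For readers unfamiliar with how a 1D sensor signal can be turned into a 2D image, the following is a minimal sketch of two of the transforms named above, the recurrence plot and the Markov transition field. It is illustrative only and is not the authors' implementation: the function names, the threshold `eps`, the bin count `n_bins` and the use of equal-width bins (quantile bins are more common in practice) are all assumptions made for this example.

```python
def recurrence_plot(signal, eps):
    """Binary recurrence matrix: R[i][j] = 1 when |x_i - x_j| <= eps.

    Assumption: no time-delay embedding; each sample is its own state.
    """
    return [[1 if abs(a - b) <= eps else 0 for b in signal] for a in signal]


def markov_transition_field(signal, n_bins):
    """MTF[i][j] = estimated probability of a one-step transition from
    the bin containing x_i to the bin containing x_j.

    Assumption: equal-width amplitude bins, chosen here for simplicity.
    """
    lo, hi = min(signal), max(signal)
    width = (hi - lo) / n_bins or 1.0  # avoid zero width for a flat signal
    bins = [min(int((x - lo) / width), n_bins - 1) for x in signal]

    # Count first-order transitions between consecutive samples.
    counts = [[0] * n_bins for _ in range(n_bins)]
    for a, b in zip(bins, bins[1:]):
        counts[a][b] += 1

    # Normalise each row into transition probabilities.
    probs = []
    for row in counts:
        total = sum(row)
        probs.append([c / total if total else 0.0 for c in row])

    # Spread the transition probabilities over the time axis.
    return [[probs[a][b] for b in bins] for a in bins]


# Hypothetical toy signal standing in for one ET sensor channel.
toy_signal = [0.0, 1.0, 0.0, 1.0]
rp = recurrence_plot(toy_signal, eps=0.1)
mtf = markov_transition_field(toy_signal, n_bins=2)
```

Each function returns a square matrix with one row and column per time step, so it can be rendered as a grayscale image and, in a pipeline like the one described, combined with an appearance image before classification.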