Air pollution and population aging have driven a year-by-year rise in the incidence of lung disease, including lung disease among the elderly. At the same time, the outbreak of COVID-19 has strained the medical system, placing higher demands on the prevention of lung disease and on diagnostic efficiency. Artificial intelligence can ease this burden by analyzing lung sound signals to help diagnose lung disease. However, existing lung sound recognition models struggle to capture the correlation between time and frequency information: convolutional neural networks have difficulty capturing multi-scale features across different resolutions, and feature fusion ignores the differing influence of time and frequency features. To address these issues, a lung sound recognition model based on a multi-resolution interleaved net and time-frequency feature enhancement is proposed. It consists of a heterogeneous dual-branch time-frequency feature extractor (TFFE), a time-frequency feature enhancement module based on branch attention (FEBA), and a fusion semantic classifier based on semantic mapping (FSC). TFFE extracts the time and frequency information of lung sounds independently, through a multi-resolution interleaved net and a Transformer, while maintaining the correlation between time-frequency features. FEBA uses branch attention to account for the differing influence of time and frequency information on the prediction result. The proposed model achieves an accuracy of 91.56% on the combined dataset, an improvement of over 2.13% compared with other models.
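The abstract does not give the exact formulation of FEBA, so the following is only a minimal sketch of the general idea of branch attention: each branch's feature vector receives a scalar score, the scores are normalized across the two branches with a softmax, and each branch is reweighted before fusion. The function name `branch_attention`, the scoring vector `w`, the feature dimension, and fusion by concatenation are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def branch_attention(time_feat, freq_feat, w):
    """Reweight time and frequency features with per-branch attention.

    time_feat, freq_feat: arrays of shape (batch, d), one per branch.
    w: hypothetical scoring vector of shape (d,); in a trained model
       this would be learned (assumption, not taken from the paper).
    Returns the fused features (batch, 2 * d) and the branch weights (batch, 2).
    """
    branches = np.stack([time_feat, freq_feat], axis=1)  # (batch, 2, d)
    scores = branches @ w                                # (batch, 2)
    weights = softmax(scores, axis=1)                    # sums to 1 per sample
    enhanced = branches * weights[..., None]             # scale each branch
    fused = enhanced.reshape(branches.shape[0], -1)      # concatenate branches
    return fused, weights


# Toy usage with random features standing in for the two branch outputs.
rng = np.random.default_rng(0)
t = rng.normal(size=(4, 8))
f = rng.normal(size=(4, 8))
w = rng.normal(size=8)
fused, weights = branch_attention(t, f, w)
```

Because the weights are normalized per sample, the model can lean more on the time branch for some recordings and more on the frequency branch for others, rather than treating both equally during fusion.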