Structural vibration identification, an important task in civil engineering, relies on processing measured data from structural monitoring. However, predicting the response at unsensed locations from limited sensor data can be challenging. Deep learning (DL) methods have shown promise in extracting features from vibration data and generating such data, but they struggle to capture the underlying physical laws and dynamic equations that govern structural vibration. This paper presents a novel framework, called physics-informed deep learning (PIDL), that combines deep generative networks with structural dynamics knowledge to address these challenges. The PIDL framework consists of a data-driven convolutional neural network for structural excitation identification and a physics-informed variational autoencoder for explicit time-domain (ETD) vibration analysis using the generated unit impulse response (UIR) signal of the measured structure. The proposed framework is evaluated on a benchmark structure for structural health monitoring, demonstrating its effectiveness in extracting physics-related dynamic features and in accurately identifying excitation signals and latent physics parameters across different damage patterns. Additionally, incorporating an ETD method-aided convolution function into the loss function aligns the generated UIR signals with the dynamic properties of the measured structure. Compared with conventional DL-based vibration analysis methods, the PIDL framework offers improved accuracy and reliability by integrating structural dynamics knowledge. This study contributes to the advancement of structural vibration identification and showcases the potential of the PIDL framework in civil structure monitoring applications. This article is part of the theme issue 'Physics-informed machine learning and its structural integrity applications (Part 2)'.
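
To illustrate the idea of an ETD convolution term in a loss function, the following minimal sketch computes a structural response as the discrete convolution (Duhamel integral) of a UIR with an excitation, and penalizes the mismatch with a measured response. It is not the paper's implementation: the single-degree-of-freedom oscillator, the mean-squared-error form of the loss, and all function and parameter names (`sdof_unit_impulse_response`, `etd_response`, `convolution_consistency_loss`) are assumptions chosen for illustration only.

```python
import numpy as np


def sdof_unit_impulse_response(mass, stiffness, damping_ratio, dt, n_steps):
    """Analytical UIR of an underdamped single-degree-of-freedom oscillator.

    Assumption: a 1-DOF system stands in for the measured structure.
    """
    omega_n = np.sqrt(stiffness / mass)                   # natural frequency (rad/s)
    omega_d = omega_n * np.sqrt(1.0 - damping_ratio**2)   # damped frequency (rad/s)
    t = np.arange(n_steps) * dt
    return np.exp(-damping_ratio * omega_n * t) * np.sin(omega_d * t) / (mass * omega_d)


def etd_response(uir, excitation, dt):
    """Discrete Duhamel integral: y[k] = dt * sum_j uir[k - j] * excitation[j]."""
    n = len(excitation)
    return dt * np.convolve(uir, excitation)[:n]


def convolution_consistency_loss(generated_uir, excitation, measured_response, dt):
    """Hypothetical physics term: MSE between the convolved and measured responses."""
    predicted = etd_response(generated_uir, excitation, dt)
    return float(np.mean((predicted - measured_response) ** 2))


if __name__ == "__main__":
    dt, n_steps = 0.01, 500
    rng = np.random.default_rng(0)
    excitation = rng.standard_normal(n_steps)  # synthetic excitation, not real data

    true_uir = sdof_unit_impulse_response(1.0, 100.0, 0.05, dt, n_steps)
    measured = etd_response(true_uir, excitation, dt)

    # A UIR with the wrong stiffness should give a larger loss than the true one.
    wrong_uir = sdof_unit_impulse_response(1.0, 80.0, 0.05, dt, n_steps)
    print(convolution_consistency_loss(true_uir, excitation, measured, dt))
    print(convolution_consistency_loss(wrong_uir, excitation, measured, dt))
```

In a training setting, a term of this kind would be added to the data-driven loss so that a generated UIR is rewarded only when convolving it with the identified excitation reproduces the measured response.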