Recognizing the cause of a voltage sag is the basis for formulating mitigation plans and assigning liability for incidents. Existing recognition methods based on physical characteristics face new challenges in accuracy, adaptability, and algorithmic efficiency. Deep learning is a class of methods that learn representations directly from data; its autonomous feature learning can overcome the information loss and limited generalization ability of methods based on physical characteristics. The long short-term memory (LSTM) network retains memory of earlier inputs and is therefore well suited to learning the features of time series data. Compared with a standard LSTM, a bidirectional LSTM takes both historical and future information into account, which gives it stronger processing capability for time series data. An attention mechanism can additionally highlight the key influencing factors in the time series and improve the recognition accuracy of the model. For transient sag time series data, this paper proposes a multi-layer structure based on bidirectional LSTM and an attention mechanism for classification and recognition. Experiments on simulated and measured data show that the model has good recognition ability and good noise robustness in recognizing voltage sag causes, and can be reliably applied in practical engineering.
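To make the role of the attention mechanism concrete, the following is a minimal sketch of attention pooling over a sequence of hidden states, such as those a bidirectional LSTM would emit at each time step. It is an illustration only, not the model proposed in this paper: the dot-product scoring, the `query` vector (which would be learned in practice), the function names, and the toy values are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each time step by its relevance and return the weighted sum.

    hidden_states: list of T vectors (assumed here to be per-step BiLSTM outputs).
    query: hypothetical vector of the same dimension; dot-product scoring is
           an assumption chosen for simplicity.
    """
    # One relevance score per time step.
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Normalise the scores into attention weights that sum to 1.
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the hidden states.
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return context, weights


# Toy sequence of three time steps; the middle step stands in for the
# transient part of a sag waveform (values are made up for illustration).
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
context, weights = attention_pool(hidden_states, query)
```

In this toy example the middle time step receives the largest weight, so the context vector passed to a classifier is dominated by that step. This is the sense in which attention highlights the key influencing factors in the time series.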