The automated classification of hazardous events in Air Traffic Management (ATM) must handle concise texts dense with specialized terminology. The problem is currently addressed with machine learning text classification methods; however, some of these methods ignore domain knowledge, while others depend on it too heavily. This research aims to develop a process for integrating low-cost, text-based domain knowledge into classification in order to mitigate domain shift. To this end, the Wide and Deep Bidirectional Encoder Representations from Transformers (WD-BERT) model is proposed, built around a knowledge-powered attention mechanism. WD-BERT's Wide Attention module scores the compatibility between the input text and domain knowledge to determine class probabilities, while its Deep Attention module extracts contextual features under the guidance of domain knowledge. In addition, a text mining method is used to extract domain knowledge from ATM regulation texts. The model is trained and evaluated on a dataset derived from the China ATM Hazard Source Database. It achieves a multi-label classification accuracy of 80.24% and an F1-micro score of 0.88, outperforming state-of-the-art comparison models. Breakdown analyses indicate that the attention mechanism enables WD-BERT to integrate domain knowledge, allowing it to excel on inputs involving complex sentences, multiple terms, and domain shift. The innovation of this study lies in the proposed domain knowledge-powered attention mechanism, which allows simply organized domain knowledge to guide ATM hazardous event classification effectively and to alleviate the impact of domain shift. The method can also be applied to other vertical domains that lack knowledge graphs.
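The two reported metrics measure different things, which may explain why the accuracy figure sits below the F1-micro score. The sketch below is illustrative only: it assumes that "multi-label classification accuracy" means exact-match (subset) accuracy, which the abstract does not state, and the label names and predictions are hypothetical rather than drawn from the China ATM Hazard Source Database.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label set equals the true set exactly.

    Assumption: this is one common reading of multi-label accuracy;
    the abstract does not specify which definition was used.
    """
    matches = sum(set(t) == set(p) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives and false negatives
    over all samples and labels, then compute a single F1 score."""
    tp = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        t, p = set(t), set(p)
        tp += len(t & p)   # labels correctly predicted
        fp += len(p - t)   # labels predicted but not true
        fn += len(t - p)   # true labels that were missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical hazard labels for four event reports (not real data).
y_true = [
    {"equipment", "procedure"},
    {"weather"},
    {"human_factor", "equipment"},
    {"procedure"},
]
y_pred = [
    {"equipment", "procedure"},   # exact match
    {"weather"},                  # exact match
    {"human_factor"},             # one label missed
    {"procedure", "equipment"},   # one spurious label
]

acc = subset_accuracy(y_true, y_pred)
f1 = micro_f1(y_true, y_pred)
print(f"subset accuracy = {acc:.2f}, micro-F1 = {f1:.2f}")
```

Under this reading, a prediction that gets one of two labels right counts as a full miss for accuracy but still earns partial credit in micro-F1, so micro-F1 is typically the higher of the two numbers.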