Spoken language understanding (SLU) is an essential component of a task-oriented dialogue system and mainly comprises two tasks: intent detection and slot filling. Some existing approaches obtain enhanced semantic representations by modeling the correlation between the two tasks. However, these methods yield little improvement when applied on top of BERT, since BERT has already learned rich semantic features. In this paper, we propose a BERT-based model with a probability-aware gate mechanism, called PAGM (Probability-Aware Gated Model). PAGM learns the correlation between intent and slot from the perspective of probability distributions, explicitly using intent information to guide slot filling. In addition, to incorporate BERT with the probability-aware gate efficiently, we design a stacked fine-tuning strategy: it introduces an intermediate stage before target-model training, which gives BERT a better initialization for the final training. Experiments show that PAGM achieves significant improvements on two benchmark datasets and outperforms previous state-of-the-art results.
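The probability-aware gate described above can be sketched roughly as follows: the intent classifier's probability distribution is projected into the token-hidden space and used to gate each token's slot-filling representation. All dimensions, weight names, and the exact gating formula here are illustrative assumptions for exposition, not the paper's specification:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical sizes: 7 intent classes, 16-dim hidden states, 5 tokens.
num_intents, hidden, seq_len = 7, 16, 5

# Intent probability distribution from the intent-detection head.
intent_logits = rng.normal(size=num_intents)
intent_probs = softmax(intent_logits)                # shape (num_intents,)

# Token-level hidden states for slot filling (e.g. BERT encoder outputs).
slot_hidden = rng.normal(size=(seq_len, hidden))

# Hypothetical learned parameters: project the intent distribution into
# the hidden space, then compute a per-token gate from both features.
W_p = rng.normal(size=(num_intents, hidden)) * 0.1
W_g = rng.normal(size=(2 * hidden, hidden)) * 0.1

intent_feat = intent_probs @ W_p                     # shape (hidden,)
intent_tiled = np.tile(intent_feat, (seq_len, 1))    # broadcast over tokens
gate = sigmoid(np.concatenate([slot_hidden, intent_tiled], axis=-1) @ W_g)

# Intent-guided slot representations: the gate decides, per dimension,
# how much of each token's feature passes to the slot-tagging layer.
gated_slot = gate * slot_hidden

print(gated_slot.shape)  # (5, 16)
```

Because the gate consumes the intent *probabilities* rather than an intermediate hidden vector, the guidance signal stays interpretable and directly reflects the intent classifier's confidence.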