The Transformer is a deep learning model built on the self-attention mechanism and is widely used to solve sequence-to-sequence problems, including speech recognition. Since it was proposed, the Transformer has been developed substantially and has made great progress in the field of speech recognition. Speech recognition is a sequence-to-sequence task that transforms human speech into text. The Recurrent Neural Network (RNN) is another model that can be applied to speech recognition, and both the RNN and the Transformer use an encoder-decoder architecture to solve sequence-to-sequence problems. However, the RNN is a recurrent model that is weak in parallel training, and it does not perform as well on sequence-to-sequence tasks as the Transformer, which is non-recurrent. This paper mainly analyzes the accuracy of the Transformer and the RNN in automatic speech recognition. It shows that the Transformer performs better than the RNN in speech recognition, achieving higher accuracy, and it therefore provides evidence that the Transformer is an efficacious approach to automatic speech recognition as well as a practical substitute for traditional methods such as the RNN.
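To make the contrast with the recurrent model concrete, the following is a minimal sketch of single-head scaled dot-product self-attention, the core operation of the Transformer. It uses only NumPy; the array names, dimensions, and random inputs are illustrative assumptions, not details from this paper. The key point is that all time steps are processed in one matrix product, whereas an RNN must step through the sequence position by position.

```python
# Minimal sketch of scaled dot-product self-attention (single head),
# assuming NumPy only; all names and sizes here are illustrative.
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Compute self-attention for a sequence x of shape (seq_len, d_model).

    Unlike an RNN, every position attends to every other position in a
    single matrix product, so all time steps are handled in parallel.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v           # queries, keys, values
    scores = q @ k.T / np.sqrt(k.shape[-1])       # scaled dot-product scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over key positions
    return weights @ v                            # weighted sum of value vectors

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                           # e.g. 5 acoustic frames
x = rng.standard_normal((seq_len, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_model)) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)                                  # one output vector per frame
```

In a full Transformer encoder this operation is repeated with multiple heads and stacked with feed-forward layers, but even this sketch shows why training parallelizes so well compared with a recurrent model.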