Sleep monitoring typically requires the uncomfortable and expensive polysomnography (PSG) test to determine sleep stages. Body movement and cardiopulmonary signals provide an alternative way to perform sleep staging. In recent years, long short-term memory (LSTM) networks and convolutional neural networks (CNNs) have dominated automatic sleep staging because they learn better representations than traditional machine learning classifiers. However, LSTMs may lose information when handling long sequences, while CNNs are weak at sequence modeling. As an improvement, we develop a hierarchical attention-based deep learning method for sleep staging using body movement, electrocardiogram (ECG), and abdominal breathing signals. We apply multi-head self-attention to model the global context of feature sequences and couple it with a CNN to achieve hierarchical self-attention weight assignment. We evaluate the performance of the method on two public datasets. Our method outperforms other baselines across the three sleep stages, achieving an accuracy of 84.3%, an F1 score of 0.8038, and a Cohen's Kappa coefficient of 0.7036. These results demonstrate the effectiveness of the hierarchical self-attention mechanism for processing feature sequences in sleep stage classification. This paper opens new possibilities for long-term sleep monitoring using movement and cardiopulmonary signals obtained from non-invasive devices. The code can be found at: https://github.com/scutrd/attention-sleep-staging.
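For readers who want a concrete picture of the CNN and multi-head self-attention coupling described above, the following is a minimal sketch, not the authors' released implementation (see the repository linked above for that). All layer sizes, the input feature dimension, and the three-class output head are illustrative assumptions.

```python
# Illustrative sketch: a 1-D CNN for local feature extraction coupled with
# multi-head self-attention for global context over a sequence of per-epoch
# features. Shapes and hyperparameters are assumptions, not the paper's values.
import torch
import torch.nn as nn


class CNNSelfAttentionStager(nn.Module):
    def __init__(self, in_features=16, hidden=64, heads=4, num_classes=3):
        super().__init__()
        # Local pattern extraction within the feature sequence.
        self.cnn = nn.Sequential(
            nn.Conv1d(in_features, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Global context modeling over the whole feature sequence.
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.norm = nn.LayerNorm(hidden)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):
        # x: (batch, seq_len, in_features) — one feature vector per sleep epoch.
        h = self.cnn(x.transpose(1, 2)).transpose(1, 2)  # (batch, seq_len, hidden)
        attn_out, _ = self.attn(h, h, h)                 # self-attention over epochs
        h = self.norm(h + attn_out)                      # residual + layer norm
        return self.head(h)                              # per-epoch class logits


# Usage example: 8 sequences of 20 epochs, 16 hand-crafted features per epoch.
logits = CNNSelfAttentionStager()(torch.randn(8, 20, 16))
print(logits.shape)  # torch.Size([8, 20, 3])
```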