Pain is common among clinical patients, signaling the discomfort that often accompanies necessary treatments, so assessing pain status has become an important task in medical institutions. Recently, various hand-crafted and deep learning methods based on face images have been proposed to estimate pain intensity automatically. However, these approaches usually feed the whole face into the estimation system and exploit little information about the interdependencies of the facial regions involved in forming a pain expression. In this paper, a hierarchical deep network (HDN) that jointly exploits regional and holistic information is proposed via two scale branches. In the HDN, a region-wise branch is designed to extract features from pain-related regions of the face image, while a global-wise branch explores the interdependencies among these pain-related regions. In addition, the global-wise branch employs multi-task learning to detect action units while estimating pain intensity. Finally, the pain estimates of the two branches are fused at the decision level. On current pain estimation benchmarks, the proposed HDN is empirically shown to outperform existing methods, and its essential components are shown to have key influences on the final prediction.
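The two-branch design with decision-level fusion described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the branch functions are placeholders standing in for the learned region-wise and global-wise networks, the number of action units and the fusion weight `w` are assumed values, and NumPy averages substitute for real CNN features.

```python
import numpy as np

def region_wise_branch(face_regions):
    # Placeholder for the region-wise branch: in the paper this is a learned
    # network over pain-related facial regions; here we just average pixels.
    return float(np.mean([region.mean() for region in face_regions]))

def global_wise_branch(face, num_aus=10):
    # Placeholder for the global-wise branch: a multi-task head that predicts
    # pain intensity together with action-unit (AU) activations.
    # num_aus=10 is an assumed count, not taken from the paper.
    pain_score = float(face.mean())
    au_scores = np.full(num_aus, face.std())  # stand-in AU detection outputs
    return pain_score, au_scores

def decision_level_fusion(pain_region, pain_global, w=0.5):
    # Decision-level fusion of the two branches' pain estimates.
    # A simple weighted average with an assumed weight w; the paper's exact
    # fusion rule may differ.
    return w * pain_region + (1.0 - w) * pain_global

# Toy usage with synthetic "images": crops of pain-related regions plus the whole face.
face = np.ones((64, 64)) * 3.0
regions = [np.ones((16, 16)) * 1.0, np.ones((16, 16)) * 1.0]

p_region = region_wise_branch(regions)          # 1.0
p_global, aus = global_wise_branch(face)        # 3.0, AU scores
p_final = decision_level_fusion(p_region, p_global)
print(p_final)  # 2.0 with equal weights
```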