Subcortical brain structure segmentation plays an important role in neuroimaging-based diagnosis and has become a basis of computer-aided diagnosis. Because subcortical brain structures have blurred boundaries and complex shapes, labeling them by hand is time-consuming and subjective, which greatly limits the potential of manual labeling for clinical application. Thus, this paper proposes the sparsification transformer (STF) module for accurate brain structure segmentation. Its self-attention mechanism establishes global dependencies and efficiently extracts the global information of the feature map at low computational complexity. A shallow network exploits the locality of convolutional operations to compensate for low-level detail information, improving the representation capability of the network. In addition, a hybrid residual dilated convolution (HRDC) module is introduced at the bottom layer of the network to extend the receptive field and extract multi-scale contextual information. Meanwhile, an octave convolution edge feature extraction (OCT) module is applied at the skip connections so that the network pays more attention to the edge features of brain structures. The proposed network is trained with a hybrid loss function. Experimental evaluation on two public datasets, IBSR and MALC, shows outstanding performance in terms of both objective and subjective quality.
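To illustrate why dilated convolutions extend the receptive field, the following minimal sketch computes the one-dimensional receptive field of a stack of stride-1 convolutions. The function name `receptive_field`, the 3-wide kernel, and the dilation rates (1, 2, 5) are assumptions chosen for illustration; they are not taken from the HRDC module described here.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stacked stride-1 dilated convolutions.

    Each layer with dilation d widens the field by (kernel_size - 1) * d.
    """
    rf = 1  # a single output position sees itself before any layer is applied
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


# Hypothetical settings, for illustration only.
plain = receptive_field(3, [1, 1, 1])    # three ordinary 3-wide convolutions
dilated = receptive_field(3, [1, 2, 5])  # same depth, increasing dilation

print(plain, dilated)
```

With the same number of layers and parameters, the dilated stack covers 17 positions per axis instead of 7, which is the effect the abstract refers to as extending the receptive field.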