Hyperspectral images (HSIs) contain rich spatial-spectral information, with high spectral correlation and substantial redundancy. Because the background distribution of an HSI is sparse, existing methods generally perform poorly when classifying pixels located at the boundaries between land-cover categories. This is largely because the network is vulnerable to surrounding redundant information during training, leading to inaccurate feature extraction and thus poor generalization. Building on previous work, we propose an HSI classification network called MATNet, which combines multi-attention with a Transformer. The network first uses spatial attention and channel attention to emphasize the most significant parts of the input, then uses a tokenizer module to build semantic-level representations of the different categories of ground objects, and then performs deep semantic feature extraction with a Transformer encoder module. Finally, we design a loss function called Lpoly, which adds a polynomial term to the label-smoothing loss so that the coefficient of the first polynomial term can be tuned to suit different datasets and tasks. We conduct experiments on several well-known HSI datasets and provide visualizations. The results show that the proposed MATNet performs well at extracting the spatial-spectral features of HSIs and at capturing their semantic content.