In hyperspectral image (HSI) classification, convolutional neural networks (CNNs) and transformer architectures have each driven considerable advances. CNNs provide strong local feature representation, whereas transformers excel at learning global features, making the two complementary. Nevertheless, both architectures are limited by static receptive fields, which hinder their accuracy in delineating subtle boundary differences. To mitigate these limitations, we introduce a novel dual-branch adaptive convolutional transformer (DBACT) network featuring an adaptive multi-head self-attention mechanism. The architecture begins with a triadic parallel stem structure for shallow feature extraction and spectral dimensionality reduction. A global branch with adaptive receptive fields then performs high-level global feature extraction, while a local branch with a cross-attention module contributes detailed local information that enriches the global representation. This integration combines the strengths of both branches, capturing representative spatial-spectral features from HSI. Comprehensive evaluation on three benchmark datasets shows that DBACT achieves superior classification performance compared with state-of-the-art models.
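To make the dual-branch layout concrete, the following is a minimal PyTorch sketch of the components named above. It is an illustration under assumed choices, not the paper's implementation: the module names (TriParallelStem, DualBranchBlock, DualBranchNet), kernel sizes, channel widths, and patch geometry are all hypothetical, and the standard nn.MultiheadAttention stands in for the adaptive multi-head self-attention, whose mechanism is not specified here.

```python
# Hypothetical sketch of a dual-branch convolutional transformer for HSI
# patches. All names, sizes, and the use of standard multi-head attention
# in place of the paper's adaptive attention are assumptions.
import torch
import torch.nn as nn


class TriParallelStem(nn.Module):
    """Three parallel convolutions (assumed 3x3/5x5/7x7) that map the HSI
    spectral bands down to a common embedding dimension."""
    def __init__(self, bands: int, dim: int):
        super().__init__()
        self.paths = nn.ModuleList([
            nn.Conv2d(bands, dim, k, padding=k // 2) for k in (3, 5, 7)
        ])
        self.fuse = nn.Conv2d(3 * dim, dim, 1)  # merge the three paths

    def forward(self, x):                       # x: (B, bands, H, W)
        return self.fuse(torch.cat([p(x) for p in self.paths], dim=1))


class DualBranchBlock(nn.Module):
    """Global self-attention branch plus a local convolutional branch;
    local features enrich the global tokens via cross-attention."""
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.local = nn.Sequential(             # local detail branch
            nn.Conv2d(dim, dim, 3, padding=1, groups=dim),
            nn.Conv2d(dim, dim, 1), nn.GELU(),
        )
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)

    def forward(self, x):                       # x: (B, dim, H, W)
        b, c, h, w = x.shape
        tokens = x.flatten(2).transpose(1, 2)   # (B, H*W, dim)
        g = self.norm1(tokens)
        g = tokens + self.self_attn(g, g, g)[0]           # global branch
        l = self.local(x).flatten(2).transpose(1, 2)      # local branch
        # cross-attention: global tokens query the local features
        fused = g + self.cross_attn(self.norm2(g), l, l)[0]
        return fused.transpose(1, 2).reshape(b, c, h, w)


class DualBranchNet(nn.Module):
    def __init__(self, bands: int, classes: int, dim: int = 64, depth: int = 2):
        super().__init__()
        self.stem = TriParallelStem(bands, dim)
        self.blocks = nn.Sequential(*[DualBranchBlock(dim) for _ in range(depth)])
        self.head = nn.Linear(dim, classes)

    def forward(self, x):
        x = self.blocks(self.stem(x))
        return self.head(x.mean(dim=(2, 3)))    # global average pooling


if __name__ == "__main__":
    # e.g. 11x11 spatial patches with 200 spectral bands, 16 classes
    net = DualBranchNet(bands=200, classes=16)
    print(net(torch.randn(2, 200, 11, 11)).shape)  # torch.Size([2, 16])
```

In this sketch, the parallel stem reduces the spectral dimension before any attention is computed, so the quadratic cost of self-attention is paid over a compact embedding rather than the raw band count; the cross-attention step is one plausible way to let the global branch attend to local convolutional detail, as the abstract describes.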