As network infrastructure grows and traffic encryption technologies evolve rapidly, classifying encrypted traffic has become significantly more difficult. Newer encryption methods render conventional traffic classification techniques ineffective at discerning traffic types, posing new challenges for network security and administration. Researchers have therefore turned to machine learning and deep learning models, which have achieved strong results in this domain. However, current deep learning models tend to rely heavily on self-attention mechanisms when processing 2D feature maps. Conventional self-attention computes attention from isolated query-key pairs and neglects the rich contextual information among adjacent keys, which limits its performance in encrypted traffic classification. To address this limitation, we study an approach called CoTNet, which integrates the CoT module into the ResNet model. Specifically, the proposed method replaces the conventional 3x3 convolutions in ResNet with CoT modules, thereby exploiting the contextual associations among input keys more fully. In particular, self-attention is integrated at various levels of the model, yielding a more robust classification model that better captures the inherent patterns and correlated information within the input features. Experimental results on two real-world datasets show that CoTNet outperforms multiple state-of-the-art methods in encrypted traffic classification.