The Infrared Object Tracking (IOT) task aims to locate objects in infrared sequences. Since color and texture information is unavailable in the infrared modality, most existing infrared trackers rely solely on capturing spatial context from the image to enhance feature representation, and rarely exploit other complementary information. To fill this gap, this article proposes a novel Asymmetric Deformable Spatio-Temporal Framework (ADSF) that jointly exploits the shape and temporal cues of the objects. First, an asymmetric deformable cross-attention module is designed to extract shape information by attending to the deformable correlations between distinct frames in an asymmetric manner. Second, a spatio-temporal tracking framework is designed to learn the temporal variation trend of the object during training and to store the template information closest to the current tracking frame at test time. Comprehensive experiments demonstrate that ADSF outperforms state-of-the-art methods on three public datasets. Extensive ablation experiments further confirm the effectiveness of each component in ADSF. Furthermore, a generalization study shows that the proposed method also achieves performance gains in RGB-based tracking scenarios.