In the field of computer vision, fine-grained image retrieval is a highly challenging task because the visual differences among fine-grained subcategories are inherently subtle. In addition, the high-dimensional real-valued features extracted from large-scale fine-grained image datasets slow retrieval and increase storage cost. To address these issues, existing fine-grained image retrieval methods mainly focus on locating more discriminative local regions from which to generate discriminative and compact hash codes. However, they achieve limited retrieval performance due to large quantization errors and the confounding effect of the granularity and context of discriminative parts, i.e., the correct recognition of a fine-grained object is mainly attributed to its discriminative parts together with their context. To learn robust causal features and reduce quantization errors, we propose a deep progressive asymmetric quantization (DPAQ) method based on causal intervention, which learns compact and robust descriptions for the fine-grained image retrieval task. Specifically, we introduce a structural causal model and learn robust causal features via causal intervention for fine-grained visual recognition. We then design a progressive asymmetric quantization layer in the feature embedding space, which preserves semantic information and substantially reduces quantization errors. Finally, we incorporate both the fine-grained image classification and retrieval tasks into an end-to-end deep learning architecture to generate robust and compact descriptions. Experimental results on several fine-grained image retrieval datasets demonstrate that the proposed DPAQ method performs best on the fine-grained image retrieval task and surpasses state-of-the-art fine-grained hashing methods by a large margin.
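The abstract does not specify how the causal intervention is realized. A common instantiation in the literature is backdoor adjustment over a confounder dictionary, and the minimal PyTorch sketch below assumes that setup: the module name `BackdoorAdjustment`, the class-wise confounder dictionary, the uniform prior over confounders, and the additive fusion are all illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BackdoorAdjustment(nn.Module):
    """Hypothetical causal-intervention module via backdoor adjustment.

    Approximates P(Y | do(X)) = sum_z P(Y | X, z) P(z) by attending over a
    fixed confounder dictionary Z (e.g., class-wise mean features) and mixing
    the expected confounder back into the input feature.
    """

    def __init__(self, feat_dim: int, confounder_dict: torch.Tensor):
        super().__init__()
        # confounder_dict: (num_confounders, feat_dim); assumed precomputed,
        # e.g., per-class averages of backbone features over the training set.
        self.register_buffer("z_dict", confounder_dict)
        # Uniform prior P(z) over dictionary entries (an assumption).
        self.register_buffer(
            "prior",
            torch.full((confounder_dict.size(0),), 1.0 / confounder_dict.size(0)),
        )
        self.query = nn.Linear(feat_dim, feat_dim)
        self.key = nn.Linear(feat_dim, feat_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, feat_dim) pooled backbone features.
        q = self.query(x)                                         # (B, D)
        k = self.key(self.z_dict)                                 # (Nz, D)
        attn = F.softmax(q @ k.t() / q.size(-1) ** 0.5, dim=-1)   # (B, Nz)
        # Weight the attention by the prior P(z) and renormalize.
        attn = attn * self.prior
        attn = attn / attn.sum(dim=-1, keepdim=True)
        z = attn @ self.z_dict        # expected confounder given x
        return x + z                  # intervened ("deconfounded") feature
```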
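Likewise, the abstract does not define the progressive asymmetric quantization layer. The sketch below assumes a soft product-quantization design: database features are soft-assigned to learned codewords while queries remain real-valued (the asymmetric part), and the softmax temperature is annealed during training so assignments become progressively harder (the progressive part). All names and hyperparameters here are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftProductQuantizer(nn.Module):
    """Hypothetical asymmetric quantization layer in the embedding space.

    Features are split into `num_books` sub-vectors, each soft-assigned to one
    of `num_words` learned codewords. Annealing `temperature` toward zero over
    training makes the assignment approach hard quantization, which is one way
    to shrink the quantization error gradually.
    """

    def __init__(self, feat_dim: int, num_books: int = 4, num_words: int = 256):
        super().__init__()
        assert feat_dim % num_books == 0
        self.num_books = num_books
        sub_dim = feat_dim // num_books
        # Learnable codebooks: (num_books, num_words, sub_dim).
        self.codebooks = nn.Parameter(torch.randn(num_books, num_words, sub_dim))

    def forward(self, x: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
        # x: (batch, feat_dim) -> (batch, num_books, sub_dim)
        x = x.view(x.size(0), self.num_books, -1)
        # Squared distance of each sub-vector to every codeword: (B, M, K).
        dist = ((x.unsqueeze(2) - self.codebooks.unsqueeze(0)) ** 2).sum(-1)
        # Soft assignment; lower temperature -> closer to hard quantization.
        assign = F.softmax(-dist / temperature, dim=-1)
        quantized = torch.einsum("bmk,mkd->bmd", assign, self.codebooks)
        return quantized.flatten(1)   # soft-quantized database feature
```

Under this assumed design, retrieval is asymmetric: queries keep their real-valued embeddings, database items store only codeword indices, and query-to-database distances are looked up per sub-codebook, so only the database side pays the quantization cost.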