Cross-modal hashing (CMH) is widely used for large-scale multimedia retrieval. However, most existing CMH methods focus on the closed retrieval scenario and ignore real-world environments, where semantics are complex and changing. When data containing objects of new classes arrive, existing CMH methods must retrain the model on all historical training data, rather than only on the new data, to accommodate the new semantics, and the never-ending upload of data to the Internet makes this impractical. In this paper, we devise a deep hashing method called Continual Cross-Modal Hashing with Gradient Aware Memory (CCMH-GAM) for learning binary codes of multi-label cross-modal data with a growing number of categories. CCMH-GAM is a two-step hashing architecture: one hashing network learns to hash the growing semantics of the data, i.e., the labels, into semantic codes, while modality-specific hashing networks learn to map data of each modality into the corresponding semantic codes. Specifically, to preserve the encoding ability for old semantics, a regularization based on accumulating low-storage label-code pairs is designed for the former network. For the modality-specific networks, we propose a memory construction method that approximates the full episodic gradients of all data with a small set of exemplars, and we derive a fast implementation together with an upper bound on the approximation error. Based on this memory, we propose a gradient projection method that theoretically increases the probability that the codes of old data remain unchanged after the model is updated. Extensive experiments on three datasets demonstrate that CCMH-GAM continually learns hash functions and yields state-of-the-art retrieval performance.
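To make the gradient-aware memory idea concrete, the following is a minimal PyTorch sketch of the general scheme the abstract describes: approximating the episodic gradient of old data with a few stored exemplars and projecting away conflicting components of the new-data gradient. The function names (`memory_gradient`, `project`) and the GEM-style projection rule are illustrative assumptions for exposition, not the exact memory construction, fast implementation, or error bound derived in the paper.

```python
import torch

def memory_gradient(model, loss_fn, exemplars):
    """Approximate the episodic gradient over all old data by the average
    gradient over a small set of stored (input, target_code) exemplars.
    Hypothetical helper; the paper's memory construction differs in detail."""
    model.zero_grad()
    loss = sum(loss_fn(model(x), code) for x, code in exemplars) / len(exemplars)
    loss.backward()
    return torch.cat([p.grad.detach().flatten()
                      for p in model.parameters() if p.grad is not None])

def project(grad, mem_grad, eps=1e-12):
    """If the new-data gradient conflicts with the memory gradient
    (negative dot product), remove the conflicting component so that an
    update is less likely to change the binary codes of old data."""
    dot = torch.dot(grad, mem_grad)
    if dot < 0:
        grad = grad - (dot / (mem_grad.pow(2).sum() + eps)) * mem_grad
    return grad
```

In a training step, one would compute the new-data gradient, flatten it, call `project` against the exemplar-based memory gradient, and write the projected values back to the parameter gradients before the optimizer step; this is the generic gradient-episodic-memory pattern, given here only to illustrate the kind of projection the abstract refers to.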