Deep hashing methods for cross-modal retrieval have attracted growing interest owing to their storage efficiency and fast query speed. However, the challenge posed by the “heterogeneity gap” in multi-modal datasets cannot be overstated. To address it, we present a novel framework, Dual-Pathway Deep Hashing-Based Adversarial Learning (DP-DHAL). The DP-DHAL architecture integrates three key components: (a) a dual-pathway representation learning module that extracts modality-specific features; (b) an adversarial module that aligns the distributions of features across modalities; and (c) a deep hashing module that generates hash codes preserving cross-modal similarity relationships. In addition, we design a Hamming triplet-margin loss function to sharpen the measurement of content similarity. DP-DHAL is trained through an adversarial process: the adversarial module learns to discriminate which modality a feature comes from, while the representation learning module learns representations that both fool the adversarial module and preserve cross-modal similarities; this minimax game narrows the heterogeneity gap and yields discriminative hash codes. Comprehensive experiments on several benchmark datasets show that the proposed method outperforms state-of-the-art cross-modal hashing techniques.
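
To make the loss concrete, a plausible form of a Hamming triplet-margin loss is sketched below, assuming $K$-bit codes $\mathbf{h} \in \{-1,+1\}^K$, cross-modal triplets $(a, p, n)$ of an anchor, a similar (positive) item, and a dissimilar (negative) item, and a margin $\alpha$; the exact formulation used in DP-DHAL may differ:
\[
\mathcal{L}_{\mathrm{tri}} \;=\; \sum_{(a,p,n)} \max\!\Big(0,\; \alpha + d_H(\mathbf{h}_a, \mathbf{h}_p) - d_H(\mathbf{h}_a, \mathbf{h}_n)\Big),
\qquad
d_H(\mathbf{h}_i, \mathbf{h}_j) \;=\; \tfrac{1}{2}\big(K - \mathbf{h}_i^{\top}\mathbf{h}_j\big),
\]
where $d_H$ is the Hamming distance, recoverable from the inner product because the codes are $\pm 1$-valued.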
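
As a further illustration, the sketch below shows this kind of adversarial training scheme in PyTorch. The layer sizes, feature dimensions, margin, and synthetic mini-batches are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a dual-pathway adversarial hashing scheme as described
# above. All module shapes, hyperparameters, and data are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

K = 32  # hash code length (assumption)

class Pathway(nn.Module):
    """One modality-specific pathway: features -> K-dim relaxed hash code."""
    def __init__(self, in_dim):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, K), nn.Tanh(),  # tanh relaxes binary codes to (-1, 1)
        )
    def forward(self, x):
        return self.net(x)

img_path = Pathway(in_dim=512)  # e.g. CNN image features (assumption)
txt_path = Pathway(in_dim=300)  # e.g. text embeddings (assumption)

# Modality discriminator: tries to tell image codes from text codes.
disc = nn.Sequential(nn.Linear(K, 64), nn.ReLU(), nn.Linear(64, 1))

opt_g = torch.optim.Adam([*img_path.parameters(), *txt_path.parameters()], lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

def hamming_triplet(anchor, pos, neg, margin=0.5 * K):
    # Hamming distance via the inner product: d_H = (K - <h_i, h_j>) / 2
    d_ap = 0.5 * (K - (anchor * pos).sum(dim=1))
    d_an = 0.5 * (K - (anchor * neg).sum(dim=1))
    return F.relu(margin + d_ap - d_an).mean()

for step in range(100):
    img = torch.randn(16, 512)  # stand-in for a real paired mini-batch
    txt = torch.randn(16, 300)
    h_img, h_txt = img_path(img), txt_path(txt)

    # 1) Discriminator step: label image codes 1, text codes 0.
    d_logits = disc(torch.cat([h_img, h_txt]).detach()).squeeze(1)
    d_labels = torch.cat([torch.ones(16), torch.zeros(16)])
    d_loss = F.binary_cross_entropy_with_logits(d_logits, d_labels)
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Pathway step: fool the discriminator (flipped labels) and preserve
    #    cross-modal similarity via the triplet loss; the matching text is
    #    the positive and a shuffled batch serves as the negative.
    g_logits = disc(torch.cat([h_img, h_txt])).squeeze(1)
    g_adv = F.binary_cross_entropy_with_logits(g_logits, 1 - d_labels)
    g_tri = hamming_triplet(h_img, h_txt, h_txt[torch.randperm(16)])
    g_loss = g_adv + g_tri
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# At retrieval time, relaxed codes are binarized, e.g. b = torch.sign(h).
```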