Coronary artery disease (CAD) is an irreversible and fatal disease that requires timely and precise diagnosis to slow its progression. Electrocardiogram (ECG) and phonocardiogram (PCG) signals, which convey abundant disease-related information, are widely used clinical techniques for early CAD diagnosis. However, most previous methods rely on single-modal data, and the resulting lack of information limits their diagnostic precision. To address this issue, we propose a novel multi-modal learning method that integrates ECG and PCG for CAD detection. First, a novel ECG-PCG coupling signal is derived using a deconvolution operation to enrich the diagnostic information. After constructing modified recurrence plots, we build a parallel CNN to encode the multi-modal information as ECG, PCG, and ECG-PCG coupling deep-coding features. To remove irrelevant information while preserving discriminative features, an autoencoder network compresses the feature dimension. The final CAD classification is performed by a support vector machine on the optimal multi-modal features. The method is validated on 199 simultaneously recorded ECG and PCG signals from non-CAD and CAD subjects and achieves accuracy, sensitivity, specificity, and F1-score of 98.49%, 98.57%, 98.57%, and 98.89%, respectively. These results indicate that the proposed multi-modal method overcomes the information shortage of single-modal signals and outperforms existing models in CAD detection. This study highlights the potential of multi-modal deep-coding information and offers broader insight into enhancing CAD diagnosis.
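The two signal-preparation steps named above, a deconvolution-based ECG-PCG coupling signal and a recurrence plot, can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the abstract does not specify the deconvolution formulation or how the recurrence plot is modified, so the sketch assumes a regularized frequency-domain deconvolution (treating ECG as the input and PCG as the output of a linear system) and a standard thresholded recurrence plot. The function names `coupling_signal` and `recurrence_plot` and the parameters `eps`, `dim`, `tau`, and `threshold` are hypothetical.

```python
import numpy as np


def coupling_signal(ecg, pcg, eps=1e-8):
    """Estimate an ECG-to-PCG impulse response by regularized deconvolution.

    Assumption: PCG is modeled as ECG circularly convolved with an unknown
    response h. This is an illustrative choice, not the paper's definition.
    """
    ecg = np.asarray(ecg, dtype=float)
    pcg = np.asarray(pcg, dtype=float)
    n = len(ecg)
    E = np.fft.rfft(ecg)
    P = np.fft.rfft(pcg, n=n)
    # Division in the frequency domain; eps keeps bins with little ECG
    # energy from blowing up the estimate.
    H = P * np.conj(E) / (np.abs(E) ** 2 + eps)
    return np.fft.irfft(H, n=n)


def recurrence_plot(x, dim=3, tau=1, threshold=None):
    """Standard (unmodified) thresholded recurrence plot of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    m = len(x) - (dim - 1) * tau  # number of embedded state vectors
    if m <= 0:
        raise ValueError("signal too short for the chosen dim and tau")
    # Time-delay embedding: row i is (x[i], x[i + tau], ..., x[i + (dim-1)*tau]).
    emb = np.stack([x[i * tau: i * tau + m] for i in range(dim)], axis=1)
    # Pairwise Euclidean distances between all embedded states.
    dist = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    if threshold is None:
        threshold = 0.1 * dist.max()  # assumed default: 10% of the max distance
    return (dist <= threshold).astype(np.uint8)


if __name__ == "__main__":
    # Synthetic stand-ins for real recordings, used only to exercise the code.
    rng = np.random.default_rng(0)
    ecg = rng.standard_normal(512)
    pcg = np.roll(ecg, 5) + 0.1 * rng.standard_normal(512)
    coupling = coupling_signal(ecg, pcg)
    images = [recurrence_plot(s[:200]) for s in (ecg, pcg, coupling)]
    print([img.shape for img in images])
```

In a pipeline of the kind described, each of the three resulting images would feed one branch of the parallel CNN. Note that the pairwise distance matrix grows quadratically with segment length, so long recordings would need to be segmented or downsampled first.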