An OCT-based multimodal deep learning classification model that includes texture information is introduced. It outperforms single-modal models and multimodal models without texture information for glaucoma diagnosis in eyes with and without high myopia.

The purpose of this study was to evaluate the diagnostic accuracy of a multimodal deep learning (DL) classifier using wide optical coherence tomography (OCT) optic nerve head cube scans in eyes with and without axial high myopia. The study included 371 primary open-angle glaucoma (POAG) eyes and 86 healthy eyes without axial high myopia (axial length [AL] ≤ 26 mm), and 92 POAG eyes and 44 healthy eyes with axial high myopia (AL > 26 mm). The multimodal DL classifier combined features from 3 individual VGG-16 models, one for each input: (1) a texture-based en face image, (2) a retinal nerve fiber layer (RNFL) thickness map image, and (3) a confocal scanning laser ophthalmoscope (cSLO) image. Areas under the receiver operating characteristic curve (AUROC), adjusted for age, AL, and disc area, were used to compare model accuracy. The adjusted AUROC for the multimodal DL model was 0.91 (95% CI: 0.87, 0.95), significantly higher than that of each individual model (0.83 [0.79, 0.86] for the texture-based en face image; 0.84 [0.81, 0.87] for the RNFL thickness map; and 0.68 [0.61, 0.74] for the cSLO image; all P ≤ 0.05). In highly myopic eyes only, the multimodal DL model showed significantly higher diagnostic accuracy (0.89 [0.86, 0.92]) than the texture-based en face image (0.83 [0.78, 0.85]), RNFL thickness map (0.85 [0.81, 0.86]), and cSLO image (0.69 [0.63, 0.76]) models (all P ≤ 0.05). Combining OCT-based RNFL thickness maps with texture-based en face images discriminated between healthy and POAG eyes better than thickness maps alone, particularly in eyes with axial high myopia.
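For readers less familiar with the accuracy metric used above, the following is a minimal sketch of an unadjusted AUROC, computed as the probability that a randomly chosen POAG eye receives a higher classifier score than a randomly chosen healthy eye (ties count as one half). It is illustrative only: it does not implement the age, AL, and disc area adjustment described in the abstract, the function name `auroc` is hypothetical, and the toy scores and labels are invented for the example and are not study data.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Unadjusted AUROC via pairwise comparison (Mann-Whitney formulation).

    labels: 1 = diseased (e.g. POAG), 0 = healthy.
    Returns the fraction of (diseased, healthy) pairs in which the
    diseased case has the higher score; tied pairs count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = POAG, 0 = healthy.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auroc(toy_scores, toy_labels)  # 8 of 9 pairs ranked correctly
```

The pairwise loop is quadratic in the number of eyes, which is acceptable for a cohort of a few hundred eyes; a rank-based implementation would be preferable for larger datasets.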