Optical coherence tomography (OCT) has become a point-of-care imaging modality for the diagnosis of retinal diseases. Speckle noise in OCT images varies across datasets and scanners, and this variation degrades the performance of existing deep learning models, most of which have been trained on images with a single noise level. Existing deep learning models for predicting retinal diseases are also heavy and require a sophisticated computing environment to train and deploy. Generalizable, lightweight deep learning models that can provide automated diagnosis on an edge platform are therefore highly appealing in the clinic. This work proposes a self-distillation framework based on lightweight deep learning models for building generalizable models for retinal disease diagnosis. The proposed approach is validated with three baseline models (ResNet18, MobileNetV2, and ShuffleNetV2) on simulated and real-time noisy OCT B-scans spanning a range of SNRs from four OCT datasets. The proposed method significantly outperforms existing methods, improving precision, accuracy, and F1-score by as much as 14%, which indicates that the self-distillation framework can provide better generalizability for automated retinal diagnosis.
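The abstract does not spell out the distillation objective, so the following is only a minimal sketch of one common way a self-distillation loss is formed, not the formulation used in this work. It assumes a teacher and a student that share the same architecture, and it combines the usual cross-entropy on the ground-truth label with a temperature-softened KL-divergence term that pulls the student toward the teacher's predictions. The function names and the `alpha` and `temperature` parameters are hypothetical and chosen for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy between the prediction and a hard class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def self_distillation_loss(student_logits, teacher_logits, label,
                           alpha=0.5, temperature=2.0):
    """Weighted sum of a hard-label term and a soft-label (teacher) term.

    Assumption: teacher and student share one architecture; `alpha` and
    `temperature` are illustrative hyperparameters, not values from the paper.
    """
    hard = cross_entropy(student_logits, label)
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # The T^2 factor keeps the soft term's gradient scale comparable
    # to the hard term when the temperature changes.
    soft = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    return (1 - alpha) * hard + alpha * soft


# Example: a two-class problem where the teacher is more confident
# in class 0 than the student is.
loss = self_distillation_loss([0.0, 0.0], [2.0, 0.0], label=0)
```

When the student already matches the teacher, the soft term vanishes and only the weighted cross-entropy remains; any disagreement with the teacher adds a positive penalty.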