Ensuring the safety and reliability of rotating machinery is of paramount importance in modern industrial production and intelligent manufacturing. Although deep learning-based fault diagnosis methods are promising, two obstacles hinder their application in industrial scenarios: fault samples are scarce, and variable working conditions cause the distributions of training and test data to differ. To overcome these obstacles, we propose learning to generalize with latent embedding optimization, a method tailored for few-shot and zero-shot cross-domain fault diagnosis under small samples and varying working conditions. The method builds upon the latent embedding optimization algorithm, a meta-learning approach designed for few-shot problems. In addition, we employ an efficient pretraining model to enhance feature extraction and domain adaptation, which helps handle the scarcity of fault data. To tackle the cross-domain issue, we introduce a new meta-task organization and extend the episodic training strategy of meta-learning, enabling the model to learn to generalize across working conditions. The proposed approach is validated through comprehensive case studies in bearing and gearbox fault diagnosis. The results demonstrate its effectiveness in both few-shot and zero-shot cross-domain fault diagnosis scenarios.
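
The abstract does not spell out how the meta-tasks are organized, so the following is only a minimal sketch of one plausible cross-domain episode construction, not the authors' procedure. It assumes that each episode draws its support set from one working condition and its query set from a different one, so that every meta-training step rewards generalization across conditions. The data layout, the function name `sample_cross_domain_episode`, and the domain and class names are all hypothetical.

```python
import random


def sample_cross_domain_episode(data, n_way, k_shot, n_query, rng):
    """Sample one N-way K-shot episode whose support and query sets
    come from two different working conditions (domains).

    data: {domain: {fault_class: [sample, ...]}}  (hypothetical layout)
    Returns (support_domain, query_domain, support, query), where support
    and query are lists of (sample, episode_label) pairs.
    """
    # Assumption: support and query domains are always distinct.
    support_domain, query_domain = rng.sample(sorted(data), 2)

    # Only fault classes observed under both conditions can be used.
    shared = sorted(set(data[support_domain]) & set(data[query_domain]))
    classes = rng.sample(shared, n_way)

    support, query = [], []
    for label, cls in enumerate(classes):
        # Labels are re-indexed per episode, as is usual in episodic training.
        support += [(x, label)
                    for x in rng.sample(data[support_domain][cls], k_shot)]
        query += [(x, label)
                  for x in rng.sample(data[query_domain][cls], n_query)]
    return support_domain, query_domain, support, query


# Toy stand-in for vibration segments: strings tagged with their origin.
domains = ["load_0", "load_1", "load_2"]
faults = ["normal", "inner_race", "outer_race", "ball"]
toy_data = {
    d: {f: [f"{d}/{f}/{i}" for i in range(10)] for f in faults}
    for d in domains
}

rng = random.Random(0)
s_dom, q_dom, support, query = sample_cross_domain_episode(
    toy_data, n_way=3, k_shot=2, n_query=4, rng=rng
)
```

In this sketch a few-shot target task would supply a small labeled support set from the unseen condition at test time, whereas a zero-shot task would supply none; how the proposed method handles the latter is not described in the abstract and is not modeled here.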