Missing data are common in industrial settings. Because most data-driven methods used in these applications rely on complete, high-quality data sets, the missing data problem must be handled. The severity of missing data also varies across factories, so a single factory may be unable to handle missing data locally. With the rapid development of cloud–edge computing, different factories can work together on the missing data problem through federated learning without sharing their private training data. However, popular federated imputation methods treat each edge, i.e., a factory, as an equal participant in learning a central model in the cloud. They are therefore unable to handle heterogeneous data across clients, which leads to slow convergence and degraded learning performance. In this paper, a federated transfer missing data imputation method (FedTMI) is proposed to address this problem. First, edge models are built with conventional Generative Adversarial Imputation Nets (GAIN) trained on the edge data sets, and edge knowledge is extracted as knowledge vectors to identify the variables that offer room for performance improvement. Second, for a given target edge, with edge models and edge knowledge accumulated in the cloud, models from non-target edges are chosen as helper models according to rules guided by the corresponding edge knowledge. The helper models can provide effective guidance for data imputation at the target edge. Third, the target edge performs federated transfer learning with the selected helper models: model knowledge from the helper models is transferred to the target edge, which forms an updated target edge model using its own edge data. Case studies on steam-driven water pumps in thermal power plants demonstrate the feasibility of the proposed FedTMI. It outperforms the baseline method based on model averaging (FedAvg), especially when the data are not independent and identically distributed (non-i.i.d.) across different edges.
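As a rough illustration of the contrast drawn above, the following minimal Python sketch compares FedAvg-style averaging over all edges with a helper-based update for a single target edge. It is not the FedTMI implementation. Several things are assumptions made only for this example: model parameters are flattened lists of floats standing in for GAIN weights, each knowledge vector holds one imputation-error score per variable, and the names `select_helpers`, `transfer`, and `alpha` are hypothetical. The selection rule and the blending step are simple placeholders for the rules and the transfer learning procedure of the method, which additionally updates the target model with its own edge data.

```python
from typing import Dict, List

Params = List[float]  # toy stand-in for flattened GAIN weights (assumption)


def fedavg(params: Dict[str, Params], n_samples: Dict[str, int]) -> Params:
    """Baseline: sample-size-weighted average of the parameters of all edges."""
    total = sum(n_samples[e] for e in params)
    dim = len(next(iter(params.values())))
    return [
        sum(params[e][i] * n_samples[e] for e in params) / total
        for i in range(dim)
    ]


def select_helpers(
    target: str, knowledge: Dict[str, List[float]], top_k: int = 1
) -> List[str]:
    """Hypothetical rule: pick non-target edges that do best on the
    variable where the target edge does worst.

    ASSUMPTION: knowledge[e][j] is an imputation-error score of edge e
    on variable j (lower is better).
    """
    scores = knowledge[target]
    worst = max(range(len(scores)), key=lambda j: scores[j])
    better = [
        e for e in knowledge
        if e != target and knowledge[e][worst] < scores[worst]
    ]
    return sorted(better, key=lambda e: knowledge[e][worst])[:top_k]


def transfer(target: Params, helpers: List[Params], alpha: float = 0.5) -> Params:
    """Placeholder transfer step: blend the target parameters with the mean
    of the helper parameters. Fine-tuning on edge data is omitted here."""
    if not helpers:
        return list(target)
    helper_mean = [sum(h[i] for h in helpers) / len(helpers)
                   for i in range(len(target))]
    return [(1 - alpha) * t + alpha * h for t, h in zip(target, helper_mean)]


# Toy example with three edges (all values are made up for illustration).
params = {"A": [0.0, 2.0], "B": [1.0, 0.0], "C": [4.0, 4.0]}
n_samples = {"A": 100, "B": 100, "C": 200}
knowledge = {"A": [0.1, 0.6], "B": [0.3, 0.2], "C": [0.2, 0.7]}

global_model = fedavg(params, n_samples)          # every edge counts equally per sample
helpers = select_helpers("A", knowledge)          # only edges that can help A
updated_a = transfer(params["A"], [params[h] for h in helpers])
print(global_model, helpers, updated_a)
```

In this toy setting, edge C has the most samples and dominates the averaged model even though it performs worse than A on A's weakest variable, whereas the helper-based update draws only on edge B.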