Federated learning (FL) is considered a promising approach for enabling distributed learning without sacrificing edge devices' (EDs') data privacy. However, training a machine learning (ML) model in a distributed manner is challenging for EDs with a limited energy supply. In this work, we consider that the central parameter server (co-located with the cellular base station, BS) exploits simultaneous wireless information and power transfer (SWIPT) to send the aggregated model to all EDs and transfer energy to them at the same time. Using the energy harvested from the BS, each ED first trains its local model, and then all EDs form a non-orthogonal multiple access (NOMA) cluster to upload their local models to the central parameter server (i.e., the BS). To minimize the total energy consumption, we formulate a joint optimization of the BS's SWIPT duration, each ED's power-splitting ratio, the EDs' NOMA uploading duration, and the processing rates of the EDs and the BS. To address the non-convexity of this joint optimization problem, we decompose it into two subproblems, both of which can be solved efficiently. We then propose an enhanced block coordinate descent (EBCD) algorithm that iteratively solves the two subproblems in sequence and exploits the idea of simulated annealing to avoid being trapped in local optima, thereby approaching the optimal solution of the original joint optimization problem. Building on the EBCD algorithm, we further investigate how to select the EDs that participate in the FL process so as to minimize the participants' total energy consumption. A cross-entropy based algorithm, which invokes the EBCD algorithm as a subroutine, is proposed to determine the optimal ED selection. Numerical results validate the proposed algorithms and demonstrate the performance advantage of the proposed SWIPT-assisted FL via NOMA over conventional FL schemes.
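
To give a rough sense of the algorithmic structure described above, the sketch below combines a block-coordinate-descent loop with a simulated-annealing acceptance rule, in the spirit of the EBCD algorithm. It is a minimal illustration under assumed placeholders: the objective, the two subproblem solvers, and all names and parameters (total_energy, solve_subproblem_1, solve_subproblem_2, the temperature schedule) are hypothetical and are not taken from the paper's actual formulation.

```python
import math
import random

def total_energy(x, y):
    # Placeholder non-convex objective; the paper's actual objective is the
    # total energy spent on SWIPT downlink, local training, and NOMA upload.
    return (x - 1.0) ** 2 * (y + 2.0) ** 2 + math.sin(3 * x) + math.cos(2 * y)

def solve_subproblem_1(y):
    # Stand-in for the first subproblem (e.g., SWIPT duration / splitting ratios):
    # coarse 1-D search over the first block with the second block held fixed.
    return min((i / 100.0 for i in range(-300, 301)),
               key=lambda x: total_energy(x, y))

def solve_subproblem_2(x):
    # Stand-in for the second subproblem (e.g., NOMA duration / processing rates).
    return min((i / 100.0 for i in range(-300, 301)),
               key=lambda y: total_energy(x, y))

def ebcd(x0, y0, iters=100, temp0=1.0, cooling=0.95, seed=0):
    """Alternate the two block updates and use a Metropolis-style acceptance
    rule so the iterates can occasionally move uphill and escape local optima
    (an illustrative sketch, not the paper's exact procedure)."""
    rng = random.Random(seed)
    x, y = x0, y0
    best = (x, y, total_energy(x, y))
    temp = temp0
    for _ in range(iters):
        # Perturb the current point, then apply one round of block updates.
        y_try = y + rng.gauss(0.0, temp)
        x_new = solve_subproblem_1(y_try)
        y_new = solve_subproblem_2(x_new)
        delta = total_energy(x_new, y_new) - total_energy(x, y)
        # Always accept improvements; accept worse points with probability
        # exp(-delta / temp), which shrinks as the temperature cools.
        if delta <= 0 or rng.random() < math.exp(-delta / max(temp, 1e-12)):
            x, y = x_new, y_new
        if total_energy(x, y) < best[2]:
            best = (x, y, total_energy(x, y))
        temp *= cooling
    return best

if __name__ == "__main__":
    x_opt, y_opt, e_opt = ebcd(x0=0.0, y0=0.0)
    print(f"approximate minimizer: x={x_opt:.2f}, y={y_opt:.2f}, energy={e_opt:.3f}")
```

In the same spirit, the cross-entropy based ED selection described above would wrap such a routine as a subroutine, repeatedly sampling candidate ED subsets, scoring each by the resulting total energy, and reweighting the sampling distribution toward the best-scoring subsets.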