Proactive content caching has emerged as a promising solution to cope with exponentially increasing mobile data traffic. Popular content can be cached near the network edge for faster retrieval and processing. Current state-of-the-art approaches adopt a centralized model training mechanism that incurs high communication and data-exchange overhead to predict content popularity. Moreover, these approaches fail to cope with the dynamics of the environment, since they neither take users’ mobility information into account nor incorporate content offload timings. In this paper, we address these limitations by proposing a novel federated learning-based Mobility and Demand-aware Proactive Content Offloading (MDPCO) framework. MDPCO exploits distributed learning strategies and capitalizes on users’ mobility and demand information for proactive content offloading. Extensive simulations are carried out to validate the efficacy of MDPCO against local and cloud-based models. Our proposed model yields an average performance improvement of 6.7% in comparison to the cloud-based model. Furthermore, as the number of fog servers increases, MDPCO achieves a 9.8% higher data offloading ratio and a 1.18% increase in downlink rates while being more energy-efficient than cloud-based approaches.