Abstract. Distributed machine learning (DML) has gained considerable popularity in research and has been applied in a wide range of domains, including transportation, medicine, and business. It shows strong potential for processing large amounts of data while keeping the data private. In this paper, a deep neural network (DNN) is implemented in a distributed setting on a large dataset from a telecom company, with the aim of predicting consumer behavior from usage data. Datasets with distinct heterogeneity settings were applied to the network to simulate a real-world cloud application scenario. The centralized model and the distributed models were compared to analyze the impact of device heterogeneity and data heterogeneity on model performance. The results indicate that, under certain circumstances, the DML models achieved better accuracy and lower loss than the centralized model. The paper offers some insights and makes some assumptions about how the data and the parameters affect model performance.