In this study, we proposed a novel dataset and a deep learning model for generating three-dimensional (3D) dynamic scene graphs for robotic manipulation tasks. First, we defined a new 3D scene graph that effectively represents the dynamics of a robotic manipulation task environment. We then collected a series of sensory inputs by conducting multiple manipulation tasks in a simulated environment. From the collected sensory data and the corresponding 3D scene graphs, we constructed a dataset, D3DSG, for training and validating scene graph generation models. In addition, we proposed an ST-GCN-based context reasoning module that exploits both spatial and temporal context. We then presented an effective 3D scene graph generation model, SG4RMT, which consists of a 6DoF pose estimation module and a spatio-temporal context reasoning module. Multiple experiments on the D3DSG dataset demonstrated the superior performance of the proposed SG4RMT model.
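To make the notion of a dynamic 3D scene graph concrete, the following minimal Python sketch shows one way such a structure could be represented: each frame holds object nodes with 6DoF poses and labeled relations between them, and the dynamics appear as relation changes between consecutive frames. This is an illustrative assumption, not the D3DSG format or the SG4RMT implementation; the class names, the object names, and the relation labels ("on", "holding") are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Pose6DoF:
    """Hypothetical 6DoF pose: 3D position plus a unit quaternion (x, y, z, w)."""
    position: Tuple[float, float, float]
    orientation: Tuple[float, float, float, float] = (0.0, 0.0, 0.0, 1.0)


@dataclass
class SceneGraph:
    """One frame of a scene graph: object nodes and directed, labeled relations."""
    nodes: Dict[str, Pose6DoF] = field(default_factory=dict)
    edges: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def add_object(self, name: str, pose: Pose6DoF) -> None:
        self.nodes[name] = pose

    def relate(self, subject: str, obj: str, relation: str) -> None:
        # Relations may only connect objects that exist in this frame.
        if subject not in self.nodes or obj not in self.nodes:
            raise KeyError(f"unknown object in relation: {subject!r}, {obj!r}")
        self.edges[(subject, obj)] = relation


Change = Tuple[Optional[str], Optional[str]]


def changed_relations(prev: SceneGraph, curr: SceneGraph) -> Dict[Tuple[str, str], Change]:
    """Return the relations that differ between two frames as (before, after) pairs."""
    pairs = set(prev.edges) | set(curr.edges)
    return {
        pair: (prev.edges.get(pair), curr.edges.get(pair))
        for pair in pairs
        if prev.edges.get(pair) != curr.edges.get(pair)
    }


# Frame 0: the cup rests on the table (assumed relation label "on").
frame0 = SceneGraph()
frame0.add_object("table", Pose6DoF((0.0, 0.0, 0.0)))
frame0.add_object("cup", Pose6DoF((0.2, 0.1, 0.75)))
frame0.add_object("gripper", Pose6DoF((0.2, 0.1, 1.10)))
frame0.relate("cup", "table", "on")

# Frame 1: the gripper has lifted the cup (assumed relation label "holding").
frame1 = SceneGraph()
frame1.add_object("table", Pose6DoF((0.0, 0.0, 0.0)))
frame1.add_object("cup", Pose6DoF((0.2, 0.1, 0.95)))
frame1.add_object("gripper", Pose6DoF((0.2, 0.1, 1.00)))
frame1.relate("gripper", "cup", "holding")

changes = changed_relations(frame0, frame1)
```

In this toy sequence, `changes` records that the "on" relation between the cup and the table disappears while a "holding" relation between the gripper and the cup appears. Keeping one graph per frame, rather than a single mutable graph, is a design choice made here only so that temporal changes can be read off by comparing consecutive frames.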