Predicting the future actions of multiple pedestrians is an essential capability for autonomous robots operating in crowded human environments. Estimating an unknown future path is a challenging problem due to the complex interactions occurring among pedestrians. Although recent developments in Graph Convolutional Networks (GCNs) allow for efficient encoding of such complex interactions, the encoded representations still lack the informative factors necessary to accurately predict future behavior. To address this, we introduce the Disentangled GCN (DGCN), which aims to better capture crowd interactions by decoupling spatial and temporal factors. More specifically, we propose to encode crowd interactions into two low-dimensional latent spaces, a spatial latent and a temporal latent, and to decode each pedestrian's future behavior from the learned latents. We propose a novel regularizer function to train these latents in an unsupervised manner and condition the trajectory prediction on the learned latents using a spatially aware graph decoder. The proposed method is evaluated extensively on publicly available datasets containing pedestrians and vehicles. Our method improves mADE on the ETH/UCY pedestrian datasets and achieves new state-of-the-art mFDE results on the nuScenes vehicle dataset.