Accurate forecasting is essential for the effective risk management of drought disasters. Many machine learning and deep learning models have been proposed for drought forecasting; however, they cannot handle the temporal and/or spatial dependencies in the input data, which leads to unreliable forecasts. To address this issue, we propose the Heterogeneous Spatio-Temporal Graph (HetSPGraph) model for drought forecasting. It consists of three major components: a spatial aggregation layer (comprising inter- and intra-aggregation), a temporal aggregation layer, and a forecasting network. HetSPGraph learns the dynamic spatiotemporal correlations between regions and uses them to predict drought conditions in each region. Experiments on forecasting the Standardized Precipitation Evapotranspiration Index (SPEI) in China showed that HetSPGraph outperformed baseline methods including the Long Short-Term Memory model (LSTM), Convolutional Neural Network-LSTM (CNN-LSTM), Gated Recurrent Unit (GRU), Spatio-Temporal Graph Convolutional Networks (STGCN), and Geographic-Semantic-Temporal Hypergraph Convolutional Network (GST-HCN). Even for long-term forecasting (12 months), HetSPGraph produced more accurate forecasts than the other three models, with a coefficient of determination (R²) higher than 0.89. The proposed HetSPGraph model has the potential for wider use in forecasting drought and other natural disasters.
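
For readers unfamiliar with the reported metric, the coefficient of determination can be computed as in the minimal sketch below. It uses the standard definition, R² = 1 - SS_res / SS_tot. The function name `r2_score` and the sample SPEI values are illustrative assumptions and are not taken from the paper's experiments or code.

```python
from typing import Sequence


def r2_score(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    if len(observed) == 0 or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error of the forecasts.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: variance of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical SPEI values for one region (illustrative only, not real data).
    observed_spei = [-1.2, -0.5, 0.3, 1.1]
    predicted_spei = [-1.0, -0.6, 0.2, 1.3]
    print(f"R2 = {r2_score(observed_spei, predicted_spei):.3f}")
```

Under this definition, a value of 1 means the forecasts match the observations exactly, and a value of 0 means they are no better than always predicting the observed mean.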