Textbook Question Answering (TQA) is a complex task over multi-modal context that requires reasoning over a diagram and a long essay to reach the correct answer. The task poses two main challenges. First, diagrams are mostly abstract depictions of the real world, and constituents with similar appearance may have different semantics, which makes them difficult to understand effectively. Second, a long essay contains abundant information useful for question answering, so it is vital to extract the relevant information from the text and then reason over it. To address these two challenges, we propose a new model, Dynamic Dual Graph Networks (DDGNet), which performs question-guided multi-step reasoning on a dynamic directed Diagram Graph Network (DGN) for the diagram and a Textual Graph Network (TGN) for the most related paragraph extracted from the long essay. Specifically, DGN combines text features with the positional features of text boxes in the diagram as node features, which avoids the ambiguity of visual features for abstract constituents and helps express explicit semantics. TGN uses the representation of each sentence in the most related paragraph as a node feature to learn the contextualized interaction of useful information during graph reasoning. For the reasoning strategy, we propose a question-guided multi-step graph reasoning method that dynamically updates both DGN and TGN under the guidance of the question at every step. Experimental results show that our proposed model outperforms the baselines on the TQA dataset. Moreover, extensive ablation studies are conducted to analyze the effectiveness of the proposed model.
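
To make the idea of question-guided multi-step graph reasoning more concrete, the following is a minimal sketch in plain Python. It is not the DDGNet implementation: the function names (`reasoning_step`, `multi_step_reasoning`), the dot-product relevance score, the fixed 0.5 mixing coefficient, and the toy graph are all assumptions made for illustration. The sketch only shows one plausible reading of "dynamic": edge weights are recomputed at every step from the current node features and the question vector.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def reasoning_step(nodes, edges, question):
    """One question-guided update on a directed graph (illustrative only).

    nodes:    list of node feature vectors
    edges:    dict mapping a target node index to its source node indices
    question: question feature vector with the same dimension as the nodes
    """
    updated = []
    for i, h in enumerate(nodes):
        sources = edges.get(i, [])
        if not sources:
            # Nodes without incoming edges keep their features.
            updated.append(list(h))
            continue
        # Assumed scoring rule: a source node matters more when it is
        # more similar to the question.
        weights = softmax([dot(nodes[j], question) for j in sources])
        message = [
            sum(w * nodes[j][k] for w, j in zip(weights, sources))
            for k in range(len(h))
        ]
        # Assumed update rule: average the old feature and the message.
        updated.append([0.5 * a + 0.5 * b for a, b in zip(h, message)])
    return updated

def multi_step_reasoning(nodes, edges, question, steps=2):
    """Apply the question-guided update for a fixed number of steps."""
    for _ in range(steps):
        nodes = reasoning_step(nodes, edges, question)
    return nodes

# Toy example: node 2 receives messages from nodes 0 and 1.
nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = {2: [0, 1]}
question = [1.0, 0.0]
out = multi_step_reasoning(nodes, edges, question, steps=1)
```

In this toy graph, node 0 is aligned with the question vector, so it receives the larger attention weight and the updated feature of node 2 shifts toward it. In a real model the scoring and update rules would be learned, and the same loop would run over both the diagram graph and the textual graph.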