Abstract

Structural design synthesis with discrete elements can be formulated as a sequential decision process and solved using deep reinforcement learning, as shown in prior work. When structural design synthesis is modeled as a Markov decision process (MDP), the states correspond to specific structural designs, the discrete actions correspond to specific design alterations, and the rewards reflect the improvement in the altered design’s performance with respect to the design objective and specified constraints. Here, the MDP action definition is extended by integrating parametric design grammars, which enable the design agent to alter not only a given structural design’s topology but also its element parameters. Considering both topological and parametric actions significantly increases the dimensionality of the state and action spaces, as well as the diversity of the action types available to the agent in each state, which makes the overall MDP learning task more challenging. Hence, this paper also addresses discrete design synthesis problems with large state and action spaces by significantly extending the network architecture. Specifically, a hierarchically inspired deep neural network architecture is developed that allows the agent to learn which type of action, topological or parametric, to apply, thus reducing the complexity of the possible action choices in a given state. The extended framework is applied to the design synthesis of planar structures considering both discrete elements and cross-sectional areas, and it is observed to learn policies that synthesize high-performing design solutions.
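
The idea of choosing an action type before choosing a specific action can be illustrated with a minimal sketch. It is not the network architecture developed in this paper: the scores below are hand-written placeholders for what a trained network would output, and the names `select_action`, `type_scores`, and `action_scores_by_type` are hypothetical. The sketch only shows how a two-stage choice factorizes the action probability into a type probability and a within-type probability, so that each stage faces a smaller set of alternatives.

```python
import math
import random

# The two action types named in the abstract.
ACTION_TYPES = ("topological", "parametric")


def softmax(scores):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Sample an index from a discrete distribution by inverse-CDF lookup."""
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1  # guard against floating-point round-off


def select_action(type_scores, action_scores_by_type, rng):
    """Two-stage choice: sample the action type, then an action of that type.

    Returns the chosen type, the index of the action within that type, and
    the joint probability of the pair.
    """
    type_probs = softmax(type_scores)
    t = sample_index(type_probs, rng)
    action_type = ACTION_TYPES[t]
    action_probs = softmax(action_scores_by_type[action_type])
    a = sample_index(action_probs, rng)
    return action_type, a, type_probs[t] * action_probs[a]


# Placeholder scores (assumed values, standing in for network outputs).
rng = random.Random(0)
type_scores = [0.2, 1.1]
action_scores_by_type = {
    "topological": [0.5, -0.3, 0.1],  # e.g. three candidate topology changes
    "parametric": [1.0, 0.0],         # e.g. two candidate parameter changes
}
action_type, index, prob = select_action(type_scores, action_scores_by_type, rng)
print(action_type, index, round(prob, 3))
```

Because each stage is normalized separately, the joint probabilities over all (type, action) pairs still sum to one, so the two-stage choice defines a valid policy over the full action set.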