This article proposes a novel reinforcement learning algorithm based on an improved Monte Carlo tree search (IMCTS) formulation for the discrete optimum design of truss structures. IMCTS uses multiple root nodes and comprises four components: an update process, a best reward, an accelerating technique and a terminal condition. In the update process, the final solution found by one search tree is used as the initial solution for the next search tree. The best reward is used in the backpropagation step. The accelerating technique decreases the width of the search tree and reduces the maximum number of iterations. The agent is trained to minimize the total structural weight under various constraints until the terminal condition is satisfied. The optimal solution is the one with the minimum weight among all solutions found by the search trees. Numerical examples show that the agent finds the optimal solution at low computational cost, produces an optimal design stably, and is suitable for multi-objective structural optimization and large-scale structures.
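The interplay of the four components can be illustrated with a minimal Python sketch. It is not the article's implementation: the area catalogue, member lengths, the `MIN_AREA` feasibility check (a toy stand-in for the structural constraints), the inverse-weight reward, the exploration constant `c`, the width and iteration schedule, and the "stop when a tree brings no improvement" terminal condition are all assumptions chosen only to make the example runnable.

```python
import math
import random

# Hypothetical toy data, not taken from the article.
AREAS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # discrete catalogue of cross-sectional areas
LENGTHS = [1.0, 2.0, 1.5, 1.0]           # member lengths
MIN_AREA = [2.0, 3.0, 1.0, 4.0]          # toy stand-in for the structural constraints


def weight(design):
    """Total weight (unit density) of a design given as catalogue indices."""
    return sum(AREAS[i] * length for i, length in zip(design, LENGTHS))


def feasible(design):
    return all(AREAS[i] >= a_min for i, a_min in zip(design, MIN_AREA))


def reward(design):
    """Assumed reward: inverse weight if feasible, zero otherwise."""
    return 1.0 / weight(design) if feasible(design) else 0.0


class Node:
    def __init__(self):
        self.children = {}   # action (catalogue index) -> Node
        self.visits = 0
        self.best = 0.0      # best reward seen through this node


def candidates(center, width):
    """Catalogue indices within `width` of the initial solution's index."""
    return list(range(max(0, center - width), min(len(AREAS) - 1, center + width) + 1))


def search_tree(initial, width, iterations, rng, c=0.05):
    """One search tree rooted at `initial`; tree level k sizes member k."""
    n = len(initial)
    root = Node()
    best_design, best_reward = list(initial), reward(initial)
    for _ in range(iterations):
        node, path, design = root, [root], []
        # Selection, then expansion of a single new node.
        while len(design) < n:
            actions = candidates(initial[len(design)], width)
            untried = [a for a in actions if a not in node.children]
            if untried:
                action = rng.choice(untried)
                node.children[action] = Node()
            else:
                parent = node
                action = max(actions, key=lambda a: parent.children[a].best
                             + c * math.sqrt(math.log(parent.visits) / parent.children[a].visits))
            node = node.children[action]
            path.append(node)
            design.append(action)
            if untried:
                break
        # Random rollout for the members not yet sized.
        while len(design) < n:
            design.append(rng.choice(candidates(initial[len(design)], width)))
        r = reward(design)
        # Backpropagation stores the best reward instead of an average.
        for visited in path:
            visited.visits += 1
            visited.best = max(visited.best, r)
        if r > best_reward:
            best_design, best_reward = design, r
    return best_design


def imcts_sketch(seed=0):
    rng = random.Random(seed)
    design = [len(AREAS) - 1] * len(LENGTHS)   # heaviest design, feasible here
    width, iterations = 3, 200
    history = [weight(design)]
    while True:
        new_design = search_tree(design, width, iterations, rng)
        history.append(weight(new_design))
        if weight(new_design) >= weight(design):   # assumed terminal condition
            break
        design = new_design                        # update process
        width = max(1, width - 1)                  # accelerating technique:
        iterations = max(50, iterations // 2)      # narrower tree, fewer iterations
    return design, history
```

Calling `imcts_sketch(seed=0)` returns the final design as a list of catalogue indices together with the weight recorded after each search tree. Because every tree starts from the previous tree's feasible result, the recorded weights never increase, so the last design is also the lightest one found.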