In this work, a surrogate model is developed for structural, transient finite element method simulations with discontinuous excitation. It reduces the computational effort of repeatedly simulating the same model under different load cases. The architecture of the surrogate combines fully connected neural network layers with long short-term memory layers. To reproduce different damping ratios, a categorical variable is added to the continuous input data. Because the predicted data are fed back recursively to the input layer, long-term dependencies are preserved even though the input sequences are short. The system dimension is reduced by modal decomposition, a model-order reduction technique. The high accuracy of the surrogate and the reduction in computational cost are demonstrated on an academic example of a cantilever beam and a real-world example of a robot. The advantages of our approach are illustrated by comparison with state-of-the-art surrogates for transient finite element analysis. The proposed surrogate reproduces oscillations caused by discontinuous excitation of mechanical structures. Only short input sequences are required for this purpose, since the excitation does not have to remain part of the input sequence for the entire duration of the oscillations. Owing to the categorical variable for the damping ratio, the surrogate can account for the influence of different damping values in parameter studies.
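
The recursive feedback of predictions to the input can be illustrated with a minimal Python sketch. It is not the network described above: the trained fully connected and long short-term memory layers are replaced by a hypothetical stand-in predictor, namely the exact two-step recurrence of a freely decaying single mode. The damping categories, the modal frequency `OMEGA`, the time step `DT`, and all function names are assumptions chosen for illustration only. The sketch shows that a window of only two past samples, together with a one-hot damping category, is enough to continue an oscillation whose excitation is no longer part of the input.

```python
import numpy as np

# Hypothetical values for illustration; none of them are taken from the study.
DAMPING_RATIOS = (0.01, 0.02, 0.05)   # one category per damping ratio
OMEGA = 2.0 * np.pi * 5.0             # undamped angular frequency of one mode [rad/s]
DT = 1e-3                             # sampling time step [s]


def one_hot(index, n=len(DAMPING_RATIOS)):
    """Encode the damping category as a one-hot vector."""
    code = np.zeros(n)
    code[index] = 1.0
    return code


def free_decay(t, zeta, omega=OMEGA):
    """Analytic free response q(t) = exp(-zeta*omega*t) * cos(omega_d*t)."""
    omega_d = omega * np.sqrt(1.0 - zeta**2)
    return np.exp(-zeta * omega * t) * np.cos(omega_d * t)


def stand_in_predictor(window, damping_code, omega=OMEGA, dt=DT):
    """Stand-in for a trained network: predicts the next modal coordinate
    from the last two samples using the exact recurrence of a free,
    underdamped single-degree-of-freedom oscillator."""
    zeta = DAMPING_RATIOS[int(np.argmax(damping_code))]
    omega_d = omega * np.sqrt(1.0 - zeta**2)
    decay = np.exp(-zeta * omega * dt)
    return 2.0 * decay * np.cos(omega_d * dt) * window[-1] - decay**2 * window[-2]


def rollout(initial_window, damping_code, n_steps, predictor=stand_in_predictor):
    """Recursive prediction: each output is appended to the input window
    and the oldest sample is dropped, so the window length stays fixed."""
    window = list(initial_window)
    predictions = []
    for _ in range(n_steps):
        nxt = predictor(np.asarray(window), damping_code)
        predictions.append(nxt)
        window = window[1:] + [nxt]
    return np.array(predictions)


if __name__ == "__main__":
    t = np.arange(502) * DT
    reference = free_decay(t, DAMPING_RATIOS[1])
    predicted = rollout(reference[:2], one_hot(1), n_steps=500)
    print("max abs error:", np.max(np.abs(predicted - reference[2:])))
```

In a surrogate of the kind described above, `stand_in_predictor` would be replaced by the trained network, and the window would hold modal coordinates of several retained modes rather than a single scalar per step.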