This paper explores the application of deep reinforcement learning to the autonomous design of noise-mitigating structures. Specifically, deep Q-networks and double deep Q-networks are employed to find material distributions that yield broadband noise mitigation in reflection and transmission problems. Unlike conventional deep learning approaches, which require prior knowledge for data labeling, the double deep Q-network algorithm uses pixel-based inputs to learn configurations that achieve broadband noise mitigation without such prior knowledge. By employing the same hyperparameters and network architectures for both the transmission and reflection problems, we demonstrate the capability of the algorithms to generalize over different environments. In addition, a comparison with a genetic algorithm highlights the potential for generalized design in complex environments, although the algorithms tend to predict local maxima. Furthermore, we examine the impact of hyperparameters and environment types on agent performance. The autonomous design approach offers generalized learning without being restricted to specific shapes or requiring prior knowledge of the task.
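For readers unfamiliar with the two algorithms named above, the following minimal sketch contrasts the standard textbook bootstrap targets of a deep Q-network and a double deep Q-network. It is an illustration only, not the implementation used in this work: the function names, the list-based Q-value inputs, and the numerical values are all assumptions introduced here.

```python
from typing import Sequence


def dqn_target(reward: float, gamma: float,
               q_target_next: Sequence[float], done: bool) -> float:
    """Standard DQN target: the target network both selects and
    evaluates the next action (max over its own Q-values)."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward: float, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float], done: bool) -> float:
    """Double DQN target: the online network selects the next action,
    and the target network evaluates that action."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for the next state (three candidate actions).
q_online_next = [1.0, 2.0, 0.5]   # online network estimates
q_target_next = [1.5, 0.8, 3.0]   # target network estimates

y_dqn = dqn_target(1.0, 0.9, q_target_next, done=False)
y_ddqn = double_dqn_target(1.0, 0.9, q_online_next, q_target_next, done=False)
print(y_dqn, y_ddqn)
```

In this toy case the double deep Q-network target is smaller than the deep Q-network target, because the action is chosen by one network and scored by the other; this decoupling is the usual motivation for the double variant, which reduces overestimation of action values.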