Data preprocessing is a key step in extracting useful information from sound and vibration data and often involves selecting a time-frequency representation. No single time-frequency representation is optimal in every case, and no standard selection method exists, so the choice requires expert knowledge and is susceptible to human bias. To address this, this work introduces a methodology that automates the selection of a time-frequency representation for a dataset using only a subset of the healthy, or normal, class of data. Bayesian optimization is used to select the parameters for each type of time-frequency representation. Given one candidate from each type, the average similarity is then used to select the final candidate. The use of multiple time-frequency representations within a single model is also explored. Because no objective benchmark currently exists against which to compare the selected time-frequency representations, the proposed methodology is evaluated in two case studies. In these case studies, the selected time-frequency representations are used as inputs to a simple convolutional neural network, which achieved 100% accuracy in classifying bearing faults and 94% accuracy in classifying the contact-tip-to-workpiece distance in wire arc additive manufacturing. The proposed methodology also reduces the data size by 75% and 94% in the two case studies, respectively, which offers further benefits for reducing the costs of data transmission and storage in modern digital manufacturing architectures.
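The final selection step, in which one candidate per representation type is scored by its average similarity over healthy samples, can be illustrated with a minimal sketch. The sketch rests on several assumptions that are not stated in the abstract: similarity is taken to be the cosine similarity between flattened magnitude spectrograms, the candidate with the highest average pairwise similarity is preferred, and two short-time Fourier transform configurations stand in for candidates of different representation types. The Bayesian optimization of parameters is not reproduced here, and all function names (`stft_magnitude`, `average_similarity`, `select_representation`) are hypothetical.

```python
import cmath
import math
from itertools import combinations


def stft_magnitude(signal, frame_len, hop):
    """Magnitude spectrogram from a naive DFT over Hann-windowed frames."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def cosine_similarity(rep_a, rep_b):
    """Cosine similarity between two flattened representations (assumed metric)."""
    flat_a = [v for row in rep_a for v in row]
    flat_b = [v for row in rep_b for v in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm = (math.sqrt(sum(x * x for x in flat_a))
            * math.sqrt(sum(y * y for y in flat_b)))
    return dot / norm if norm else 0.0


def average_similarity(representations):
    """Mean similarity over all unordered pairs of representations."""
    pairs = list(combinations(representations, 2))
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


def select_representation(healthy_signals, candidates):
    """Score each candidate on healthy data only; keep the highest score.

    Preferring the highest average similarity is an assumption of this sketch.
    """
    scores = {name: average_similarity([transform(s) for s in healthy_signals])
              for name, transform in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Synthetic "healthy" signals: one tone observed at three different phases.
healthy = [[math.sin(2 * math.pi * 8 * n / 64 + phase) for n in range(128)]
           for phase in (0.0, 0.5, 1.0)]

# Placeholder candidates; in practice each would be a different
# representation type with its own tuned parameters.
candidates = {
    "stft_frame32": lambda s: stft_magnitude(s, frame_len=32, hop=16),
    "stft_frame16": lambda s: stft_magnitude(s, frame_len=16, hop=8),
}

best, scores = select_representation(healthy, candidates)
print(best, scores)
```

Because only healthy data enter the score, a selection of this kind needs no fault labels. Whether high similarity among healthy samples is the right criterion depends on the definition of average similarity used in the full methodology.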