End-to-end deep learning in machinery health monitoring allows machine learning models to be built without feature engineering. The research presented here extends this approach to tool wear monitoring. A disentangled variational autoencoder with a temporal convolutional neural network is used to model and trend tool wear in a self-supervised manner, and anomaly detection is used to make predictions from both the input and latent spaces. On a milling dataset, the method achieves a precision-recall area-under-curve (PR-AUC) score of 0.45 across all cutting parameters and a top score of 0.80 for shallow depth cuts. On a real-world industrial CNC dataset, it achieves a top PR-AUC score of 0.41 but does not generalise as well across the broad range of manufactured parts. The benefits and drawbacks of the approach are discussed in detail.
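
To make the reported metric concrete, the following is a minimal sketch of how a PR-AUC score could be computed from anomaly scores. It is not the evaluation code used in this work. It assumes that the input-space anomaly score is the mean squared reconstruction error of a signal window, and it estimates PR-AUC with the step-wise average-precision sum, one common estimator among several. The function names `reconstruction_error` and `pr_auc` and the toy values are hypothetical.

```python
from typing import Sequence


def reconstruction_error(x: Sequence[float], x_hat: Sequence[float]) -> float:
    """Mean squared error between a window and its reconstruction.

    Assumption: used here as an input-space anomaly score
    (higher error suggests a more worn tool).
    """
    if len(x) != len(x_hat) or len(x) == 0:
        raise ValueError("x and x_hat must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def pr_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Estimate PR-AUC as average precision.

    scores: anomaly scores, higher means more anomalous.
    labels: 1 for the positive class (e.g. failed tool), 0 otherwise.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")

    # Rank samples from most to least anomalous.
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)

    tp = fp = 0
    prev_recall = 0.0
    area = 0.0
    i = 0
    while i < len(ranked):
        # Samples with tied scores share one threshold.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        precision = tp / (tp + fp)
        recall = tp / n_pos
        # Step-wise sum: precision weighted by the gain in recall.
        area += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return area


if __name__ == "__main__":
    # Toy values for illustration only, not results from the paper.
    toy_scores = [0.9, 0.8, 0.7, 0.6]
    toy_labels = [1, 0, 1, 0]
    print(round(pr_auc(toy_scores, toy_labels), 4))
```

PR-AUC is often preferred over ROC-AUC when the positive class (worn or failed tools) is rare, because precision is sensitive to false alarms among the many healthy samples. The same `pr_auc` function could be applied to a latent-space score in place of the reconstruction error.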