Timely assessment and prediction of changes in microbial composition that lead to activated sludge settling problems, such as filamentous bulking (FB), can reduce upsets, operational challenges, and negative environmental impacts at water resource recovery facilities (WRRFs). This study presents a computer vision approach to assess activated sludge settling characteristics based on microscopy images (MIs). We use MIs to train deep convolutional neural networks (CNNs) via transfer learning to investigate the morphological properties of flocs and filaments. The methodology was tested on an offline MI dataset collected over two years at a full-scale industrial WRRF in Belgium. Several CNN architectures were tested, including Inception v3, ResNet18, ResNet152, ConvNeXt-nano, and ConvNeXt-S. The sludge volume index (SVI) was used as the prediction target, but the method can be readily adapted to predict any other settling metric. The best-performing CNN, ConvNeXt-nano, predicted SVI values with MAE = 37.51 ± 4.02, MTD = 11.65 ± 1.94, MAPE = 0.18 ± 0.02, and R² = 0.75 ± 0.05. The model was tested on real-life FB events, where it identified early indicators of bulking through predicted surges in SVI values. We used an explainable AI technique, Eigen-CAM, to discover key morphological indicators of sludge bulking transitions. The findings highlight the SVI multimodality issue: SVI readings, as a unidimensional metric, could not capture subtle shifts from good to poor sludge settling, whereas the model detected these changes. The key morphological attributes of the threshold conditions leading to FB were identified, which can provide actionable insight for preemptive WRRF management.
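As a reading aid for the reported scores, the following minimal sketch shows how three of the metrics (MAE, MAPE as a fraction, and R²) are conventionally computed for a regression target such as SVI. The function names and the sample values are hypothetical and are not taken from the study; MTD is omitted because the abstract does not define it.

```python
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error, in the same units as the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error as a fraction (0.18 means 18%).

    Assumes no true value is zero.
    """
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative (made-up) measured and predicted SVI values.
svi_measured = [100.0, 150.0, 200.0, 250.0]
svi_predicted = [110.0, 140.0, 210.0, 230.0]

print(f"MAE  = {mae(svi_measured, svi_predicted):.2f}")
print(f"MAPE = {mape(svi_measured, svi_predicted):.3f}")
print(f"R2   = {r2(svi_measured, svi_predicted):.3f}")
```

Read this way, a MAPE of 0.18 corresponds to an average relative error of about 18% of the measured SVI, and an R² of 0.75 means the predictions account for roughly three quarters of the variance in the measured values.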