To facilitate the triage of hits from small-molecule screens, we used several AI/ML techniques and experimentally observed data sets to build models that predict colloidal aggregation of small organic molecules in aqueous solution. Naïve Bayesian and deep neural network models outperformed logistic regression, recursive partitioning tree, support vector machine, and random forest models, giving the lowest balanced error rate (BER) on the test set. The derived classification models consistently discriminated aggregator molecules from nonaggregator hits. An analysis of the molecular descriptors associated with colloidal aggregation confirms previous observations (hydrophobicity, molecular weight, and solubility) and identifies descriptors not previously described in this context, such as the fraction of sp3 carbon atoms (Fsp3) and the electrotopological state of hydroxyl groups (ES_Sum_sOH). Naïve Bayesian modeling and scaffold tree analysis revealed the chemical features and scaffolds that contribute most to colloidal aggregation and to nonaggregation. These results highlight the importance of scaffolds with high Fsp3 values in promoting nonaggregation. Matched molecular pair analysis (MMPA) also identified context-dependent substitutions that can be used to design nonaggregator molecules. Most matched molecular pairs had a neutral effect on aggregation propensity. We have prospectively applied our predictive models to chemical library triage, to select and purchase plates with optimal diversity for high-throughput screening (HTS) in drug discovery projects.
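
For readers unfamiliar with the balanced error rate (BER) used above to rank the models, the following is a minimal sketch of how it is commonly computed for a binary classifier. It assumes the usual definition, the mean of the false-negative and false-positive rates, which the text does not spell out. The function name, the label encoding (1 = aggregator, 0 = nonaggregator), and the toy labels are illustrative assumptions, not data or code from this work.

```python
def balanced_error_rate(y_true, y_pred):
    """Mean of the per-class error rates for binary labels.

    Assumed encoding (hypothetical): 1 = aggregator, 0 = nonaggregator.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    false_negative_rate = fn / (tp + fn)  # aggregators missed
    false_positive_rate = fp / (tn + fp)  # nonaggregators flagged
    return 0.5 * (false_negative_rate + false_positive_rate)


# Toy labels, invented purely for illustration.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
ber = balanced_error_rate(y_true, y_pred)
```

Because each class contributes equally regardless of its size, BER is less sensitive than plain accuracy to an imbalance between aggregators and nonaggregators in the test set.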