Carotenoid aggregates are ubiquitous in nature and can be synthesized in hydrophilic environments. Although different types of carotenoid aggregates have been reported, predicting which type forms, i.e. H- or J-aggregates, remains challenging. Here, for the first time, we establish machine learning models that predict the formation behavior of carotenoid aggregates. The models are trained on a database of the different types of carotenoid aggregates reported in the literature. Guided by these models, we found a series of previously unknown types of β-carotene J-aggregates. These novel aggregates are ultra-weakly coupled and have absorption bands extending up to 700 nm, unlike any carotenoid aggregates reported previously. Our work demonstrates that machine learning is a powerful tool for predicting the formation behavior of carotenoid aggregates and can lead to the discovery of new carotenoid aggregates for different applications.