Understanding and predicting floor failure depth is crucial for both mitigating mine water inrush hazards and safeguarding groundwater resources. Mining activities can significantly disturb the geological strata, causing displacement and damage that may produce floor cracks. These cracks can extend to confined aquifers, thereby increasing the risk of water inrushes. Such events not only threaten the safety of mining operations but also jeopardize the sustainability of surrounding groundwater systems. Therefore, accurately predicting floor failure depth, so that effective coal seam floor management measures can be taken, is key to reducing the impact of coal seam mining on water resources. Seventy-eight sets of data on coal seam floor failure depth in China were collected, and four main controlling factors were considered: mining depth (D1), working face inclination length (D2), coal seam inclination (D3), and mining thickness (D4). First, a distance evaluation function based on Euclidean distance was constructed as the clustering effectiveness index, and the optimal cluster number K = 3 was determined. The collected data were then clustered into three categories using the K-means clustering algorithm. The clustering results were positively correlated with the magnitude of D1, indicating that D1 played a dominant role in the clustering. The D1 dividing points between the three types of samples fell within 407.7 to 414.9 m and 750 to 900 m. On this basis, grey correlation analysis was used to rank the influence weights of the main controlling factors of coal seam floor failure depth. For the first group, the order was D2 > D1 > D3 > D4, while for the other two groups it was D1 > D2 > D3 > D4. In the second and third groups, therefore, D1 surpassed D2 as the most influential factor. Accordingly, a D1 value between 407.7 and 414.9 m could be used as the boundary: the first group could be classified as shallow mining, and the second and third groups as deep mining.
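The factor ranking described above can be illustrated with a minimal sketch of grey relational analysis. It assumes Deng's standard formulation with mean normalisation and a distinguishing coefficient of 0.5; the abstract does not state which normalisation or coefficient the study used, so both are assumptions. The function name and the sample values are hypothetical and are not taken from the study's dataset.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Grey relational grade of each factor series against the reference.

    reference: list of floor failure depths (one per sample).
    factors:   dict mapping a factor name to its list of values.
    rho:       distinguishing coefficient (0.5 is a common choice, assumed here).
    """
    def mean_normalise(series):
        # Divide by the series mean so factors with different units are comparable.
        mean = sum(series) / len(series)
        return [value / mean for value in series]

    ref = mean_normalise(reference)

    # Absolute differences between the reference and each normalised factor.
    deltas = {}
    for name, series in factors.items():
        norm = mean_normalise(series)
        deltas[name] = [abs(r - v) for r, v in zip(ref, norm)]

    # Global minimum and maximum difference over all factors and samples.
    all_deltas = [d for row in deltas.values() for d in row]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every factor matches the reference exactly after normalisation.
        return {name: 1.0 for name in factors}

    grades = {}
    for name, row in deltas.items():
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


if __name__ == "__main__":
    # Made-up illustrative samples, NOT values from the study.
    failure_depth = [8.0, 11.5, 14.0, 19.0, 24.5]
    controlling_factors = {
        "D1": [210.0, 320.0, 400.0, 620.0, 880.0],  # mining depth, m
        "D2": [80.0, 120.0, 100.0, 160.0, 180.0],   # face inclination length, m
        "D3": [6.0, 12.0, 9.0, 15.0, 10.0],         # seam inclination, degrees
        "D4": [1.8, 2.2, 3.0, 2.5, 3.5],            # mining thickness, m
    }
    grades = grey_relational_grades(failure_depth, controlling_factors)
    for name, grade in sorted(grades.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {grade:.3f}")
```

Sorting the grades in descending order gives an influence ranking of the same kind as the D1 > D2 > D3 > D4 orderings reported here; applying the function separately to each cluster would mirror the per-group analysis.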
Based on this boundary, separate CatBoost prediction models for coal seam floor failure depth under deep and shallow mining were constructed, and their test-set predictions were compared with the results calculated from the empirical formula. The models were more accurate than the empirical formula, with a lower mean squared error (MSE), a lower mean absolute error (MAE), and a higher R-squared (R2). By defining deep and shallow mining so that floor failure depth can be predicted more accurately, this study helps to enhance the understanding of coal seam floor behavior, guide floor management, and protect groundwater resources.
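The three comparison metrics named above can be computed as in the following sketch. It uses the standard definitions of MSE, MAE, and R2; the function name is hypothetical, and the numbers in the usage lines are placeholders rather than results from the study.

```python
def regression_metrics(y_true, y_pred):
    """Return MSE, MAE and R2 for measured versus predicted failure depths."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all measured values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "mae": mae, "r2": r2}


if __name__ == "__main__":
    # Placeholder values for illustration only, NOT results from the study.
    measured = [10.0, 14.0, 18.0, 25.0]
    model_prediction = [10.5, 13.5, 18.5, 24.0]
    formula_prediction = [12.0, 12.5, 21.0, 21.0]
    print("model:  ", regression_metrics(measured, model_prediction))
    print("formula:", regression_metrics(measured, formula_prediction))
```

Evaluating both predictors against the same held-out samples keeps the comparison fair: a lower MSE and MAE together with a higher R2 indicates the better fit.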