The growth of geological big data has made data-driven methods increasingly important in mineral prospectivity mapping (MPP). Geological data fusion depends on the effective integration of quantitative data with qualitative data, including experiential and knowledge-based insights. This study focuses on two core issues: the conversion of raw data into samples, and the selection of a predictive method. Traditional clustering methods require the user to specify the number of clusters in advance. Two-step clustering can instead determine the number of clusters, k, automatically while handling both continuous and categorical variables, by building a Cluster Feature (CF) tree and using information criteria to merge nodes. We applied two-step clustering to stream sediment element data, residual gravity anomalies, and fault distribution. Factor analysis (FA) reduced 16 elemental variables from stream sediments to five uncorrelated continuous variables. Residual gravity anomalies were converted from a continuous to a categorical variable with an interval-based method and then combined with fault distribution as a second categorical variable, giving seven variables for clustering. The results indicate that categorical variables strongly influence the clustering outcome, and that k increases as the importance of continuous variables within the clusters increases. When only one categorical variable is used, residual gravity anomalies yield markedly better clustering than fault distribution. When two categorical variables are used, the number of categories must be considered, because more categories lead to poorer clustering quality. The clustering results for the northeastern margin of the Jiaolai Basin correlate significantly with known gold deposits, suggesting that two-step clustering is a promising and effective method for mineral prospecting.
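The sample-construction step described above (five continuous factor scores plus two categorical variables) can be illustrated with a minimal sketch. It is not the study's implementation: the equal-width binning rule, the choice of four intervals, the fault class labels, and all data values are assumptions made for illustration, and the factor scores are random placeholders rather than the output of an actual factor analysis.

```python
import random
from bisect import bisect_right


def equal_width_edges(values, n_bins):
    """Return the n_bins - 1 interior edges of equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]


def discretize(values, edges):
    """Map each value to an interval index in 0..len(edges)."""
    return [bisect_right(edges, v) for v in values]


random.seed(0)
n_samples = 200

# Synthetic placeholders (assumptions, not the study's data):
# five factor scores per sample, standing in for the FA output
factor_scores = [[random.gauss(0, 1) for _ in range(5)] for _ in range(n_samples)]
# continuous residual gravity anomaly values in arbitrary units
residual_gravity = [random.uniform(-3.0, 3.0) for _ in range(n_samples)]
# hypothetical two-class encoding of fault distribution
fault_class = [random.choice(["near_fault", "far_from_fault"]) for _ in range(n_samples)]

# Interval-based conversion: continuous gravity -> categorical class (4 bins assumed)
edges = equal_width_edges(residual_gravity, n_bins=4)
gravity_class = discretize(residual_gravity, edges)

# Each sample carries 5 continuous + 2 categorical = 7 clustering variables
samples = [
    {"continuous": f, "categorical": (g, c)}
    for f, g, c in zip(factor_scores, gravity_class, fault_class)
]
```

Because the abstract reports that more categories lead to poorer clustering quality, the number of intervals (`n_bins` here) is a parameter worth varying when preparing such samples.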