The cloud mask is one of the most essential products for satellite remote sensing and downstream applications. This study develops machine learning-based (ML-based) cloud detection algorithms that use only spectral observations from the Advanced Himawari Imager (AHI) onboard the Himawari-8 geostationary satellite. Collocated active observations from the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) provide reference labels for model development and validation. We develop both daytime and nighttime algorithms, which differ in whether solar band observations are included, and adopt the artificial neural network (ANN) and random forest (RF) techniques for comparison. Specifically, to eliminate the influence of surface conditions on cloud detection, we introduce three models with different treatments of the surface. We find that, rather than developing separate ML-based algorithms for different surfaces, adding surface variables in an appropriate way may enhance the accuracy of the ML-based algorithm by ~5%. Validated against CALIOP observations, the current AHI operational cloud mask is found to potentially overestimate clouds, and our daytime RF-based algorithm outperforms it, improving the accuracy of cloudy pixel detection by ~5% and reducing misjudgments by ~3%. The nighttime model, which uses only infrared observations, is also slightly better than the AHI operational product but may likewise overestimate cloudy pixels. Overall, our ML-based algorithms can serve as a reliable method for providing all-time (daytime and nighttime) cloud mask results from AHI observations, and we suggest that the surface be treated as an independent variable in future ML-based algorithm development.
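To make the validation against CALIOP concrete, the following minimal Python sketch scores a binary cloud mask against collocated reference labels. It is illustrative only: the function name `score_cloud_mask`, the `MaskScores` container, the 1 = cloudy / 0 = clear encoding, and the specific metric definitions (hit rate on reference-cloudy pixels, false alarm rate on reference-clear pixels) are assumptions made here, not definitions taken from the study, and the sample labels are made up.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class MaskScores:
    cloudy_hit_rate: float   # fraction of reference-cloudy pixels flagged cloudy
    false_alarm_rate: float  # fraction of reference-clear pixels flagged cloudy
    overall_accuracy: float  # fraction of all pixels labelled correctly


def score_cloud_mask(predicted: Sequence[int], reference: Sequence[int]) -> MaskScores:
    """Compare a predicted mask with reference labels (assumed 1 = cloudy, 0 = clear)."""
    if len(predicted) != len(reference) or len(reference) == 0:
        raise ValueError("predicted and reference must be non-empty and equally long")

    tp = fp = tn = fn = 0
    for p, r in zip(predicted, reference):
        if r == 1 and p == 1:
            tp += 1          # cloudy pixel correctly detected
        elif r == 1 and p == 0:
            fn += 1          # cloudy pixel missed
        elif r == 0 and p == 1:
            fp += 1          # clear pixel misjudged as cloudy (overestimation)
        else:
            tn += 1          # clear pixel correctly identified

    n_cloudy, n_clear = tp + fn, tn + fp
    hit = tp / n_cloudy if n_cloudy else float("nan")
    far = fp / n_clear if n_clear else float("nan")
    acc = (tp + tn) / (n_cloudy + n_clear)
    return MaskScores(hit, far, acc)


# Hypothetical collocated samples: reference labels and one algorithm's mask.
reference_labels = [1, 1, 1, 1, 0, 0, 0, 0]
algorithm_mask = [1, 1, 1, 0, 1, 0, 0, 0]
scores = score_cloud_mask(algorithm_mask, reference_labels)
```

Under these assumed definitions, a mask that overestimates clouds shows up as a high false alarm rate, even when its hit rate on cloudy pixels is also high.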