The relationship between urban form and localized air temperature has been studied extensively using linear regression. However, the findings remain inconsistent, and few studies have explored alternative modeling techniques. The rise of machine learning offers an opportunity to explore new methods in urban climatology research. This study tests the hypothesis that machine learning models can explain and predict spatial and temporal fluctuations in urban air temperature better than linear regression. Air temperature was measured at street level in Hong Kong. Urban form characteristics surrounding the measurement locations were extracted from geospatial databases and street view imagery. The datasets were then used to compare the performance of linear regression, Artificial Neural Network (ANN), and Random Forest (RF) models in predicting spatiotemporal temperature fluctuations. The results indicate that the relationships between urban form and air temperature are predominantly non-linear. Both ANN and RF outperformed linear regression in prediction, with mean absolute errors (MAE) of 0.43 °C and 0.33 °C, respectively. This study highlights the potential of machine learning models for advancing knowledge of the impact of urban form on localized air temperature.
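
The comparison described above rests on scoring each model by its mean absolute error on held-out observations. The following is a minimal sketch of that evaluation logic only, not the study's pipeline. The data are synthetic and hypothetical (a single urban-form predictor with a quadratic effect on temperature), and a k-nearest-neighbour regressor is swapped in as a simple stand-in for a non-linear learner; it is neither the ANN nor the RF used in the study. All names (`knn_predict`, `mean_absolute_error`, the variables) are illustrative assumptions.

```python
import random
import statistics

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observations and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def knn_predict(x_train, y_train, x_query, k=3):
    """Average the targets of the k nearest training points (non-linear stand-in)."""
    pairs = sorted(zip(x_train, y_train), key=lambda pair: abs(pair[0] - x_query))
    return sum(y for _, y in pairs[:k]) / k

# Hypothetical synthetic data: a predictor in [0, 1] and a temperature offset
# (deg C) that depends on it quadratically, plus small Gaussian noise.
rng = random.Random(0)
x = [rng.random() for _ in range(200)]
y = [4.0 * (xi - 0.5) ** 2 + rng.gauss(0.0, 0.05) for xi in x]

# Hold out the last 50 samples for testing.
x_train, y_train = x[:150], y[:150]
x_test, y_test = x[150:], y[150:]

# Ordinary least-squares line fitted on the training set.
mean_x = statistics.fmean(x_train)
mean_y = statistics.fmean(y_train)
slope = (
    sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x_train, y_train))
    / sum((xi - mean_x) ** 2 for xi in x_train)
)
intercept = mean_y - slope * mean_x

linear_pred = [intercept + slope * xi for xi in x_test]
knn_pred = [knn_predict(x_train, y_train, xi) for xi in x_test]

mae_linear = mean_absolute_error(y_test, linear_pred)
mae_knn = mean_absolute_error(y_test, knn_pred)
print(f"Linear regression MAE: {mae_linear:.3f} deg C")
print(f"k-NN (non-linear) MAE: {mae_knn:.3f} deg C")
```

Because the synthetic relationship is non-linear by construction, the straight-line fit cannot capture it and the non-linear learner is expected to score a lower MAE. This illustrates why MAE on held-out data is a reasonable basis for comparing model families, but it says nothing about the magnitudes reported for the Hong Kong measurements.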