Cooling load is the dominant contributor to energy consumption in commercial buildings during summer. This paper proposes a new approach to short-term building cooling load prediction that combines adaptive feature extraction with deep learning models. An autonomous kernel-based neural network (AKNN) is proposed to extract intrinsic patterns from sensor data by clustering kernels in a bottom-up manner. Furthermore, a recurrent neural network-based sequence-to-sequence structure is proposed for multi-step-ahead forecasting. In tests on data from an industrial park, the proposed model achieves the lowest RMSE, 1.79 MW, for 1-hour-ahead prediction. For 24-hour-ahead prediction, the proposed model maintains superior accuracy with an RMSE of 2.85 MW, compared with 7.27 MW for ARIMA. Additionally, on a public dataset, the proposed model achieves an 18.02% lower RMSE than ARIMA for 24-hour-ahead prediction, supporting its accuracy and stability.
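To make the reported metrics concrete, the following is a minimal sketch of how RMSE and a relative RMSE reduction could be computed. The function names `rmse` and `relative_reduction` are hypothetical illustrations and not taken from the paper; only the two 24-hour-ahead RMSE values for the industrial park (2.85 MW and 7.27 MW) come from the results stated above.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline_rmse, model_rmse):
    """Fraction by which model_rmse is lower than baseline_rmse."""
    return (baseline_rmse - model_rmse) / baseline_rmse


# RMSE values reported above for 24-hour-ahead prediction (industrial park).
ARIMA_RMSE_MW = 7.27
PROPOSED_RMSE_MW = 2.85

reduction = relative_reduction(ARIMA_RMSE_MW, PROPOSED_RMSE_MW)
print(f"Relative RMSE reduction vs. ARIMA: {reduction * 100:.1f}%")  # → 60.8%
```

Under this definition, the industrial-park figures correspond to an RMSE roughly 60.8% lower than ARIMA at the 24-hour horizon. The 18.02% figure for the public dataset is presumably a reduction of the same kind, although the abstract does not spell out the formula.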