Buses have become indispensable in urban transportation systems, especially in developing countries. Predicting bus travel time gives passengers essential information for planning their trips. By combining several prediction algorithms, ensemble learning methods have shown great potential for improving prediction accuracy in many research fields. In this article, ensemble learning methods are used to predict bus travel times. First, a novel feature-selection algorithm, Boruta, is introduced to select appropriate input features for predicting bus travel time. The algorithm quantifies the importance of each feature and identifies those that are important for the prediction. Second, we describe the ensemble learning methods in detail, namely bagging, boosting, and stacking. A representative algorithm from each category is presented and applied to the prediction problem. Finally, a case study is conducted on real-world data. Twenty original features are analyzed using the Boruta algorithm, and two are filtered out. In addition to the ensemble learning algorithms, several classical algorithms are also used to predict bus travel time. The results show that the boosting and stacking algorithms outperform the other algorithms in prediction accuracy.
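To make the feature-selection step concrete, the following is a minimal sketch of the shadow-feature comparison that underlies Boruta. It is a simplified illustration, not the procedure used in the case study: Boruta scores features with random-forest importances and applies a statistical test over repeated runs, whereas this sketch swaps in the absolute Pearson correlation as the importance score and a fixed hit-rate threshold. The data are synthetic, and all names (`shadow_feature_selection`, `distance_km`, `num_stops`, `noise_feature`, `min_hit_rate`) are hypothetical.

```python
import random


def pearson_abs(xs, ys):
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return abs(cov) / (vx * vy) ** 0.5


def shadow_feature_selection(columns, target, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep features that beat the best shuffled ("shadow") copy in most rounds."""
    rng = random.Random(seed)
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # Shadow features: shuffled copies whose link to the target is destroyed.
        shadow_scores = []
        for values in columns.values():
            shadow = values[:]
            rng.shuffle(shadow)
            shadow_scores.append(pearson_abs(shadow, target))
        threshold = max(shadow_scores)
        # A real feature scores a "hit" when it is more important than every shadow.
        for name, values in columns.items():
            if pearson_abs(values, target) > threshold:
                hits[name] += 1
    return [name for name, h in hits.items() if h / n_rounds >= min_hit_rate]


# Synthetic, hypothetical data: travel time depends on distance and stops only.
rng = random.Random(42)
n = 400
distance_km = [rng.uniform(1, 15) for _ in range(n)]
num_stops = [rng.randint(2, 30) for _ in range(n)]
noise_feature = [rng.gauss(0, 1) for _ in range(n)]
travel_time = [3.0 * d + 1.5 * s + rng.gauss(0, 2) for d, s in zip(distance_km, num_stops)]

columns = {
    "distance_km": distance_km,
    "num_stops": num_stops,
    "noise_feature": noise_feature,
}
selected = shadow_feature_selection(columns, travel_time)
print(selected)
```

In this sketch the two informative features clear the shadow threshold in every round, while an uninformative feature does so only by chance and is therefore usually rejected. The hit-rate cutoff plays the role that the statistical test plays in Boruta.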