This paper proposes two hard drive failure prediction models, based on Decision Trees (DTs) and Gradient Boosted Regression Trees (GBRTs), that combine strong prediction performance with stability and interpretability. The models are evaluated on a real-world dataset containing 121,698 drives in total. Experimental results show that the DT model predicts over 93% of failures at a false alarm rate under 0.01%, and that the GBRT model achieves a failure detection rate of about 90% without any false alarms. Moreover, the GBRT model evaluates drive health (or fault probability), which provides a quantitative indicator of failure urgency. This enables operators to allocate system resources accordingly for pre-warning migrations while maintaining the quality of user services.

With practical application in mind, we further test the models on another real-world dataset with different drive models, on a real-world hybrid dataset with multiple drive models, and on several datasets containing fewer drives. Both prediction models show steady prediction performance, with high failure detection rates (80% to 96%) and low false alarm rates (0.006% to 0.31%). We also implement a reliability model for RAID-6 systems with proactive fault tolerance and show that the proposed models can significantly improve the reliability and/or reduce the construction and maintenance cost of large-scale storage systems.
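
The two metrics quoted above can be made concrete with a minimal sketch. It assumes drive-level definitions that the abstract itself does not spell out: the failure detection rate is taken as the fraction of failed drives that were flagged, and the false alarm rate as the fraction of good drives that were flagged. The function name `detection_and_false_alarm_rates` and the toy counts are hypothetical and are not results from the paper.

```python
from typing import Sequence, Tuple


def detection_and_false_alarm_rates(
    failed: Sequence[bool], flagged: Sequence[bool]
) -> Tuple[float, float]:
    """Return (failure detection rate, false alarm rate) over a set of drives.

    Assumed definitions (not taken from the abstract):
      FDR = flagged failed drives / all failed drives
      FAR = flagged good drives   / all good drives
    """
    if len(failed) != len(flagged):
        raise ValueError("failed and flagged must have the same length")

    n_failed = sum(1 for f in failed if f)
    n_good = len(failed) - n_failed

    # Count flagged drives separately for the failed and the good population.
    detected = sum(1 for f, p in zip(failed, flagged) if f and p)
    false_alarms = sum(1 for f, p in zip(failed, flagged) if not f and p)

    # Guard against empty populations instead of dividing by zero.
    fdr = detected / n_failed if n_failed else 0.0
    far = false_alarms / n_good if n_good else 0.0
    return fdr, far


# Made-up toy population: 10 failed drives (9 flagged), 10,000 good drives (1 flagged).
failed = [True] * 10 + [False] * 10_000
flagged = [True] * 9 + [False] + [True] + [False] * 9_999

fdr, far = detection_and_false_alarm_rates(failed, flagged)
print(f"FDR = {fdr:.2%}, FAR = {far:.4%}")
```

Because good drives vastly outnumber failed ones in a real fleet, the false alarm rate has to stay very low for alarms to be actionable, which is why the two rates are reported together.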