For a nationwide logistics transportation system, it is critical to make vehicle loading plans (i.e., given many packages, deciding the types and numbers of vehicles) at each sorting and distribution center. In many logistics companies, this task is currently completed by dispatchers at each center and imposes a heavy workload on them. Existing work formulates this task as a cargo loading problem and solves it with combinatorial optimization methods. However, these methods fail in some real-world nationwide applications because accurate cargo volume information is unavailable and their model designs do not account for complicated impact factors and temporal correlation. In this paper, we explore a new opportunity: using large-scale route and human behavior data (i.e., dispatchers' decision process when planning vehicles) to generate vehicle loading plans (hereafter, plans). Specifically, we collect a five-month nationwide operational dataset from JD Logistics (JDL) in China and comprehensively analyze human behaviors. Based on the insights from this data-driven analysis, we design a <u>Ve</u>hicle <u>L</u>oading <u>P</u>lan learning model, named VeLP, which consists of a pattern mining module and a deep temporal cross neural network that learn human behaviors on regular and irregular routes, respectively. Extensive experiments demonstrate the superiority of VeLP, which improves performance over baselines by 35.8% and 50% on trunk and branch routes, respectively. In addition, we deployed VeLP at JDL and applied it to about 400 routes, reducing the time needed to create plans by approximately 20%. This saves significant human workload and improves operational efficiency for the logistics company.