Vehicle make and model recognition (VMMR) is a common requirement in intelligent transportation systems (ITS). It is a challenging task, however, because the visual differences between vehicle categories are subtle. In this paper, we propose a hierarchical scheme for VMMR with three components: (1) a feature extraction framework called weighted mask hierarchical bilinear pooling (WMHBP), built on hierarchical bilinear pooling (HBP), which generates a weighted mask to weaken the influence of invalid background regions while extracting features from discriminative regions, yielding a more robust feature descriptor; (2) a hierarchical loss function that learns the appearance differences between vehicle brands and thereby improves recognition accuracy; and (3) data augmentation with vehicle images collected from the Internet and annotated with hierarchical labels, which addresses insufficient data and low image resolution and improves the model’s generalization ability and robustness. We evaluate the proposed framework for both accuracy and real-time performance. On the public Stanford Cars dataset, it achieves a recognition accuracy of 95.1% at 107 frames per second (FPS), which demonstrates the superiority of the method and its suitability for ITS.
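The idea of a hierarchical loss over brand and model labels can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the form below is an assumption: a standard cross-entropy term on the model label plus a weighted cross-entropy term on the brand, where the brand probability is taken as the sum of the probabilities of its models. The names `hierarchical_loss`, `model_to_make`, and `make_weight` are hypothetical and chosen only for this illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hierarchical_loss(model_logits, model_to_make, target_model, make_weight=0.5):
    """Assumed two-level loss: model-level plus weighted make-level cross-entropy.

    model_logits  : one score per fine-grained model class
    model_to_make : model_to_make[i] is the make (brand) index of model i
    target_model  : index of the ground-truth model
    make_weight   : hypothetical weight on the make-level term
    """
    probs = softmax(model_logits)
    target_make = model_to_make[target_model]
    # Probability of a make = sum of the probabilities of its models.
    make_prob = sum(p for p, mk in zip(probs, model_to_make) if mk == target_make)
    model_loss = -math.log(probs[target_model])
    make_loss = -math.log(make_prob)
    return model_loss + make_weight * make_loss


# Four models grouped into two makes: models 0-1 belong to make 0, models 2-3 to make 1.
model_to_make = [0, 0, 1, 1]
same_make_error = hierarchical_loss([0.0, 3.0, 0.0, 0.0], model_to_make, target_model=0)
cross_make_error = hierarchical_loss([0.0, 0.0, 3.0, 0.0], model_to_make, target_model=0)
print(same_make_error < cross_make_error)
```

Under this assumed formulation, confusing two models of the same brand costs less than confusing models of different brands, which is one way a loss can encode brand-level appearance differences.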