With the help of computer-assisted systems, cartoon synthesis has become convenient and efficient by reusing existing cartoon materials. However, quality evaluation of the resulting cartoon images still relies on labor-intensive subjective judgment. This creates an urgent demand for effective quality evaluation methods that can automatically select a high-quality cartoon image from a set of candidates generated with different parameter settings. In this paper, a new blind image quality assessment metric is developed for evaluating the perceptual quality of cartoon images by considering structural and chromatic distortions. The extracted gradient-based local structure features and multiscale chromatic statistical features are integrated into one representation for overall perceptual quality prediction. Experimental results on two benchmark cartoon image datasets, i.e., NBU-CIQAD and HFUT-CID, indicate that the proposed metric outperforms state-of-the-art blind quality evaluation methods designed for both natural images and synthetic images.