In recent years, face parsing and facial expression recognition have attracted increasing interest. Although existing methods for face parsing and face representation achieve strong results, they typically pursue accuracy at the expense of speed. In this paper, we design a novel multi-task learning network for face parsing and facial expression recognition (MPENet). Specifically, MPENet consists of shared encoders and three downstream branches. In the edge-perceiving branch, we use category edges and binary edges to extract face boundary information and improve the localization of face boundaries. In the segmentation branch, we use graph learning to fuse the edge and semantic information of the image, analyze the relations between different feature regions, and capture richer contextual relationships. Finally, we design a consistency learning loss function that forces the different branches to produce consistent predictions. Experiments on face datasets show that MPENet achieves both high precision and fast inference. Specifically, MPENet achieves F1 scores of 85.9 on the CelebAMask-HQ dataset and 92.9 on the LaPa dataset, with an inference speed of 92.9 FPS. Moreover, MPENet precisely delineates the semantic boundaries of facial regions and, through consistent multi-task learning, effectively promotes synergy among the tasks.
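To make the idea of cross-branch consistency concrete, the following is a minimal sketch of one possible consistency term. It is not the loss used by MPENet, whose exact form is not given here. The sketch assumes that the binary edge map implied by a segmentation label map (a pixel is an edge if any 4-neighbour has a different label) should agree with the edge branch's predicted probabilities, and it measures that agreement with a mean binary cross-entropy. The function names `edges_from_labels` and `consistency_loss` are hypothetical.

```python
import math


def edges_from_labels(labels):
    """Return a binary edge map: 1 where any 4-neighbour has a different label."""
    h, w = len(labels), len(labels[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and labels[ny][nx] != labels[y][x]:
                    edges[y][x] = 1
                    break
    return edges


def consistency_loss(edge_probs, seg_labels, eps=1e-7):
    """Mean binary cross-entropy between edge-branch probabilities and the
    edges implied by the segmentation labels (an assumed, illustrative form)."""
    target = edges_from_labels(seg_labels)
    total, count = 0.0, 0
    for prob_row, target_row in zip(edge_probs, target):
        for p, t in zip(prob_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
            count += 1
    return total / count


# Toy 2x3 label map with a vertical boundary between classes 0 and 1.
seg = [[0, 0, 1],
       [0, 0, 1]]
agreeing = [[0.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
disagreeing = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
print(consistency_loss(agreeing, seg) < consistency_loss(disagreeing, seg))
```

In this toy case the loss is close to zero when the edge predictions match the segmentation boundary and large when they contradict it, which is the behaviour a consistency term is meant to reward.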