Crowdsourcing provides a fast and low-cost way to collect annotations for training data in computer vision. However, crowdsourced image annotation faces two challenges. First, when crowdsourced workers annotate tasks in an unfamiliar domain, their accuracy declines dramatically because they lack expertise. Second, task difficulty varies with the noise in the images, which depends only on the features of the images themselves and affects the judgment of workers. It is well known that transferring knowledge from relevant domains can form a better representation of training samples, which benefits the estimation of workers’ expertise in truth inference models. However, existing knowledge transfer processes for crowdsourcing require a considerable number of well-collected samples in the source domains. To address these issues, this paper proposes a novel probabilistic model for crowdsourcing truth inference that fuses few-shot meta-learning and transfer learning. The proposed model transfers meta-knowledge from the source domain to form better high-level representations of the instances in the target domain. By using the high-level representations and the instance features together, the model can better represent and infer both the quality of workers and the difficulty of instances. Experimental results on several datasets show that the proposed model not only outperforms state-of-the-art models but also significantly reduces the number of instances required in the source domain.