Online shopping has become a crucial driver of daily consumption, and user-generated (crowdsourced) product comments offer a broad range of feedback on e-commerce products. Integrating the critical opinions or major attitudes expressed in these comments can therefore provide valuable feedback for adjusting marketing strategy or monitoring product quality. Unfortunately, annotated ground truth for integrated comments (that is, gold integration references) is scarce, which makes regular supervised-learning-based comment integration infeasible. To resolve this problem, and inspired by the principle of transfer learning, this article proposes a three-stage transferable and generative crowdsourced comment integration framework (TTGCIF) based on zero- and few-shot learning with the support of domain distribution alignment. The framework aims to generate abstractive integrated comments in the target domain with an enhanced neural text generation model by drawing on the integration resources available in related source domains, thereby avoiding exhaustive annotation effort in the target domain. Specifically, at the first stage, to enhance domain transferability, the representations of crowdsourced comments are aligned between the source and target domains by minimizing the domain distribution discrepancy in the kernel space. At the second stage, a zero-shot comment integration mechanism addresses the case in which no gold integration reference is available in the target domain. Taking sample-level semantic prototypes as input, the enhanced neural text generation model in TTGCIF is trained to learn semantic associations among domains via semantic prototype transduction, so that the “unlabeled” crowdsourced comments in the target domain can be associated with existing integration references in related source domains.
At the third stage, starting from the parameters trained at the second stage, a fast domain adaptation mechanism is applied in a few-shot manner: it seeks the most promising parameters along the gradient direction constrained by instances from multiple source domains. As a result, the parameters of TTGCIF are sensitive to changes in the training data, so that even when only a few annotated resources in the target domain are available for fine-tuning, TTGCIF can still adapt to the target domain promptly and effectively. According to the experimental results, TTGCIF achieves the best transferable product comment integration performance in the target domain, with fast and stable domain adaptation that requires no more than 10% of the annotated resources in the target domain. More importantly, even without fine-tuning on the target domain, the integrated comments that TTGCIF generates there by drawing on the integration resources of related source domains are still superior to those generated by models already fine-tuned on the target domain.
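The first-stage alignment can be illustrated with a kernel-based discrepancy measure. The sketch below is not the TTGCIF implementation. It assumes the discrepancy is a squared maximum mean discrepancy (MMD) with a Gaussian kernel, which the abstract does not specify. The function names, the `gamma` bandwidth, and the synthetic "comment representations" are all hypothetical and chosen only for illustration.

```python
import numpy as np


def rbf_kernel(a, b, gamma):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    # Pairwise squared Euclidean distances, clipped at 0 to absorb rounding error.
    sq_dist = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


def mmd2(source, target, gamma=0.1):
    """Biased estimate of the squared MMD between two sets of representations.

    Assumption: this stands in for the "domain distribution discrepancy in the
    kernel space" mentioned in the abstract; the paper may use a different form.
    """
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return k_ss + k_tt - 2.0 * k_st


# Synthetic stand-ins for encoded comments (hypothetical data, 8-dimensional).
rng = np.random.default_rng(0)
source_repr = rng.normal(0.0, 1.0, size=(200, 8))
aligned_target_repr = rng.normal(0.0, 1.0, size=(200, 8))   # same distribution as source
shifted_target_repr = rng.normal(2.0, 1.0, size=(200, 8))   # shifted distribution

print("MMD^2, aligned target:", mmd2(source_repr, aligned_target_repr))
print("MMD^2, shifted target:", mmd2(source_repr, shifted_target_repr))
```

In a training loop, a term of this kind would typically be added to the generation loss so that the encoder is pushed toward representations on which the two domains are hard to tell apart. How TTGCIF weights or combines such a term is not stated in the abstract.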