In this article, we study the task of user profiling in question answering communities (QACs). Previous user profiling algorithms suffer from a number of defects: they regard users and words as atomic units, leading to the mismatch between them; they are designed for other applications but not for QACs; and some semantic profiling algorithms do not co-embed users and words, leading to making the affinity measurement between them difficult. To improve the profiling performance, we propose a neural Flow-based Constrained Co-embedding Model, abbreviated as FCCM. FCCM jointly co-embeds the vector representations of both users and words in QACs such that the affinities between them can be semantically measured. Specifically, FCCM extends the standard variational auto-encoder model to enforce the inferred embeddings of users and words subject to the voting constraint, i.e., given a question and the users who answer this question in the community, representations of the users whose answers receive more votes are closer to the representations of the words associated with these answers, compared with representations of whose receiving fewer votes. In addition, FCCM integrates normalizing flow into the variational auto-encoder framework to avoid the assumption that the distributions of the embeddings are Gaussian, making the inferred embeddings fit the real distributions of the data better. Experimental results on a Chinese Zhihu question answering dataset demonstrate the effectiveness of our proposed FCCM model for the task of user profiling in QACs.
Read full abstract