Abstract

Learning hidden topics from short text streams has become crucial in modern applications such as social networks, instant messaging, and question-and-answer forums. Building an effective learning method poses two main challenges: short and noisy data, and the stability-plasticity dilemma. We carefully investigate how existing methods face these challenges. Our theoretical and empirical analyses show that they often handle one challenge well but the other poorly. In particular, they often suffer from catastrophic forgetting, because they constrain the model using knowledge learned only from the previous time step of the stream. We propose a novel method, BSP, whose regularization term is based on a second-order Taylor expansion and accumulates information from all former minibatches. Moreover, external knowledge and the Dropout technique are combined at each time step to better handle short and noisy texts and to enhance the model’s plasticity. We empirically compare BSP with other state-of-the-art streaming methods in terms of dealing with the stability-plasticity dilemma and handling short and noisy texts. Extensive experiments show that BSP is more effective than these methods.

