Information aggregation is an essential component of text encoding, but it has received comparatively little attention. Pooling-based aggregation (max or average pooling) is a bottom-up, passive method that loses much important information. Recently, attention mechanisms and dynamic routing have each been used to aggregate information, but their aggregation capabilities can be further improved. In this paper, we propose a novel aggregation method that combines the attention mechanism with dynamic routing, strengthening information aggregation and improving the quality of text encoding. We also design a novel Leaky Natural Logarithm (LNL) squash function to alleviate the “saturation” problem of the squash function in the original dynamic routing, and add Layer Normalization to the dynamic routing policy to speed up routing convergence. Experiments on five text classification benchmarks show that our method outperforms other aggregation methods.
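For orientation, the two baselines named above can be sketched in a few lines: pooling-based aggregation over token vectors, and the squash function of the original dynamic routing (as defined in the capsule-network literature). This is a minimal illustration, not the proposed method; the abstract does not define the LNL squash function, so it is not reproduced here. The names `max_pool`, `avg_pool`, `squash`, and the toy matrix `H` are assumptions introduced only for this example.

```python
import numpy as np

def max_pool(H):
    """Max pooling over the token axis: (n_tokens, d) -> (d,)."""
    return H.max(axis=0)

def avg_pool(H):
    """Average pooling over the token axis: (n_tokens, d) -> (d,)."""
    return H.mean(axis=0)

def squash(s, eps=1e-9):
    """Original dynamic-routing squash: keeps the direction of s and
    maps its norm to ||s||^2 / (1 + ||s||^2), which lies in [0, 1)."""
    sq_norm = np.sum(s * s, axis=-1, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

# Toy "token encodings": 3 tokens, 2 dimensions (hypothetical values).
H = np.array([[1.0, -2.0],
              [3.0,  0.5],
              [-1.0, 4.0]])

print("max pooling:", max_pool(H))
print("avg pooling:", avg_pool(H))

# The output norm flattens out near 1 as the input norm grows,
# which is the "saturation" behaviour referred to above.
for scale in (0.1, 1.0, 10.0, 100.0):
    v = squash(scale * np.array([1.0, 0.0]))
    print(f"input norm {scale:>6}: output norm {np.linalg.norm(v):.4f}")
```

Pooling keeps one statistic per dimension regardless of context, and the squash output norm changes very little once the input norm is large; these are the two limitations the abstract sets out to address.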