Recent automatic evaluation methods for neural machine translation use pre-trained contextual word embeddings to extract semantic features and concatenate them directly as input to a neural network that predicts translation quality. However, direct concatenation provides little interaction between features, and layer-by-layer prediction is prone to losing fine-grained matching information. To address these issues, we propose a multi-granularity interactive fusion method for the automatic evaluation of English translation, which introduces information fusion at the middle and late stages. First, we use a bilinear attention distribution to capture high-order cross-lingual feature interactions; stacking multiple high-order interaction blocks and equipping them with the exponential linear unit, which has no trainable parameters, achieves middle fusion in a parameter-free manner. Second, we use the sentence mover's distance, which provides fine-grained exact matching, together with sentence-level cosine similarity for late fusion. Experimental results on the WMT’21 Metrics Task benchmark dataset show that the proposed method effectively improves correlation with human evaluation and achieves performance comparable to that of the best participating system.
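The two fusion stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`bilinear_attention`, `fuse`, and the helpers), the mean pooling, the single interaction block, and the choice of the exponential linear unit (ELU) as the parameter-free activation are all assumptions made for illustration. The second late-fusion feature (the distance-based matching score) is omitted, and token vectors are plain Python lists of equal dimension.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(len(vectors[0]))]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def bilinear_attention(src, hyp, weight):
    """One hypothetical interaction block (middle fusion).

    For each hypothesis token h, score every source token s with the
    bilinear form h^T W s, turn the scores into an attention distribution,
    build an attended source context, and combine it with h through an
    element-wise product followed by ELU.
    """
    dim = len(src[0])
    fused = []
    for h in hyp:
        scores = []
        for s in src:
            w_s = [sum(w * s_j for w, s_j in zip(row, s)) for row in weight]
            scores.append(sum(h_i * ws_i for h_i, ws_i in zip(h, w_s)))
        probs = softmax(scores)
        context = [sum(p * s[k] for p, s in zip(probs, src)) for k in range(dim)]
        fused.append([elu(h_k * c_k) for h_k, c_k in zip(h, context)])
    return fused


def fuse(src, hyp, weight):
    """Append a sentence-level cosine similarity to the pooled interaction features (late fusion)."""
    interaction = mean_pool(bilinear_attention(src, hyp, weight))
    similarity = cosine_similarity(mean_pool(src), mean_pool(hyp))
    return interaction + [similarity]
```

In a full system the resulting feature vector would feed a regression layer that predicts the quality score; here `fuse` only shows where each kind of information enters.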