Abstract
Advanced neural network models are generally built as stacks of multiple layers, which model complex functions and capture linguistic structure at different levels of abstraction [1]. However, only the topmost layer of a deep network is typically passed to subsequent processing, which misses the opportunity to exploit the useful information embedded in the other layers. In this work, we propose to expose all of these embedded signals through two types of mechanisms: deep connections and iterative routing. While deep connections allow better information and gradient flow across layers, iterative routing directly combines the layer representations into a final output via an iterative routing-by-agreement mechanism. Experimental results on both machine translation and language representation tasks demonstrate the effectiveness and universality of the proposed approaches, indicating the value of exploiting deep representations for natural language processing tasks. While each strategy boosts performance on its own, combining them yields further improvement.
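The routing-by-agreement idea can be illustrated with a minimal sketch: each layer's representation "votes" for a combined output, and coupling coefficients are iteratively sharpened toward the layers that agree with the current consensus. This is a simplified, hypothetical NumPy rendering in the style of capsule-network dynamic routing, not the paper's exact formulation; all function names and the squashing nonlinearity are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def route_layers(layers, iterations=3):
    """Combine per-layer representations into one output vector.

    layers: array of shape [L, d] -- one d-dim representation per layer.
    Returns a single [d] vector produced by routing-by-agreement
    (simplified sketch, not the paper's exact algorithm).
    """
    L, d = layers.shape
    logits = np.zeros(L)                       # routing logits, one per layer
    for _ in range(iterations):
        c = softmax(logits)                    # coupling coefficients sum to 1
        s = (c[:, None] * layers).sum(axis=0)  # weighted combination of layers
        n = np.linalg.norm(s)
        v = (n**2 / (1.0 + n**2)) * (s / (n + 1e-9))  # squash: keeps ||v|| < 1
        logits += layers @ v                   # reward layers that agree with v
    return v

# Usage: four layer outputs of width 8 are fused into one vector.
rng = np.random.default_rng(0)
layer_reps = rng.normal(size=(4, 8))
fused = route_layers(layer_reps)
```

The agreement update `logits += layers @ v` is what distinguishes routing from a plain learned weighted sum: layers whose representations align with the running consensus receive larger coupling coefficients on the next iteration.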