An Efficient Transformer Decoder with Compressed Sub-layers

Yanyang Li, Tong Xiao, Jingbo Zhu, Ye Lin

doi:10.1609/aaai.v35i15.17572

Open Access

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: May 18, 2021
Citations: 13

Affiliation: Northeastern University

Abstract

The large attention-based encoder-decoder network (Transformer) has recently become prevalent because of its effectiveness, but the high computational complexity of its decoder makes it inefficient. By examining the mathematical formulation of the decoder, we show that, under some mild conditions, the architecture can be simplified by compressing its sub-layers, the basic building blocks of the Transformer, which also yields higher parallelism. We therefore propose the Compressed Attention Network, whose decoder layer consists of only one sub-layer instead of three. Extensive experiments on 14 WMT machine translation tasks show that our model is 1.42x faster than a strong baseline, with performance on par. This strong baseline is itself already 2x faster than the widely used standard baseline, with no loss in performance.
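
To make the contrast between three sub-layers and one more concrete, the following is a minimal sketch in plain Python. It is not the paper's formulation, which the abstract does not spell out. It assumes a single attention head, no learned query/key/value projections, and no layer normalization. The function names (`standard_layer`, `compressed_layer`) and the particular way of merging, one attention over the concatenated decoder and encoder states plus a feed-forward network that reads the same input, are illustrative assumptions only.

```python
import math
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(*ms):
    """Element-wise sum of equally shaped matrices."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*ms)]

def attention(q, k, v, visible):
    """Scaled dot-product attention; visible(i, j) says whether query i may see key j."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for i, qi in enumerate(q):
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) if visible(i, j) else -math.inf
                  for j, kj in enumerate(k)]
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]  # masked keys get weight 0.0
        total = sum(weights)
        out.append([sum(w * vj[c] for w, vj in zip(weights, v)) / total
                    for c in range(len(v[0]))])
    return out

def ffn(x, w1, w2):
    """Position-wise feed-forward network with a ReLU in the middle."""
    hidden = [[max(0.0, h) for h in row] for row in matmul(x, w1)]
    return matmul(hidden, w2)

def standard_layer(x, enc, w1, w2):
    """Three sub-layers applied one after another; each waits for the previous one."""
    h = add(x, attention(x, x, x, lambda i, j: j <= i))    # masked self-attention
    h = add(h, attention(h, enc, enc, lambda i, j: True))  # cross-attention
    return add(h, ffn(h, w1, w2))                          # feed-forward

def compressed_layer(x, enc, w1, w2):
    """Hypothetical single sub-layer: every term reads x directly, so none waits on another."""
    n = len(x)
    memory = x + enc  # decoder rows followed by encoder rows
    # Decoder keys are masked causally; encoder keys are always visible.
    merged = attention(x, memory, memory, lambda i, j: j <= i or j >= n)
    return add(x, merged, ffn(x, w1, w2))

random.seed(0)
d, d_ff = 4, 8
rand = lambda r, c: [[random.gauss(0, 1) for _ in range(c)] for _ in range(r)]
x, enc = rand(3, d), rand(5, d)          # 3 target positions, 5 source positions
w1, w2 = rand(d, d_ff), rand(d_ff, d)
y_std = standard_layer(x, enc, w1, w2)
y_cmp = compressed_layer(x, enc, w1, w2)
print(len(y_cmp), len(y_cmp[0]))  # 3 4
```

In `standard_layer` each step consumes the output of the previous one, so the three computations are strictly sequential. In the sketched `compressed_layer` the attention and the feed-forward term both depend only on `x` and `enc`, which is the kind of structure that allows them to be evaluated in parallel. The two functions are not numerically equivalent; the sketch shows only the difference in dependency structure.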
