Abstract

Sequence labeling, such as part-of-speech (POS) tagging, named entity recognition (NER), text chunking, is a classic task in natural language processing. Most existing neural networks models for sequence labeling are based on recurrent neural networks. Recently, convolutional neural networks have been proposed to replace the recurrent components for sequence labeling. However, they are usually shallow compared to deep convolutional networks that achieve start-of-the-art performance in other fields. Due to the vanishing gradient problem, these models usually can not work well when simply increasing the number of layers. In this paper, we propose using deep CNN architecture in sequence labeling, which can capture a large context through stacked convolutions. To reduce the vanishing gradient problem, the proposed method incorporates gated linear units, residual connections, and dense connections. Experimental results on three sequence labeling tasks show that the proposed model can achieve competitive performance to the RNN-based state-of-the-art method while maintaining $2.41\times$ faster speed, even with up to 10 convolutional layers.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.