Abstract
Deep residual learning (ResNet) is a new method for training very deep neural networks using identity mapping for shortcut connections. ResNet won the ImageNet ILSVRC 2015 classification task and achieved state-of-the-art performance in many computer vision tasks. However, the effect of residual learning on noisy natural language processing tasks is still not well understood. In this paper, we design a novel convolutional neural network (CNN) with residual learning, and investigate its impact on the task of distantly supervised noisy relation extraction. Contrary to the popular belief that ResNet only works well for very deep networks, we found that even with 9 layers of CNNs, using identity mapping could significantly improve the performance of distantly supervised relation extraction.
Highlights
Relation extraction is the task of predicting attributes and relations for entities in a sentence (Zelenko et al, 2003; Bunescu and Mooney, 2005; GuoDong et al, 2005)
We investigate the effects of training deeper convolutional neural networks (CNNs) for distantly supervised relation extraction
In contrast to the popular belief in vision that deep residual networks only work for very deep CNNs, we show that even with a moderately deep CNN, there are substantial improvements over vanilla CNNs for relation extraction
Summary
Relation extraction is the task of predicting attributes and relations for entities in a sentence (Zelenko et al, 2003; Bunescu and Mooney, 2005; GuoDong et al, 2005). Among the machine learning approaches for distant supervision, the recently proposed Convolutional Neural Network (CNN) model (Zeng et al, 2014) achieved state-of-the-art performance. Following this success, Zeng et al (2015) proposed a piece-wise max-pooling strategy to improve CNNs. Various attention strategies (Lin et al, 2016; Shen and Huang, 2016) for CNNs have also been proposed, obtaining impressive results. We show empirically that our deep residual network model outperforms CNNs by a large margin, obtaining state-of-the-art performance. Our identity mapping with shortcut feedback approach is applicable to any variant of CNNs for relation extraction.
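To make the idea of identity mapping through a shortcut connection concrete, here is a minimal NumPy sketch of one residual convolution block applied to a sentence represented as a sequence of token vectors. It is an illustration under stated assumptions, not the paper's implementation: the function names (`conv1d_same`, `residual_block`, `stack_blocks`), the filter width of 3, the two-convolution block layout, and the ReLU applied after the addition are all hypothetical choices made for this example.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def conv1d_same(x, w, b):
    """1D convolution over tokens with zero padding (output length == input length).

    x: (seq_len, in_dim) token features
    w: (width, in_dim, out_dim) filter bank; width is assumed to be odd
    b: (out_dim,) bias
    """
    width = w.shape[0]
    pad = width // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))          # pad only along the token axis
    out = np.empty((x.shape[0], w.shape[2]))
    for t in range(x.shape[0]):
        window = xp[t:t + width]                  # (width, in_dim)
        out[t] = np.tensordot(window, w, axes=([0, 1], [0, 1])) + b
    return out

def residual_block(x, params):
    """Two convolutions plus an identity shortcut: y = relu(F(x) + x).

    Assumption: input and output dimensions match, so the shortcut
    needs no projection.
    """
    (w1, b1), (w2, b2) = params
    h = relu(conv1d_same(x, w1, b1))
    h = conv1d_same(h, w2, b2)
    return relu(h + x)                            # identity shortcut

def stack_blocks(x, blocks):
    """Apply several residual blocks in sequence (depth is a free choice here)."""
    for params in blocks:
        x = residual_block(x, params)
    return x
```

When the convolution weights are all zero, the block reduces to `relu(x)`, which shows why the shortcut makes it easy for an added layer pair to fall back to the identity rather than degrade the representation.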