Abstract

Language models must capture long-term dependencies in word sequences, which recurrent neural network (RNN) based language models with long short-term memory (LSTM) units achieve to a reasonable extent. Accurately modeling the sophisticated long-term information in human languages requires language models with large memory. However, the size of an RNN-based language model cannot be increased arbitrarily: because of the structure of these models, the required computational resources and the model complexity increase along with it. To overcome this problem, inspired by the Neural Turing Machine and Memory Networks, we equip RNN-based language models with controllable external memory. With a learnable memory controller, the size of the external memory is independent of the number of model parameters, so the proposed language model can have larger memory without additional parameters. In experiments, the proposed model yielded lower perplexities than RNN-based language models with LSTM units on both English and Chinese corpora.
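The claim that memory size is decoupled from parameter count can be illustrated with a minimal sketch. The abstract does not specify the controller, so the code assumes a simple content-based soft read in the spirit of the Neural Turing Machine and Memory Networks. The class `MemoryReader`, the key projection `w_key`, and all dimensions are hypothetical names chosen for illustration, not the proposed model.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class MemoryReader:
    """Hypothetical content-based reader; an assumption, not the paper's controller."""

    def __init__(self, hidden_dim, slot_dim, seed=0):
        rng = random.Random(seed)
        # The only learnable parameters: a projection from hidden state to a key.
        # Its shape depends on hidden_dim and slot_dim, never on the slot count.
        self.w_key = [[rng.uniform(-0.1, 0.1) for _ in range(hidden_dim)]
                      for _ in range(slot_dim)]

    def num_parameters(self):
        return sum(len(row) for row in self.w_key)

    def read(self, hidden, memory):
        # Project the hidden state into a key as wide as one memory slot.
        key = [sum(w * h for w, h in zip(row, hidden)) for row in self.w_key]
        # Score each slot by dot product with the key, then normalize.
        weights = softmax([sum(k * m for k, m in zip(key, slot)) for slot in memory])
        # Return the attention-weighted average of the slots.
        return [sum(a * slot[j] for a, slot in zip(weights, memory))
                for j in range(len(key))]


reader = MemoryReader(hidden_dim=4, slot_dim=3)
hidden = [0.5, -0.2, 0.1, 0.3]
small_memory = [[0.1 * (i + j) for j in range(3)] for i in range(8)]
large_memory = [[0.1 * ((i + j) % 7) for j in range(3)] for i in range(512)]

read_small = reader.read(hidden, small_memory)   # 8 slots
read_large = reader.read(hidden, large_memory)   # 512 slots, same reader
params = reader.num_parameters()
```

In this sketch the same reader, with the same 12 parameters, serves both the 8-slot and the 512-slot memory. Only the cost of one read grows with the number of slots.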
