Abstract
Neural networks need the right representations of input data to learn. Here we ask how gradient-based learning shapes a fundamental property of representations in recurrent neural networks (RNNs)—their dimensionality. Through simulations and mathematical analysis, we show how gradient descent can lead RNNs to compress the dimensionality of their representations in a way that matches task demands during training while supporting generalization to unseen examples. This can require an expansion of dimensionality in early timesteps and compression in later ones, and strongly chaotic RNNs appear particularly adept at learning this balance. Beyond helping to elucidate the power of appropriately initialized artificial RNNs, this finding has implications for neurobiology as well. Neural circuits in the brain exhibit both the high variability associated with chaos and low-dimensional dynamical structures. Taken together, our findings show how simple gradient-based learning rules lead neural networks to solve tasks with robust representations that generalize to new cases.
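To make the two central quantities concrete—effective dimensionality of hidden-state representations over timesteps, and the strongly chaotic initialization regime—the sketch below measures the participation ratio, one common measure of effective dimensionality, at each timestep of a randomly initialized vanilla tanh RNN. The abstract itself does not specify these implementation details; the network form, the gain g > 1.0 (which places a random Gaussian recurrent network in the chaotic regime), and all sizes are illustrative assumptions, not the authors' exact setup.

```python
import numpy as np

def participation_ratio(X):
    """Effective dimensionality of X (trials x features):
    PR = (sum of covariance eigenvalues)^2 / (sum of squared eigenvalues)."""
    X = X - X.mean(axis=0, keepdims=True)
    cov = X.T @ X / X.shape[0]
    eig = np.linalg.eigvalsh(cov)
    eig = np.clip(eig, 0.0, None)  # guard against tiny negative round-off
    return eig.sum() ** 2 / (eig ** 2).sum()

rng = np.random.default_rng(0)
N, T, trials = 200, 50, 500        # hidden units, timesteps, trials (assumed)
g = 1.5                            # gain > 1: "strongly chaotic" regime

# Recurrent weights ~ N(0, g^2 / N), the standard random-RNN initialization.
W = rng.normal(0.0, g / np.sqrt(N), size=(N, N))
h = rng.normal(0.0, 1.0, size=(trials, N))  # random initial state per trial

dims = []
for t in range(T):
    h = np.tanh(h @ W.T)           # one step of autonomous tanh dynamics
    dims.append(participation_ratio(h))

print("effective dimensionality per timestep:", np.round(dims, 1))
```

Tracking this quantity before and after gradient-based training is one way to observe the expansion-then-compression profile the abstract describes: a task-trained network would be expected to raise the participation ratio at early timesteps and lower it at later ones relative to the untrained chaotic baseline.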