Attractor networks are an influential theoretical framework for memory storage in brain systems. This framework has recently been challenged by the observation of strong temporal variability in neuronal recordings during memory tasks. In this work, we study a sparsely connected attractor network where memories are learned according to a Hebbian synaptic plasticity rule. After recapitulating known results for the continuous, sparsely connected Hopfield model, we investigate a model in which new memories are learned continuously and old memories are forgotten, using an online synaptic plasticity rule. We show that for a forgetting timescale that optimizes storage capacity, the qualitative features of the network's memory retrieval dynamics are age dependent: the most recent memories are retrieved as fixed-point attractors, while older memories are retrieved as chaotic attractors characterized by strong heterogeneity and temporal fluctuations. Fixed-point and chaotic attractors therefore coexist in the network phase space. The network exhibits a continuum of statistically distinguishable memory states, in which chaotic fluctuations appear abruptly above a critical age and then increase gradually until the memory disappears. We develop a dynamical mean field theory to analyze the age-dependent dynamics and compare the theory with simulations of large networks. We compute the optimal forgetting timescale, for which the number of stored memories is maximized. We find that the maximum age at which memories can be retrieved is set by an instability at which old memories destabilize and the network converges instead to a more recent one. Our numerical simulations show that a high degree of sparsity is necessary for the dynamical mean field theory to accurately predict the network capacity. To test the robustness and biological plausibility of our results, we numerically study the dynamics of a network whose learning rule and transfer function are inferred from in vivo data, in the online learning scenario. We find that all aspects of the network's dynamics characterized analytically in the simpler model also hold in this model, and that these results are highly robust to noise. Finally, our theory provides specific predictions for delayed-response tasks with aging memoranda. In particular, it predicts a higher degree of temporal fluctuations in retrieval states associated with older memories, and that these fluctuations should also be faster for older memories. Overall, our theory of attractor networks that continuously learn new information at the price of forgetting old memories can account for the observed diversity of retrieval states in the cortex, and in particular for the strong temporal fluctuations of cortical activity.

Received 3 December 2021; revised 31 August 2022; accepted 6 December 2022
DOI: https://doi.org/10.1103/PhysRevX.13.011009

Published by the American Physical Society under the terms of the Creative Commons Attribution 4.0 International license. Further distribution of this work must maintain attribution to the author(s) and the published article's title, journal citation, and DOI.

Physics Subject Headings (PhySH)
Research Areas: Learning; Memory; Neuronal networks
Physical Systems: Artificial neural networks
Techniques: Dynamical mean field theory
Interdisciplinary Physics; Networks; Statistical Physics; Nonlinear Dynamics; Biological Physics
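As a concrete illustration of the model class described in the abstract, the sketch below implements online Hebbian learning with exponential forgetting (a "palimpsest" rule) on a sparsely connected network, followed by retrieval via rate dynamics. It is illustrative only and is not the authors' exact model: the random ±1 patterns, the tanh transfer function with gain G, the per-memory decay factor lam = exp(-1/tau_f), and the parameter values N, c, P, and tau_f are all assumptions chosen for readability.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 2000            # number of neurons
c = 0.1             # connection probability (sparse random graph)
P = 100             # number of memories presented sequentially
tau_f = 20.0        # forgetting timescale, in units of presented memories
lam = np.exp(-1.0 / tau_f)   # per-memory decay of older synaptic traces
G = 2.0             # gain of the transfer function phi(x) = tanh(G x)

# Fixed sparse connectivity mask, with self-connections removed
mask = rng.random((N, N)) < c
np.fill_diagonal(mask, False)

# Online Hebbian learning with forgetting ("palimpsest"):
# after each new pattern xi,  J <- lam * J + xi xi^T / (c N)
patterns = rng.choice([-1.0, 1.0], size=(P, N))
J = np.zeros((N, N))
for xi in patterns:
    J = lam * J + np.outer(xi, xi) / (c * N)
J *= mask           # restrict couplings to the fixed sparse graph

def retrieve(cue, T=50.0, dt=0.1, init_noise=0.2):
    """Integrate the rate dynamics  dx/dt = -x + J tanh(G x)  from a noisy cue."""
    x = cue + init_noise * rng.standard_normal(N)
    for _ in range(int(T / dt)):
        x += dt * (-x + J @ np.tanh(G * x))
    return x

# Overlap of the retrieved state with the cued memory, as a function of its age
for age in (0, 10, 20, 40):
    cue = patterns[P - 1 - age]              # age 0 = most recently learned
    x = retrieve(cue)
    m = patterns @ np.tanh(G * x) / N        # overlaps with all stored patterns
    print(f"age {age:3d}: overlap with cued memory = {m[P - 1 - age]:+.3f}, "
          f"best match = memory {int(np.argmax(np.abs(m)))}")
```

In this sketch, cueing a recently learned pattern should return an overlap near one, while sufficiently old cues fail or drift toward a more recently stored memory, a rough numerical analogue of the age-dependent retrieval and destabilization instability described in the abstract. Sweeping tau_f and counting the memories whose final overlap stays high likewise gives a crude analogue of the capacity-maximizing forgetting timescale.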