  • New
  • Research Article
  • 10.1162/neco.a.36
Working Memory and Self-Directed Inner Speech Enhance Multitask Generalization in Active Inference.
  • Oct 29, 2025
  • Neural computation
  • Jeffrey Frederic Queißer + 1 more

This simulation study shows how a set of working memory tasks can be acquired simultaneously through interaction between a stacked recurrent neural network (RNN) and multiple working memories. In these tasks, temporal patterns are provided, followed by linguistically specified task goals. Training is performed in a supervised manner by minimizing the free energy, and goal-directed tasks are performed using the active inference (AIF) framework. Our simulation results show that the best task performance is obtained when two working memory modules are used instead of one or none and when self-directed inner speech is incorporated during task execution. Detailed analysis indicates that a temporal hierarchy develops in the stacked RNN module under these optimal conditions. We argue that the model's capacity for generalization across novel task configurations is supported by the structured interplay between working memory and the generation of self-directed language outputs during task execution. This interplay promotes internal representations that reflect task structure, which in turn support generalization by enabling a functional separation between content encoding and control dynamics within the memory architecture.
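
The free-energy minimization invoked above can be made concrete with a toy example. Below is a minimal Python sketch of inference by minimizing a free energy that reduces to precision-weighted prediction errors on a one-dimensional generative model; the observation mapping, precisions, and step size are illustrative assumptions, and the sketch is not the paper's stacked-RNN architecture or its working-memory modules.

import numpy as np

# Toy generative model: observation o = g(z) + noise, prior z ~ N(mu_prior, 1/pi_prior).
# Under Gaussian assumptions the variational free energy reduces to a sum of
# precision-weighted prediction errors; minimizing it over z performs inference.
g = lambda z: np.tanh(z)                     # illustrative observation mapping (assumed)
mu_prior, pi_prior, pi_obs = 0.0, 1.0, 4.0   # prior mean and precisions (assumed)
o = 0.8                                      # an observed value

def free_energy(z):
    return 0.5 * pi_obs * (o - g(z)) ** 2 + 0.5 * pi_prior * (z - mu_prior) ** 2

def dF_dz(z):
    return -pi_obs * (o - g(z)) * (1 - np.tanh(z) ** 2) + pi_prior * (z - mu_prior)

z = 0.0
for _ in range(200):                         # gradient descent on the free energy
    z -= 0.05 * dF_dz(z)
print(f"inferred z = {z:.3f}, free energy = {free_energy(z):.4f}")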

  • New
  • Research Article
  • 10.1162/neco.a.37
Model Predictive Control on the Neural Manifold.
  • Oct 29, 2025
  • Neural computation
  • Christof Fehrman + 1 more

Neural manifolds are an attractive theoretical framework for characterizing the complex behaviors of neural populations. However, many of the tools for identifying these low-dimensional subspaces are correlational and provide limited insight into the underlying dynamics. The ability to precisely control the latent activity of a circuit would allow researchers to investigate the structure and function of neural manifolds. We simulate controlling the latent dynamics of a neural population using closed-loop, dynamically generated sensory inputs. Using a spiking neural network (SNN) as a model of a neural circuit, we find low-dimensional representations of both the network activity (the neural manifold) and a set of salient visual stimuli. The fields of classical and optimal control offer a range of methods to choose from for controlling dynamics on the neural manifold, which differ in performance, computational cost, and ease of implementation. Here, we focus on two commonly used control methods: proportional-integral-derivative (PID) control and model predictive control (MPC). PID is a computationally lightweight controller that is simple to implement. In contrast, MPC is a model-based, anticipatory controller with a much higher computational cost and engineering overhead. We evaluate both methods on trajectory-following tasks in latent space, under partial observability and in the presence of unknown noise. While both controllers in some cases were able to successfully control the latent dynamics on the neural manifold, MPC consistently produced more accurate control and required less hyperparameter tuning. These results demonstrate how MPC can be applied on the neural manifold using data-driven dynamics models and provide a framework to experimentally test for causal relationships between manifold dynamics and external stimuli.
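
As a point of reference for the comparison above, here is a minimal sketch of the PID side: a discrete PID controller steering a toy two-dimensional linear latent system along a circular reference trajectory. The dynamics, gains, and reference are assumptions for illustration, not the paper's spiking network or its data-driven models. An MPC controller would instead re-solve a short-horizon optimization over future inputs at every step using a model of the latent dynamics, which is where its higher accuracy and computational cost come from.

import numpy as np

# Illustrative 2-D latent dynamics x[t+1] = A x[t] + u[t] (assumed linear here),
# steered along a circular reference by a discrete PID controller.
A = np.array([[0.95, 0.05], [-0.05, 0.95]])
Kp, Ki, Kd = 1.0, 0.05, 0.2                                    # hand-chosen toy gains
ref = lambda t: np.array([np.cos(0.05 * t), np.sin(0.05 * t)])  # reference trajectory

x = np.zeros(2)
integral = np.zeros(2)
prev_err = ref(0) - x
errors = []
for t in range(400):
    err = ref(t) - x
    integral += err
    u = Kp * err + Ki * integral + Kd * (err - prev_err)       # PID law in latent space
    x = A @ x + u
    prev_err = err
    errors.append(np.linalg.norm(ref(t + 1) - x))
print(f"mean tracking error over the last 100 steps: {np.mean(errors[-100:]):.3f}")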

  • New
  • Research Article
  • 10.1162/neco.a.38
Unsupervised Learning in Echo State Networks for Input Reconstruction.
  • Oct 29, 2025
  • Neural computation
  • Taiki Yamada + 2 more

Echo state networks (ESNs) are a class of recurrent neural networks in which only the readout layer is trainable, while the recurrent and input layers are fixed. This architectural constraint enables computationally efficient processing of time-series data. Traditionally, the readout layer in ESNs is trained using supervised learning with target outputs. In this study, we focus on input reconstruction (IR), where the readout layer is trained to reconstruct the input time series fed into the ESN. We show that IR can be achieved through unsupervised learning (UL), without access to supervised targets, provided that the ESN parameters are known a priori and satisfy invertibility conditions. This formulation allows applications relying on IR, such as dynamical system replication and noise filtering, to be reformulated within the UL framework via straightforward integration with existing algorithms. Our results suggest that prior knowledge of ESN parameters can reduce reliance on supervision, thereby establishing a new principle in which computation benefits not only from fixing part of the network parameters but also from exploiting their specific values. Furthermore, our UL-based algorithms for input reconstruction and related tasks are suitable for autonomous processing, offering insights into how analogous computational mechanisms might, in principle, operate in the brain. These findings contribute to a deeper understanding of the mathematical foundations of ESNs and their relevance to models in computational neuroscience.
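
For concreteness, here is a minimal ESN on the input-reconstruction task, with the readout fitted by ridge regression against the known input (the supervised baseline the abstract contrasts with). Reservoir size, scalings, and the test signal are assumptions; the unsupervised route described in the paper, which exploits knowledge of the reservoir and input weights instead of supervision, is not implemented here.

import numpy as np

# Minimal echo state network for the input-reconstruction (IR) task.
# Only the readout W_out is trained; here it is fit by ridge regression against
# the known input. Sizes and scalings are assumed for illustration.
rng = np.random.default_rng(0)
N, T = 200, 2000
W = rng.normal(0, 1, (N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))        # rescale to spectral radius 0.9
W_in = rng.uniform(-0.5, 0.5, (N, 1))

u = np.sin(0.1 * np.arange(T))[:, None]                # scalar input time series
X = np.zeros((T, N))
x = np.zeros(N)
for t in range(T):
    x = np.tanh(W @ x + W_in @ u[t])                   # reservoir update
    X[t] = x

lam = 1e-6                                             # ridge regularization
W_out = np.linalg.solve(X.T @ X + lam * np.eye(N), X.T @ u)
u_hat = X @ W_out                                      # reconstructed input
print(f"reconstruction RMSE: {np.sqrt(np.mean((u - u_hat) ** 2)):.4f}")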

  • Research Article
  • 10.1162/neco.a.27
Feature Normalization Prevents Collapse of Noncontrastive Learning Dynamics.
  • Oct 10, 2025
  • Neural computation
  • Han Bao

Contrastive learning is a self-supervised representation learning framework in which two positive views generated through data augmentation are made similar by an attraction force in a data representation space, while a repulsive force pushes them away from negative examples. Noncontrastive learning, represented by BYOL and SimSiam, gets rid of negative examples and improves computational efficiency. Although the learned representations might, at first sight, be expected to collapse into a single point due to the lack of a repulsive force, Tian et al. (2021) revealed through a learning dynamics analysis that the representations can avoid collapse if data augmentation is sufficiently stronger than regularization. However, their analysis does not take into account the commonly used feature normalization, a normalizer applied before measuring the similarity of representations, and hence excessively strong regularization may still collapse the dynamics, an unnatural behavior in the presence of feature normalization. Therefore, we extend the previous theory based on the L2 loss by considering the cosine loss instead, which involves feature normalization. We show that the cosine loss induces sixth-order dynamics (while the L2 loss induces a third-order one), in which a stable equilibrium dynamically emerges even if only collapsed solutions exist for the given initial parameters. Thus, we offer a new understanding that feature normalization plays an important role in robustly preventing collapse of the dynamics.
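
The two losses being contrasted can be written out directly. The numpy sketch below compares the L2 loss with the cosine loss (feature normalization before measuring similarity) for a BYOL/SimSiam-style predictor output and stop-gradient target; the vectors are placeholders rather than trained representations.

import numpy as np

# Losses for a predictor output p (online branch) and a target z (stop-gradient branch).
def l2_loss(p, z):
    return np.sum((p - z) ** 2)

def cosine_loss(p, z):
    # feature normalization before measuring similarity
    p_n = p / np.linalg.norm(p)
    z_n = z / np.linalg.norm(z)
    return 1.0 - np.dot(p_n, z_n)

p = np.array([0.1, 0.02, -0.05])     # a nearly collapsed (small-norm) representation
z = np.array([0.12, 0.018, -0.04])
print(l2_loss(p, z), cosine_loss(p, z))
# Under the L2 loss, shrinking p and z toward zero also shrinks the loss, so collapse is
# cheap; under the cosine loss only the direction matters, which is the normalization
# effect the learning-dynamics analysis above builds on.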

  • Research Article
  • 10.1162/neco.a.35
Modeling Higher-Order Interactions in Sparse and Heavy-Tailed Neural Population Activity.
  • Oct 10, 2025
  • Neural computation
  • Ulises Rodríguez-Domínguez + 1 more

Neurons process sensory stimuli efficiently, showing sparse yet highly variable ensemble spiking activity involving structured higher-order interactions. Notably, while neural populations are mostly silent, they occasionally exhibit highly synchronous activity, resulting in sparse and heavy-tailed spike-count distributions. However, the mechanistic origin of these patterns, specifically the types of nonlinear properties in individual neurons that induce such population-level activity, remains unclear. In this study, we derive sufficient conditions under which the joint activity of homogeneous binary neurons generates sparse and widespread population firing rate distributions in infinitely large networks. We then propose a subclass of exponential family distributions that satisfies this condition. This class incorporates structured higher-order interactions with alternating signs and shrinking magnitudes, along with a base-measure function that offsets distributional concentration, giving rise to parameter-dependent sparsity and heavy-tailed population firing rate distributions. Analysis of recurrent neural networks that recapitulate these distributions reveals that individual neurons possess a threshold-like nonlinearity followed by supralinear activation, which jointly facilitate sparse and synchronous population activity. These nonlinear features resemble those in modern Hopfield networks, suggesting a connection between widespread population activity and the network's memory capacity. The theory establishes sparse and heavy-tailed distributions for binary patterns, forming a foundation for developing energy-efficient spike-based learning machines.
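
A small sketch of the bookkeeping such exchangeable models involve: because the neurons are homogeneous, the probability of a spike pattern depends only on its population count k through the interaction terms, so the distribution of k can be computed directly. The interaction strengths below are arbitrary illustrative values with alternating signs and shrinking magnitudes, not the paper's parameterization, so the snippet only shows the mechanics rather than reproducing the sparse, heavy-tailed regime characterized in the paper.

import numpy as np
from math import comb

# Homogeneous (exchangeable) binary population of N neurons: with r-th order interaction
# strengths theta_r, the log-probability (up to a constant) of any single pattern with
# k active neurons is sum_r theta_r * C(k, r). The thetas below are assumed values.
N = 100
theta = {1: -4.0, 2: 0.08, 3: -4e-4}       # alternating signs, shrinking magnitudes

def log_pattern_prob(k):
    return sum(t * comb(k, r) for r, t in theta.items())

# Population-count distribution: C(N, k) patterns share each log-probability.
log_w = np.array([np.log(comb(N, k)) + log_pattern_prob(k) for k in range(N + 1)])
P = np.exp(log_w - log_w.max())
P /= P.sum()
print("P(k=0) =", round(P[0], 3), " E[k] =", round(float(P @ np.arange(N + 1)), 2))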

  • Research Article
  • 10.1162/neco.a.34
A Chimera Model for Motion Anticipation in the Retina and the Primary Visual Cortex.
  • Oct 10, 2025
  • Neural computation
  • Jérôme Emonet + 5 more

We propose a mean field model of the primary visual cortex (V1), connected to a realistic retina model, to study the impact of the retina on motion anticipation. We first consider the case where the retina does not itself provide anticipation, so that anticipation is triggered only by a cortical mechanism ("anticipation by latency"), and unravel the effects of the retinal input amplitude, of stimulus features such as speed and contrast, and of the size of cortical extensions and fiber conduction speed. Then we explore the changes in the cortical wave of anticipation when V1 is triggered by retina-driven anticipatory mechanisms: gain control and lateral inhibition by amacrine cells. Here, we show how retinal and cortical anticipation combine to provide efficient processing in which the simulated cortical response runs ahead of the moving object that triggers it, compensating for the delays in visual processing.

  • Research Article
  • 10.1162/neco.a.30
Encoding of Numerosity With Robustness to Object and Scene Identity in Biologically Inspired Object Recognition Networks.
  • Oct 10, 2025
  • Neural computation
  • Thomas Chapalain + 2 more

Number sense, the ability to rapidly estimate object quantities in a visual scene without precise counting, is a crucial cognitive capacity found in humans and many other animals. Recent studies have identified artificial neurons tuned to numbers of items in biologically inspired vision models, even before training, and proposed these artificial neural networks as candidate models for the emergence of number sense in the brain. But real-world numerosity perception requires abstraction from the properties of individual objects and their contexts, unlike the simplified dot patterns used in previous studies. Using novel synthetically generated photorealistic stimuli, we show that deep convolutional neural networks optimized for object recognition encode information on approximate numerosity across diverse objects and scene types, which could be linearly read out from distributed activity patterns of later convolutional layers of different network architectures tested. In contrast, untrained networks with random weights failed to represent numerosity with abstractness to other visual properties and instead captured mainly low-level visual features. Our findings emphasize the importance of using complex, naturalistic stimuli to investigate mechanisms of number sense in both biological and artificial systems, and they suggest that the capacity of untrained networks to account for early-life numerical abilities should be reassessed. They further point to a possible, so far underappreciated, contribution of the brain's ventral visual pathway to representing numerosity with abstractness to other high-level visual properties.
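
The linear read-out step described above can be sketched as follows. Randomly generated stand-in features replace actual network activations and photorealistic stimuli so that the snippet runs on its own; the feature construction is purely an assumption for illustration.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Linear read-out of numerosity from distributed layer activations. Real activations
# would be pooled feature maps from a later conv layer of a pretrained recognition
# network; the synthetic features below are a stand-in so the example is self-contained.
rng = np.random.default_rng(0)
n_images, n_features = 500, 256
numerosity = rng.integers(1, 21, n_images)                 # 1-20 objects per scene
signal = np.outer(np.log(numerosity), rng.normal(size=n_features))
features = signal + rng.normal(scale=2.0, size=(n_images, n_features))

X_tr, X_te, y_tr, y_te = train_test_split(features, numerosity, random_state=0)
reader = Ridge(alpha=1.0).fit(X_tr, np.log(y_tr))          # decode log numerosity
r = np.corrcoef(reader.predict(X_te), np.log(y_te))[0, 1]
print(f"correlation between decoded and true log-numerosity: {r:.2f}")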

  • Research Article
  • 10.1162/neco.a.28
Firing Rate Models as Associative Memory: Synaptic Design for Robust Retrieval.
  • Sep 22, 2025
  • Neural computation
  • Simone Betteti + 3 more

Firing rate models are dynamical systems widely used in applied and theoretical neuroscience to describe local cortical dynamics in neuronal populations. By providing a macroscopic perspective of neuronal activity, these models are essential for investigating oscillatory phenomena, chaotic behavior, and associative memory processes. Despite their widespread use, the application of firing rate models to associative memory networks has received limited mathematical exploration, and most existing studies are focused on specific models. Conversely, well-established associative memory designs, such as Hopfield networks, lack key biologically relevant features intrinsic to firing rate models, including positivity and interpretable synaptic matrices reflecting the action of long-term potentiation and long-term depression. To address this gap, we propose a general framework that ensures the emergence of rescaled memory patterns as stable equilibria in the firing rate dynamics. Furthermore, we analyze the conditions under which the memories are locally and globally asymptotically stable, providing insights into constructing biologically plausible and robust systems for associative memory retrieval.
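
As a minimal illustration of memories arising as stable equilibria of positive firing-rate dynamics, the sketch below uses a generic Hebbian-style synaptic matrix and checks retrieval from a corrupted cue. It does not reproduce the paper's synaptic design or its local and global stability conditions.

import numpy as np

# Minimal positive firing-rate associative memory: rates r stay in [0, 1] and evolve as
# tau * dr/dt = -r + f(W (2r - 1)), with a Hebbian-style matrix W built from stored
# binary patterns. A generic construction for illustration only.
rng = np.random.default_rng(1)
N, P = 200, 5
patterns = rng.integers(0, 2, (P, N)).astype(float)        # binary memories
sigma = 2 * patterns - 1                                    # +/-1 version for the weights
W = sigma.T @ sigma / N

f = lambda v: 1.0 / (1.0 + np.exp(-3.0 * v))                # positive, saturating rate function

r = patterns[0].copy()
flip = rng.choice(N, 40, replace=False)
r[flip] = 1.0 - r[flip]                                     # corrupted cue (20% of units flipped)

dt, tau = 0.1, 1.0
for _ in range(300):
    r += (dt / tau) * (-r + f(W @ (2 * r - 1)))
print(f"correlation with stored memory: {np.corrcoef(r, patterns[0])[0, 1]:.2f}")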

  • Research Article
  • 10.1162/neco.a.29
Transformer Models for Signal Processing: Scaled Dot-Product Attention Implements Constrained Filtering.
  • Sep 22, 2025
  • Neural computation
  • Terence D Sanger

The remarkable success of the transformer machine learning architecture for processing language sequences far exceeds the performance of classical signal processing methods. A unique component of transformer models is the scaled dot-product attention (SDPA) layer, which does not appear to have an analog in prior signal processing algorithms. Here, we show that SDPA operates using a novel principle that projects the current state estimate onto the space spanned by prior estimates. We show that SDPA, when used for causal recursive state estimation, implements constrained state estimation in circumstances where the constraint is unknown and may be time varying. Since constraints in high-dimensional space may represent the complex relationships that define nonlinear signals and models, this suggests that the SDPA layer and transformer models leverage constrained estimation to achieve their success. This also suggests that transformers and the SDPA layer could be a computational model for previously unexplained capabilities of human behavior.
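
For reference, the SDPA operation itself, with the causal mask relevant to recursive estimation, is short enough to write out in full; the constrained-filtering interpretation is the paper's contribution and is not re-derived here. Dimensions and inputs below are placeholders.

import numpy as np

# Scaled dot-product attention with a causal mask: each position attends only to itself
# and earlier positions, the setting relevant to causal recursive state estimation.
def causal_sdpa(Q, K, V):
    T, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)                       # (T, T) query-key similarities
    scores = np.where(np.tril(np.ones((T, T))) == 1, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)       # row-wise softmax
    return weights @ V                                  # convex combination of current
                                                        # and prior value vectors

rng = np.random.default_rng(0)
T, d = 6, 4
Q, K, V = rng.normal(size=(3, T, d))
out = causal_sdpa(Q, K, V)
print(out.shape)                                        # (6, 4)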

  • Research Article
  • 10.1162/neco.a.26
Rapid Reweighting of Sensory Inputs and Predictions in Visual Perception.
  • Sep 22, 2025
  • Neural computation
  • William Turner + 3 more

A striking perceptual phenomenon has recently been described wherein people report seeing abrupt jumps in the location of a smoothly moving object ("position resets"). Here, we show that this phenomenon can be understood within the framework of recursive Bayesian estimation as arising from transient gain changes, temporarily prioritizing sensory input over predictive beliefs. From this perspective, position resets reveal a capacity for rapid adaptive precision weighting in human visual perception and offer a possible test bed within which to study the timing and flexibility of sensory gain control.
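
The mechanism described above can be caricatured with a one-dimensional Kalman filter in which the measurement precision is transiently boosted for a few frames, pulling the position estimate abruptly toward the raw sensory input. All parameter values are illustrative assumptions.

import numpy as np

# 1-D recursive Bayesian (Kalman) position tracking with a transient sensory-gain boost:
# for a few frames the measurement noise is treated as much smaller, so the posterior
# jumps toward the raw input, a toy analog of a "position reset".
rng = np.random.default_rng(0)
T, dt = 120, 1.0
true_pos = 0.5 * np.arange(T)                              # smoothly moving target
obs = true_pos + rng.normal(scale=2.0, size=T)             # noisy sensory input

A = np.array([[1.0, dt], [0.0, 1.0]])                      # constant-velocity model
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)                                       # process noise
x, Pcov = np.zeros(2), np.eye(2)
estimates = []
for t in range(T):
    R = 0.05 if 60 <= t < 65 else 4.0                      # transient precision boost
    x = A @ x                                              # predict
    Pcov = A @ Pcov @ A.T + Q
    S = H @ Pcov @ H.T + R
    Kg = Pcov @ H.T / S                                    # gain rises when R is small
    innov = obs[t] - (H @ x)[0]                            # prediction error
    x = x + Kg.ravel() * innov                             # update
    Pcov = (np.eye(2) - Kg @ H) @ Pcov
    estimates.append(x[0])
print(f"estimate step size when the gain boost begins (t=60): {abs(estimates[60] - estimates[59]):.2f}")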