  • New
  • Research Article
  • 10.1162/neco.a.1474
Effective Learning Rules as Natural Gradient Descent.
  • Dec 22, 2025
  • Neural computation
  • Lucas Shoji + 2 more

We establish that a broad class of effective learning rules, namely those that improve a scalar performance measure over a given time window, can be expressed as natural gradient descent with respect to an appropriately defined metric. Specifically, parameter updates in this class can always be written as the product of a symmetric positive-definite matrix and the negative gradient of a loss function encoding the task. Given this high level of generality, our findings formally support the idea that the gradient is a fundamental object underlying all learning processes. Our results are valid across a wide range of common settings, including continuous-time, discrete-time, stochastic, and higher-order learning rules, as well as loss functions with explicit time dependence. Beyond providing a unified framework for learning, our results also have practical implications for control as well as experimental neuroscience.
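
To make the core claim concrete: an update of the form Δθ = -η M ∇L(θ), with M symmetric positive-definite, decreases the loss for a small enough step size, since dL/dt = -∇L · M ∇L < 0 whenever ∇L ≠ 0. A minimal NumPy sketch of this (a toy quadratic loss and a random SPD metric of our choosing, not the paper's construction):

    import numpy as np

    rng = np.random.default_rng(0)

    def loss(theta):
        # Toy quadratic stand-in for the paper's generic performance measure.
        return 0.5 * theta @ theta

    def grad(theta):
        return theta

    theta = rng.normal(size=5)

    # Any symmetric positive-definite "metric" M yields a valid effective rule.
    A = rng.normal(size=(5, 5))
    M = A @ A.T + 1e-3 * np.eye(5)           # SPD by construction
    eta = 1.0 / np.linalg.eigvalsh(M).max()  # step small enough to guarantee descent

    for _ in range(200):
        theta -= eta * M @ grad(theta)       # update = SPD matrix times negative gradient

    print(loss(theta))  # decreases toward 0: the rule improves the performance measure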

  • New
  • Research Article
  • 10.1162/neco.a.39
Possible Principles for Aligned Structure Learning Agents.
  • Dec 22, 2025
  • Neural computation
  • Lancelot Da Costa + 7 more

This paper offers a road map for the development of scalable aligned artificial intelligence (AI) from first-principles descriptions of natural intelligence. In brief, a possible path toward scalable aligned AI rests on enabling artificial agents to learn a good model of the world that includes a good model of our preferences. For this, the main objective is creating agents that learn to represent the world and other agents' world models, a problem that falls under structure learning (also known as causal representation learning or model discovery). We expose the structure learning and alignment problems with this goal in mind, as well as principles to guide us forward, synthesizing various ideas across mathematics, statistics, and cognitive science. We discuss the essential role of core knowledge, information geometry, and model reduction in structure learning and suggest core structural modules to learn a wide range of naturalistic worlds. We then outline a way toward aligned agents through structure learning and theory of mind. As an illustrative example, we mathematically sketch Asimov's laws of robotics, which prescribe that agents act cautiously to minimize the ill-being of other agents. We supplement this example by proposing refined approaches to alignment. These observations may guide the development of artificial intelligence by helping to scale existing aligned structure learning systems or to design new ones.

  • New
  • Research Article
  • 10.1162/neco.a.1475
Neural Associative Skill Memories for Safer Robotics and Modeling Human Sensorimotor Repertoires.
  • Dec 22, 2025
  • Neural computation
  • Pranav Mahajan + 4 more

Modern robots face a challenge shared by biological systems: how to learn and adaptively express multiple sensorimotor skills. A key aspect of this is developing an internal model of expected sensorimotor experiences to detect and react to unexpected events, guiding self-preserving behaviors. Associative skill memories (ASMs) address this by linking movement primitives to sensory feedback, but existing implementations rely on hard-coded libraries of individual skills. A key unresolved problem is how a single neural network can learn a repertoire of skills while enabling integrated fault detection and context-aware execution. Here we introduce neural associative skill memories (neural ASMs), a framework that uses self-supervised temporal predictive coding to integrate skill learning and expression using biologically plausible local learning rules. Unlike traditional ASMs, which require explicit skill selection, neural ASMs implicitly recognize and express skills through contextual inference, enabling fault detection using "predictive surprise" across the entire learned repertoire. Compared to recurrent neural networks trained using backpropagation through time, our model achieves comparable qualitative performance in skill memory expression while using local learning rules and predicts a biologically relevant speed-versus-accuracy trade-off. By integrating fault detection, reactive control, and skill expression into a single energy-based architecture, neural ASMs contribute to safer, self-preserving robotics and provide a computational lens to study biological sensorimotor learning.
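
The mechanism at the heart of this abstract, prediction error driving both inference and purely local weight updates, with the error magnitude doubling as a fault signal, can be sketched in a few lines. This is a generic predictive-coding toy (our notation and dimensions throughout), not the authors' neural ASM architecture:

    import numpy as np

    rng = np.random.default_rng(1)
    d_z, d_s = 8, 16                       # latent and sensory dimensions
    W = 0.1 * rng.normal(size=(d_s, d_z))  # generative weights: latent -> sensation

    def step(z, s, lr_z=0.2, lr_w=0.01):
        """One predictive-coding step; every quantity used is locally available."""
        err = s - W @ z                     # sensory prediction error
        z = z + lr_z * (W.T @ err)          # inference: descend error w.r.t. z
        dW = lr_w * np.outer(err, z)        # Hebbian-like local weight update
        surprise = float(err @ err)         # "predictive surprise" for fault detection
        return z, dW, surprise

    # Learn one sensory trajectory (one "skill"), then monitor surprise.
    traj = [np.sin(np.linspace(0, 2 * np.pi, d_s) + 0.3 * t) for t in range(50)]
    z = np.zeros(d_z)
    for s in traj:
        for _ in range(20):                 # settle the latent on this frame...
            z, dW, _ = step(z, s)
        W += dW                             # ...then commit one local weight update

    # An unexpected (faulty) observation yields much higher predictive surprise.
    _, _, s_ok = step(z, traj[-1])
    _, _, s_fault = step(z, traj[-1] + rng.normal(scale=1.0, size=d_s))
    print(s_ok < s_fault)                   # expected: True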

  • Research Article
  • 10.1162/neco.a.1482
Sum-of-Norms Regularized Nonnegative Matrix Factorization.
  • Dec 10, 2025
  • Neural computation
  • Andersen Ang + 2 more

When applying nonnegative matrix factorization (NMF), the rank parameter is generally unknown. This rank, called the nonnegative rank, is usually estimated heuristically since computing its exact value is NP-hard. In this work, we propose an approximation method to estimate the rank on the fly while solving NMF. We use the sum-of-norms (SON) penalty, a group-lasso structure that encourages pairwise similarity, to reduce the rank of a factor matrix when the initial rank is overestimated. On various data sets, SON-NMF can reveal the correct nonnegative rank of the data without prior knowledge or parameter tuning. SON-NMF is a nonconvex, nonsmooth, nonseparable, and nonproximable problem, making it nontrivial to solve. First, since rank estimation in NMF is NP-hard, the proposed approach does not benefit from lower computational complexity: using a graph-theoretic argument, we prove that the complexity of SON-NMF is essentially irreducible. Second, the per-iteration cost of algorithms for SON-NMF can be high. This motivates us to propose a first-order BCD algorithm that approximately solves SON-NMF with low per-iteration cost via the proximal average operator. SON-NMF exhibits favorable features for applications. Besides the ability to automatically estimate the rank from data, SON-NMF can handle rank-deficient data matrices and detect weak components with little energy. Furthermore, in hyperspectral imaging, SON-NMF naturally addresses the issue of spectral variability.
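
The SON penalty itself is simple to write down: for a factor W with columns w_1, ..., w_r, add λ Σ_{i<j} ||w_i - w_j||_2 to the NMF objective, so redundant columns are pulled together and the number of distinct columns becomes the rank estimate. A naive projected-subgradient toy (ours; the paper instead proposes a first-order BCD scheme using the proximal average operator):

    import numpy as np

    rng = np.random.default_rng(2)

    def son_subgrad(W, eps=1e-9):
        """Subgradient of sum_{i<j} ||w_i - w_j||_2 with respect to W's columns."""
        G = np.zeros_like(W)
        r = W.shape[1]
        for i in range(r):
            for j in range(r):
                if i != j:
                    d = W[:, i] - W[:, j]
                    G[:, i] += d / (np.linalg.norm(d) + eps)
        return G

    # Rank-2 nonnegative data, factorized with an overestimated rank of 5.
    X = rng.random((30, 2)) @ rng.random((2, 40))
    r = 5
    W, H = rng.random((30, r)), rng.random((r, 40))
    lam, eta = 0.5, 1e-3

    for _ in range(3000):                    # projected (sub)gradient descent
        R = W @ H - X
        W = np.clip(W - eta * (R @ H.T + lam * son_subgrad(W)), 0.0, None)
        H = np.clip(H - eta * (W.T @ R), 0.0, None)

    # Redundant columns of W are pulled onto each other while needed ones stay
    # apart; on this toy problem several pairwise distances should shrink toward 0.
    dists = sorted(np.linalg.norm(W[:, i] - W[:, j])
                   for i in range(r) for j in range(i + 1, r))
    print(np.round(dists, 3))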

  • Research Article
  • 10.1162/neco.a.1480
Simulated Complex Cells Contribute to Object Recognition Through Representational Untangling.
  • Dec 10, 2025
  • Neural computation
  • Mitchell B Slapik + 1 more

The visual system performs a remarkable feat: it takes complex retinal activation patterns and decodes them for object recognition. This operation, termed "representational untangling," organizes neural representations by clustering similar objects together while separating different categories of objects. While representational untangling is usually associated with higher-order visual areas like the inferior temporal cortex, it remains unclear how the early visual system contributes to this process, whether through highly selective neurons or high-dimensional population codes. This article investigates how a computational model of early vision contributes to representational untangling. Using a computational visual hierarchy and two different data sets consisting of numerals and objects, we demonstrate that simulated complex cells significantly contribute to representational untangling for object recognition. Our findings challenge prior theories by showing that untangling does not depend on skewed, sparse, or high-dimensional representations. Instead, simulated complex cells reformat visual information into a low-dimensional, yet more separable, neural code, striking a balance between representational untangling and computational efficiency.
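
The standard model of a complex cell, and the source of the invariance at play here, is the energy model: square and sum the outputs of two simple-cell (Gabor) filters in quadrature, giving a response selective for orientation and frequency but invariant to stimulus phase. A generic NumPy illustration (the energy model only, not the authors' full hierarchy):

    import numpy as np

    def gabor(size, theta, phase, freq=0.2):
        """2D Gabor filter, the standard simple-cell receptive-field model."""
        y, x = np.mgrid[-size // 2:size // 2, -size // 2:size // 2]
        xr = x * np.cos(theta) + y * np.sin(theta)
        env = np.exp(-(x**2 + y**2) / (2 * (size / 4) ** 2))
        return env * np.cos(2 * np.pi * freq * xr + phase)

    def complex_cell(img, theta):
        """Energy model: squared quadrature pair -> phase-invariant response."""
        q0 = np.sum(img * gabor(img.shape[0], theta, 0.0))
        q1 = np.sum(img * gabor(img.shape[0], theta, np.pi / 2))
        return q0**2 + q1**2

    # Shifting a matched grating flips a simple cell's response through positive
    # and negative values, but the energy response stays approximately constant.
    size = 32
    yy, xx = np.mgrid[0:size, 0:size]
    for shift in (0, 1, 2, 3):
        grating = np.cos(2 * np.pi * 0.2 * (xx + shift))
        print(f"{complex_cell(grating, theta=0.0):.4g}")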

  • Research Article
  • 10.1162/neco.a.1481
Approximation Rates in Fréchet Metrics: Barron Spaces, Paley-Wiener Spaces, and Fourier Multipliers.
  • Dec 10, 2025
  • Neural computation
  • Ahmed Abdeljawad + 1 more

Operator learning is a recent development in the simulation of partial differential equations by means of neural networks. The idea behind this approach is to learn the behavior of an operator, such that the resulting neural network is an approximate mapping in infinite-dimensional spaces that is capable of (approximately) simulating the solution operator governed by the partial differential equation. In our work, we study some general approximation capabilities for linear differential operators by approximating the corresponding symbol in the Fourier domain. Analogous to the structure of the class of Hörmander symbols, we consider the approximation with respect to a topology that is induced by a sequence of seminorms. In that sense, we measure the approximation error in terms of a Fréchet metric, and our main result identifies sufficient conditions for achieving a predefined approximation error. We then focus on a natural extension of our main theorem, in which we reduce the assumptions on the sequence of seminorms. Based on existing approximation results for the exponential spectral Barron space, we then present a concrete example of symbols that can be approximated well.
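
For orientation, the two standard objects this abstract relies on can be written out (textbook definitions, not notation taken from the paper). A Fourier multiplier operator T_m with symbol m acts as

    (T_m f)(x) = \mathcal{F}^{-1}\bigl[\, m(\xi)\,(\mathcal{F} f)(\xi) \,\bigr](x),

and a countable family of seminorms (p_k)_{k \ge 0} induces the Fréchet metric

    d(m_1, m_2) = \sum_{k=0}^{\infty} 2^{-k}\, \frac{p_k(m_1 - m_2)}{1 + p_k(m_1 - m_2)},

in which approximation rates of this kind are measured.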

  • Research Article
  • 10.1162/neco.a.32
Estimating Phase From Observed Trajectories Using the Temporal 1-Form.
  • Nov 18, 2025
  • Neural computation
  • Simon Wilshin + 3 more

Oscillators are ubiquitous in nature and are usually associated with the existence of an asymptotic phase that governs the long-term dynamics of the oscillator. We show that the asymptotic phase can be estimated using a carefully chosen series expansion that directly computes the phase response curve (PRC), and we provide an algorithm for estimating the coefficients of this series. Unlike previously available data-driven phase estimation methods, our algorithm can use observations that are much shorter than a cycle; has proven convergence rate bounds as a function of the properties of measurement noise and system noise; will recover phase within any forward-invariant region for which sufficient data are available; recovers the PRCs that govern weak oscillator coupling; and recovers isochron curvature and other nonlinear features of isochron geometry. Our method may find application wherever models of oscillator dynamics need to be constructed from measured or simulated time series.
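
For contrast, one widely used family of data-driven phase estimators can be sketched quickly: fit a complex observable g whose one-step evolution is a rotation, g(x_{t+1}) ≈ e^{iωΔt} g(x_t), and read phase off as arg g. This is a Koopman/DMD-style baseline, explicitly not the authors' temporal 1-form method, and it needs data spanning full cycles:

    import numpy as np

    rng = np.random.default_rng(4)

    # Simulate a noisy planar limit-cycle oscillator (Stuart-Landau-like), for
    # which the true asymptotic phase is simply the polar angle.
    dt, omega, n = 0.05, 1.0, 4000
    X = np.zeros((n, 2)); X[0] = (1.2, 0.0)
    for t in range(n - 1):
        x, y = X[t]
        r2 = x * x + y * y
        drift = np.array([(1 - r2) * x - omega * y, (1 - r2) * y + omega * x])
        X[t + 1] = X[t] + dt * drift + 0.02 * rng.normal(size=2)

    def feats(X):  # polynomial dictionary up to degree 3 (a modeling choice)
        x, y = X[:, 0], X[:, 1]
        return np.stack([x, y, x*y, x**2, y**2, x**3, y**3, x**2*y, x*y**2], axis=1)

    # One-step linear predictor in feature space; its eigenvector with eigenvalue
    # closest to exp(i*omega*dt) is an approximate Koopman phase eigenfunction.
    A, B = feats(X[:-1]), feats(X[1:])
    K = np.linalg.lstsq(A, B, rcond=None)[0]
    evals, evecs = np.linalg.eig(K)
    k = np.argmin(np.abs(evals - np.exp(1j * omega * dt)))
    phase = np.angle(feats(X) @ evecs[:, k])

    # Compare to the true phase up to a constant rotation offset.
    true = np.arctan2(X[:, 1], X[:, 0])
    offset = np.angle(np.mean(np.exp(1j * (phase - true))))
    err = np.angle(np.exp(1j * (phase - true - offset)))
    print(np.abs(err).mean())  # small mean error on this toy system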

  • Research Article
  • 10.1162/neco.a.31
Boosting MCTS With Free Energy Minimization.
  • Nov 18, 2025
  • Neural computation
  • Mawaba Pascal Dao + 1 more

Active inference, grounded in the free energy principle, provides a powerful lens for understanding how agents balance exploration and goal-directed behavior in uncertain environments. Here, we propose a new planning framework that integrates Monte Carlo tree search (MCTS) with active inference objectives to systematically reduce epistemic uncertainty while pursuing extrinsic rewards. Our key insight is that MCTS, already renowned for its search efficiency, can be naturally extended to incorporate free energy minimization by blending expected rewards with information gain. Concretely, the cross-entropy method (CEM) is used to optimize action proposals at the root node, while tree expansions leverage reward modeling alongside intrinsic exploration bonuses. This synergy allows our planner to maintain coherent estimates of value and uncertainty throughout planning, without sacrificing computational tractability. Empirically, we benchmark our planner on a diverse set of continuous control tasks, where it demonstrates performance gains over both stand-alone CEM and MCTS with random rollouts.
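
The blend described above, extrinsic reward plus an epistemic bonus inside the tree policy, has a compact structural sketch. The names and weightings below are ours, and this omits the paper's CEM root optimization and learned models; it only shows where information gain enters node selection:

    import math

    class Node:
        def __init__(self):
            self.children = {}  # action -> Node
            self.n = 0          # visit count
            self.q = 0.0        # running mean of extrinsic reward
            self.ig = 0.0       # running mean of estimated information gain

        def score(self, child, beta=0.5, c_uct=1.4):
            if child.n == 0:
                return float("inf")                   # expand unvisited actions first
            bonus = c_uct * math.sqrt(math.log(self.n) / child.n)
            return child.q + beta * child.ig + bonus  # free-energy-style value blend

    def select(node):
        """Descend the tree, always taking the highest-scoring child."""
        path = [node]
        while node.children:
            _, a = max((node.score(c), a) for a, c in node.children.items())
            node = node.children[a]
            path.append(node)
        return path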

  • Research Article
  • 10.1162/neco.a.33
Fusing Foveal Fixations Using Linear Retinal Transformations and Bayesian Experimental Design.
  • Nov 18, 2025
  • Neural computation
  • Christopher K I Williams

Humans (and many vertebrates) face the problem of fusing together multiple fixations of a scene in order to obtain a representation of the whole, where each fixation uses a high-resolution fovea and decreasing resolution in the periphery. In this letter, we explicitly represent the retinal transformation of a fixation as a linear downsampling of a high-resolution latent image of the scene, exploiting the known geometry. This linear transformation allows us to carry out exact inference for the latent variables in factor analysis (FA) and mixtures of FA models of the scene. This also allows us to formulate and solve the choice of where to look next as a Bayesian experimental design problem using the expected information gain criterion. Experiments on the Frey faces and MNIST data sets demonstrate the effectiveness of our models.
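
Because the retinal transformation is linear and the latent model is Gaussian, the expected information gain of a candidate fixation has a closed form: for y = A_f x + ε with x ~ N(μ, Σ) and noise variance σ², EIG(f) = ½ log det(I + A_f Σ A_fᵀ / σ²). A small sketch of greedy fixation selection under that standard formula (the pooling matrix below is a toy 1D stand-in of our own, not the paper's retinal geometry):

    import numpy as np

    d_lat, noise_var = 50, 0.1
    Sigma = np.eye(d_lat)  # current covariance over the latent high-resolution image

    def retinal_matrix(center, d_obs=12):
        """Toy 1D 'retina': each sample averages a window of latent pixels,
        pooling more widely with distance from the fovea (our stand-in)."""
        A = np.zeros((d_obs, d_lat))
        for i in range(d_obs):
            width = 1 + abs(i - d_obs // 2)  # coarser in the periphery
            lo = max(0, min(d_lat - 1, center + 2 * (i - d_obs // 2)))
            A[i, lo:min(d_lat, lo + width)] = 1.0 / width
        return A

    def expected_info_gain(A, Sigma, s2):
        """EIG for y = A x + eps: 0.5 * log det(I + A Sigma A^T / s2)."""
        M = np.eye(A.shape[0]) + A @ Sigma @ A.T / s2
        return 0.5 * np.linalg.slogdet(M)[1]

    # Greedy "where to look next": pick the fixation with the highest EIG.
    candidates = range(0, d_lat, 5)
    best = max(candidates,
               key=lambda c: expected_info_gain(retinal_matrix(c), Sigma, noise_var))
    print(best)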

  • Research Article
  • 10.1162/neco.a.36
Working Memory and Self-Directed Inner Speech Enhance Multitask Generalization in Active Inference.
  • Oct 29, 2025
  • Neural computation
  • Jeffrey Frederic Queißer + 1 more

This simulation study shows how a set of working memory tasks can be acquired simultaneously through interaction between a stacked recurrent neural network (RNN) and multiple working memories. In these tasks, temporal patterns are provided, followed by linguistically specified task goals. Training is performed in a supervised manner by minimizing the free energy, and goal-directed tasks are performed using the active inference (AIF) framework. Our simulation results show that the best task performance is obtained when two working memory modules are used instead of one or none and when self-directed inner speech is incorporated during task execution. Detailed analysis indicates that a temporal hierarchy develops in the stacked RNN module under these optimal conditions. We argue that the model's capacity for generalization across novel task configurations is supported by the structured interplay between working memory and the generation of self-directed language outputs during task execution. This interplay promotes internal representations that reflect task structure, which in turn support generalization by enabling a functional separation between content encoding and control dynamics within the memory architecture.
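
As a purely architectural sketch, the wiring this abstract describes, a stacked (fast/slow) RNN in which each level reads from and writes to its own working-memory store, can be laid out in a few lines. This toy forward pass is ours; it includes no free-energy training, active inference, or inner-speech pathway:

    import numpy as np

    rng = np.random.default_rng(6)
    d_in, d_h, d_mem = 8, 32, 16

    def layer(name):
        return {w: 0.1 * rng.normal(size=s) for w, s in {
            "Wx": (d_h, d_in if name == "fast" else d_h),
            "Wh": (d_h, d_h), "Wm": (d_h, d_mem),  # each layer reads its memory
            "Wwrite": (d_mem, d_h)}.items()}

    fast, slow = layer("fast"), layer("slow")  # stacked RNN: fast layer feeds slow
    mem = [np.zeros(d_mem), np.zeros(d_mem)]   # two working-memory modules

    def step(x, h_fast, h_slow, alpha=0.1):
        """One timestep: each level updates its state and gate-writes to its memory."""
        h_fast = np.tanh(fast["Wx"] @ x + fast["Wh"] @ h_fast + fast["Wm"] @ mem[0])
        h_slow = np.tanh(slow["Wx"] @ h_fast + slow["Wh"] @ h_slow + slow["Wm"] @ mem[1])
        mem[0] = (1 - alpha) * mem[0] + alpha * (fast["Wwrite"] @ h_fast)
        mem[1] = (1 - alpha) * mem[1] + alpha * (slow["Wwrite"] @ h_slow)
        return h_fast, h_slow

    h_f, h_s = np.zeros(d_h), np.zeros(d_h)
    for t in range(20):                 # drive with a toy input sequence
        h_f, h_s = step(rng.normal(size=d_in), h_f, h_s)
    print(np.round(mem[1][:4], 3))      # slow memory now carries sequence context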