Episodic memory transfer for multi-task reinforcement learning
154
- 10.2200/s00737ed1v01y201610aim033
- Nov 7, 2016
- Synthesis Lectures on Artificial Intelligence and Machine Learning
143
- 10.18653/v1/d17-1106
- Jan 1, 2017
2816
- 10.1016/s0004-3702(99)00052-1
- Aug 1, 1999
- Artificial Intelligence
546
- 10.1016/j.tics.2016.05.004
- Jun 14, 2016
- Trends in Cognitive Sciences
85568
- 10.1162/neco.1997.9.8.1735
- Nov 1, 1997
- Neural computation
151
- 10.1609/aaai.v31i1.10744
- Feb 12, 2017
- Proceedings of the AAAI Conference on Artificial Intelligence
1294
- 10.1613/jair.639
- Nov 1, 2000
- Journal of Artificial Intelligence Research
1279
- 10.1038/nature20101
- Oct 12, 2016
- Nature
1846
- 10.1017/s0140525x16001837
- Nov 24, 2016
- Behavioral and Brain Sciences
53
- 10.1609/aimag.v32i1.2329
- Mar 1, 2011
- AI Magazine
- Research Article
5
- 10.3233/jifs-190209
- Dec 23, 2019
- Journal of Intelligent & Fuzzy Systems
Particle swarm optimization based multi-task parallel reinforcement learning algorithm
- Research Article
1
- 10.1109/lra.2024.3416796
- Sep 1, 2025
- IEEE Robotics and Automation Letters
Bi-Phase Episodic Memory-Guided Deep Reinforcement Learning for Robot Skills
- Conference Article
1
- 10.1109/cacre50138.2020.9230141
- Sep 1, 2020
The tandem force sensor is an important sensor in human-robot collaboration and learning from demonstration; a robot equipped with one can perceive the environment state and human intention simultaneously. The posture of a tandem force sensor changes with the robot's posture, so the sensor output cannot accurately reflect the environmental state and human intention. To compensate for the measurement error caused by attitude change, this paper proposes a numerical compensation method for the tandem force sensor based on deep learning. The experimental results show that this method can effectively eliminate the influence of attitude change on the sensor's measurements.
- Conference Article
6
- 10.1109/cog52621.2021.9619000
- Aug 17, 2021
The idea of transfer in reinforcement learning (TRL) is intriguing: being able to transfer knowledge from one problem to another problem without learning everything from scratch. This promises quicker learning and learning more complex methods. To gain an insight into the field and to detect emerging trends, we performed a database search. We note a surprisingly late adoption of deep learning that starts in 2018. The introduction of deep learning has not yet solved the greatest challenge of TRL: generalization. Transfer between different domains works well when domains have strong similarities (e.g. MountainCar to Cartpole), and most TRL publications focus on different tasks within the same domain that have few differences. Most TRL applications we encountered compare their improvements against self-defined baselines, and the field is still missing unified benchmarks. We consider this to be a disappointing situation. For the future, we note that: (1) A clear measure of task similarity is needed. (2) Generalization needs to improve. Promising approaches merge deep learning with planning via MCTS or introduce memory through LSTMs. (3) The lack of benchmarking tools will be remedied to enable meaningful comparison and measure progress. Already Alchemy and Meta-World are emerging as interesting benchmark suites. We note that another development, the increase in procedural content generation (PCG), can improve both benchmarking and generalization in TRL.
- Research Article
18
- 10.1162/neco_a_00149
- Apr 26, 2011
- Neural Computation
The multimodal self-organizing network (MMSON), an artificial neural network architecture carrying out sensory integration, is presented here. The architecture is designed using neurophysiological findings and imaging studies that pertain to sensory integration and consists of interconnected lattices of artificial neurons. In this artificial neural architecture, the degree of recognition of stimuli, that is, the perceived reliability of stimuli in the various subnetworks, is included in the computation. The MMSON's behavior is compared to aspects of brain function that deal with sensory integration. According to human behavioral studies, integration of signals from sensory receptors of different modalities enhances perception of objects and events and also reduces time to detection. In neocortex, integration takes place in bimodal and multimodal association areas and results not only in feedback-mediated enhanced unimodal perception and shortened reaction time but also in robust bimodal or multimodal percepts. Simulation data from the presented artificial neural network architecture show that it replicates these important psychological and neuroscientific characteristics of sensory integration.
- Research Article
819
- 10.1093/nsr/nwx105
- Sep 1, 2017
- National Science Review
As a promising area in machine learning, multi-task learning (MTL) aims to improve the performance of multiple related learning tasks by leveraging useful information among them. In this paper, we give an overview of MTL by first giving a definition of MTL. Then several different settings of MTL are introduced, including multi-task supervised learning, multi-task unsupervised learning, multi-task semi-supervised learning, multi-task active learning, multi-task reinforcement learning, multi-task online learning and multi-task multi-view learning. For each setting, representative MTL models are presented. In order to speed up the learning process, parallel and distributed MTL models are introduced. Many areas, including computer vision, bioinformatics, health informatics, speech, natural language processing, web applications and ubiquitous computing, use MTL to improve the performance of the applications involved and some representative works are reviewed. Finally, recent theoretical analyses for MTL are presented.
- Conference Article
6
- 10.18653/v1/w19-5119
- Jan 1, 2019
Recent developments in deep learning have prompted a surge of interest in the application of multitask and transfer learning to NLP problems. In this study, we explore for the first time, the application of transfer learning (TRL) and multitask learning (MTL) to the identification of Multiword Expressions (MWEs). For MTL, we exploit the shared syntactic information between MWE and dependency parsing models to jointly train a single model on both tasks. We specifically predict two types of labels: MWE and dependency parse. Our neural MTL architecture utilises the supervision of dependency parsing in lower layers and predicts MWE tags in upper layers. In the TRL scenario, we overcome the scarcity of data by learning a model on a larger MWE dataset and transferring the knowledge to a resource-poor setting in another language. In both scenarios, the resulting models achieved higher performance compared to standard neural approaches.
- Research Article
90
- 10.1093/gerona/gly166
- Jul 19, 2018
- The Journals of Gerontology: Series A
Polyphenols are promising nutritional bioactives exhibiting beneficial effect on age-related cognitive decline. This study evaluated the effect of a polyphenol-rich extract from grape and blueberry (PEGB) on memory of healthy elderly subjects (60-70 years-old). A bicentric, randomized, double-blind, placebo-controlled trial was conducted with 215 volunteers receiving 600 mg/day of PEGB (containing 258 mg flavonoids) or a placebo for 6 months. The primary outcome was the CANTAB Paired Associate Learning (PAL), a visuospatial learning and episodic memory test. Secondary outcomes included verbal episodic and recognition memory (VRM) and working memory (SSP). There was no significant effect of PEGB on the PAL on the whole cohort. Yet, PEGB supplementation improved VRM-free recall. Stratifying the cohort in quartiles based on PAL at baseline revealed a subgroup with advanced cognitive decline (decliners) who responded positively to the PEGB. In this group, PEGB consumption was also associated with a better VRM-delayed recognition. In addition to a lower polyphenol consumption, the urine metabolomic profile of decliners revealed that they excreted more metabolites. Urinary concentrations of specific flavan-3-ols metabolites were associated, at the end of the intervention, with the memory improvements. Our study demonstrates that PEGB improves age-related episodic memory decline in individuals with the highest cognitive impairments.
- Research Article
- 10.1523/jneurosci.0911-24.2025
- Apr 22, 2025
- The Journal of neuroscience : the official journal of the Society for Neuroscience
Patients with Parkinson's disease (PD) are impaired at incremental reward-based learning. It is typically assumed that this impairment reflects a loss of striatal dopamine. However, many open questions remain about the nature of reward-based learning deficits in PD. Recent studies have found that even simple reward-based learning tasks rely on a combination of cognitive and computational strategies, including one-shot episodic memory. These findings raise questions about how incremental learning and episodic memory contribute to decision-making in PD. We tested healthy participants (n = 26; 14 males and 12 females) and patients with PD (n = 26; 16 males and 10 females), both on- and off-dopamine replacement medication, on a task designed to differentiate between the contributions of incremental learning and episodic memory to reward-based learning and decision-making. We found that PD patients performed equally well as healthy controls when using episodic memory but were impaired at incremental reward-based learning. Dopamine replacement medication remediated this deficit and enhanced subsequent episodic memory for the value of motivationally relevant stimuli. These results demonstrate that while PD patients are impaired at learning about reward from trial-and-error, their ability to encode memories for the value of one-shot experiences is intact.
- Research Article
1
- 10.1101/2024.05.03.592414
- May 3, 2024
- bioRxiv : the preprint server for biology
Patients with Parkinson's disease are impaired at incremental reward-based learning. It is typically assumed that this impairment reflects a loss of striatal dopamine. However, many open questions remain about the nature of reward-based learning deficits in Parkinson's. Recent studies have found that a combination of different cognitive and computational strategies contribute even to simple reward-based learning tasks, suggesting a possible role for episodic memory. These findings raise critical questions about how incremental learning and episodic memory interact to support learning from past experience and what their relative contributions are to impaired decision-making in Parkinson's disease. Here we addressed these questions by asking patients with Parkinson's disease (n=26) both on and off their dopamine replacement medication and age- and education-matched healthy controls (n=26) to complete a task designed to isolate the contributions of incremental learning and episodic memory to reward-based learning and decision-making. We found that Parkinson's patients performed as well as healthy controls when using episodic memory, but were impaired at incremental reward-based learning. Dopamine replacement medication remediated this deficit while enhancing subsequent episodic memory for the value of motivationally relevant stimuli. These results demonstrate that Parkinson's patients are impaired at learning about reward from trial-and-error when episodic memory is properly controlled for, and that learning based on the value of single experiences remains intact in patients with Parkinson's disease.
- Book Chapter
- 10.1007/978-981-16-4258-6_110
- Jan 1, 2022
Identification technology is reliable only when test accuracy is high enough. To identify both the author and the digits of handwritten numbers in various applications, this paper focuses on how to improve the accuracy of identifying the authors and digits of handwritten numbers, especially author identification with deep learning. Compared with number identification, which has been widely used in many fields, author identification is more difficult: the differences between different numbers are easy to extract, but for the same number written by different authors it is hard to find distinguishing feature points. The problem to be solved is which framework best improves test accuracy. CNNs are often used in image recognition, as required in this experiment, and methods such as transfer learning, multi-task learning and multi-task transfer learning are also expected to improve the accuracy of identifying the authors and digits of handwritten numbers. Four main methods are described in detail: (1) enlarging the number of samples in the data set, (2) transfer learning, (3) multi-task learning, (4) multi-task transfer learning. Keywords: Transfer learning, Multi-task learning, CNN, Author identification of handwritten numbers
- Research Article
7
- 10.3389/fpsyg.2023.1160648
- Apr 17, 2023
- Frontiers in Psychology
Episodic memory has been studied extensively in the past few decades, but so far little is understood about how it drives future behavior. Here we propose that episodic memory can facilitate learning in two fundamentally different modes: retrieval and replay, which is the reinstatement of hippocampal activity patterns during later sleep or awake quiescence. We study their properties by comparing three learning paradigms using computational modeling based on visually-driven reinforcement learning. Firstly, episodic memories are retrieved to learn from single experiences (one-shot learning); secondly, episodic memories are replayed to facilitate learning of statistical regularities (replay learning); and, thirdly, learning occurs online as experiences arise with no access to memories of past experiences (online learning). We found that episodic memory benefits spatial learning in a broad range of conditions, but the performance difference is meaningful only when the task is sufficiently complex and the number of learning trials is limited. Furthermore, the two modes of accessing episodic memory affect spatial learning differently. One-shot learning is typically faster than replay learning, but the latter may reach a better asymptotic performance. In the end, we also investigated the benefits of sequential replay and found that replaying stochastic sequences results in faster learning as compared to random replay when the number of replays is limited. Understanding how episodic memory drives future behavior is an important step toward elucidating the nature of episodic memory.
- Research Article
38
- 10.1016/j.compbiomed.2020.104121
- Nov 21, 2020
- Computers in Biology and Medicine
Effect of deep transfer and multi-task learning on sperm abnormality detection
- Conference Article
1
- 10.1109/pimrc54779.2022.9977688
- Sep 12, 2022
Recent advances in Federated Learning (FL) have paved the way towards the design of novel strategies for solving multiple learning tasks simultaneously, by leveraging cooperation among networked devices. Multi-Task Learning (MTL) exploits relevant commonalities across tasks to improve efficiency compared with traditional transfer learning approaches. By learning multiple tasks jointly, significant reduction in terms of energy footprints can be obtained. This article provides a first look into the energy costs of MTL processes driven by the Model-Agnostic Meta-Learning (MAML) paradigm and implemented in distributed wireless networks. The paper targets a clustered multi-task network setup where autonomous agents learn different but related tasks. The MTL process is carried out in two stages: the optimization of a meta-model that can be quickly adapted to learn new tasks, and a task-specific model adaptation stage where the learned meta-model is transferred to agents and tailored for a specific task. This work analyzes the main factors that influence the MTL energy balance by considering a multi-task Reinforcement Learning (RL) setup in a robotized environment. Results show that the MAML method can reduce the energy bill by at least 2 times compared with traditional approaches without inductive transfer. Moreover, it is shown that the optimal energy balance in wireless networks depends on uplink/downlink and sidelink communication efficiencies.
- Book Chapter
19
- 10.1007/978-3-030-32254-0_48
- Jan 1, 2019
To address the data scarcity challenge in developing deep learning based medical imaging classification, a widely used strategy is to leverage other available datasets in training. Three machine learning approaches fall under this strategy, namely, transfer learning (TL), multi-task learning (MTL) and semi-supervised learning (SSL). TL and MTL bring in another labeled dataset, usually from different categories, while SSL utilizes an unlabeled dataset from the same category. Each has proven useful for medical imaging tasks. In this work, we unified these three algorithms into one framework to directly compare their individual contributions and to combine them for additional performance. For SSL, state-of-the-art consistency based methods were evaluated, including \(\Pi\)-Model and virtual adversarial training. Experiments were done on classifying gastric diseases given endoscopic images, with models trained on various amounts of data. Individually, TL yielded the largest and SSL the smallest performance gain. When used together, their contributions build up constructively, leading to further improved performance, especially with larger-capacity networks. This work helps guide the application of each of TL, MTL and SSL, or a combination of them, to other medical applications.
- Research Article
8
- 10.1016/j.cogsys.2010.08.001
- Aug 8, 2010
- Cognitive Systems Research
Learning to use episodic memory
- Research Article
31
- 10.1523/jneurosci.1785-20.2020
- Dec 17, 2020
- The Journal of Neuroscience
Recent behavioral evidence implicates reward prediction errors (RPEs) as a key factor in the acquisition of episodic memory. Yet, important neural predictions related to the role of RPEs in episodic memory acquisition remain to be tested. Humans (both sexes) performed a novel variable-choice task where we experimentally manipulated RPEs and found support for key neural predictions with fMRI. Our results show that in line with previous behavioral observations, episodic memory accuracy increases with the magnitude of signed (i.e., better/worse-than-expected) RPEs (SRPEs). Neurally, we observe that SRPEs are encoded in the ventral striatum (VS). Crucially, we demonstrate through mediation analysis that activation in the VS mediates the experimental manipulation of SRPEs on episodic memory accuracy. In particular, SRPE-based responses in the VS (during learning) predict the strength of subsequent episodic memory (during recollection). Furthermore, functional connectivity between task-relevant processing areas (i.e., face-selective areas) and hippocampus and ventral striatum increased as a function of RPE value (during learning), suggesting a central role of these areas in episodic memory formation. Our results consolidate reinforcement learning theory and striatal RPEs as key factors subtending the formation of episodic memory. SIGNIFICANCE STATEMENT: Recent behavioral research has shown that reward prediction errors (RPEs), a key concept of reinforcement learning theory, are crucial to the formation of episodic memories. In this study, we reveal the neural underpinnings of this process. Using fMRI, we show that signed RPEs (SRPEs) are encoded in the ventral striatum (VS), and crucially, that SRPE VS activity is responsible for the subsequent recollection accuracy of one-shot learned episodic memory associations.
- Research Article
- 10.2139/ssrn.3858274
- Jan 1, 2021
- SSRN Electronic Journal
The sudden emergence of epidemics, such as COVID-19, entails economic and social challenges requiring immediate attention from policy makers. An essential building block in implementing mitigation policies (e.g., lockdowns, testing, and vaccination) is the identification of potential hotspots, defined as locations that contribute significantly to the spatial diffusion of infections. During the initial stages of an epidemic, information related to the pathways of spatial diffusion of infection is not fully observable, making the detection of hotspots difficult. This work proposes a data-driven framework to identify hotspots using advanced analytical methodologies, specifically, a combination of interpretable long short-term memory (LSTM) model, multi-task learning, and transfer learning. Our methodology considers mobility within- and across-locations, which is the primary driving factor for the diffusion of infection over a network of connected locations. Additionally, to augment the signals of infection diffusion and the emergence of hotspots, we use transfer learning from past influenza transmission data, which follow a similar transmission mechanism as COVID-19. To illustrate the practical importance of our framework in deciding on lockdown policies, we compare the hotspots-based policy with a pure infection load-based policy and the state-wide lockdown policy used in practice. We show that the hotspots-based lockdown policy can achieve up to 21% improvement in reducing new infections as compared to an infection-based lockdown policy. In addition, we illustrate that locking down only top few hotspot counties can achieve almost similar performance as a state-wide lockdown policy used in practice. Finally, we demonstrate that the inclusion of transfer learning improves hotspot prediction accuracy by 53.4%. We also compare our model performance with the commonly used compartmental epidemiological model and demonstrate the superior prediction performance. 
Our paper addresses a practical problem with a hotspot identification framework that policy makers can use to improve mitigation decisions related to the control of epidemics.
- Conference Article
1
- 10.23919/ist-africa56635.2022.9845568
- May 16, 2022
Maize is an essential cereal for humans and animals worldwide, and it is one of the staple foods in Kenya. One of the main challenges facing the maize crop in Kenya is the presence of quickly spreading diseases. Early recognition of a maize disease and its pathogen helps prevent the disease from spreading throughout the field. This paper proposes a regularized Multitask learning (MTL)–Convolutional Neural Networks (CNN) model for simultaneously identifying a maize disease and its pathogen from diseased maize images. MTL allows training one model for multiple tasks at a time, which may improve the accuracy of each task by taking advantage of their commonalities. Our baseline consists of two CNN classification models, one of which overfits. We then build an MTL model based on the two models, which increases the test accuracy of the overfitting model from 60.08% to 74.48%. The results show that the accuracy rises to 77.44% when MTL is combined with early stopping, and to 85.22% when MTL is combined with early stopping and transfer learning. The model is deployed in an Android mobile application for maize farmers as end users, which is important for reducing costs and saving time.
- Research Article
12
- 10.1016/j.bica.2018.10.008
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
3
- 10.1016/j.bica.2018.10.003
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
3
- 10.1016/j.bica.2018.09.003
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
- 10.1016/j.bica.2018.09.002
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
16
- 10.1016/j.bica.2018.10.004
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
9
- 10.1016/j.bica.2018.09.001
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
- 10.1016/s2212-683x(18)30159-2
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
1
- 10.1016/j.bica.2018.10.007
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
6
- 10.1016/j.bica.2018.10.006
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures
- Research Article
- 10.1016/j.bica.2018.10.002
- Oct 1, 2018
- Biologically Inspired Cognitive Architectures