Decision making for lunar landing applications using AI agents and reinforcement learning

Similar Papers
  • Supplementary Content
  • 10.25394/pgs.12221960.v1
Game AI of StarCraft II based on Deep Reinforcement Learning
  • Apr 30, 2020
  • Figshare
  • Junjie Luo

The research problem of this article is the game AI agent of StarCraft II based on Deep Reinforcement Learning (DRL). StarCraft II is currently viewed as the most challenging Real-time Strategy (RTS) game, and it is also the most popular game in which researchers develop and improve AI agents. Building AI agents for StarCraft II can help machine learning researchers identify the weaknesses of DRL and improve this family of algorithms. In 2018, DeepMind and Blizzard developed the StarCraft II Learning Environment (PySC2) to enable researchers to advance the development of AI agents. After AlphaGo, DeepMind started a new DRL-based project called AlphaStar, while several laboratories also published articles about AI agents for StarCraft II. Most of them study AI agents for Terran and Zerg, two of the three races in StarCraft II. These AI agents show high-level performance compared with most StarCraft II players. However, their performance is far from defeating e-sports players, because game AI for StarCraft II has a large observation space and a large action space. Moreover, there is no publication on Protoss, the remaining race and the most complicated one for AI agents to handle (larger action space, larger observation space) due to its characteristics. Thus, the research question of this paper is whether a Protoss AI agent, developed with a DRL-based model, can defeat the high-level built-in cheating AI in a full-length game on a particular map. The population of this research design is the StarCraft II AI agents that researchers have built based on their DRL models, while the sample is the Protoss AI agent in this paper. The raw data comes from game matches between the Protoss AI agent and built-in AI agents. PySC2 can capture features and numerical variables in each match to obtain the training data.
The expected outcome is a DRL-based model that can train a Protoss AI agent to defeat high-level game AI agents, as measured by win rate. The model includes the action space of Protoss, the observation space, and the implementation of DRL algorithms. The model is built on PySC2 v2.0, which provides additional action functions. Due to the complexity and unique characteristics of Protoss in StarCraft II, the model cannot be applied to other games or platforms. However, the way the model trains a Protoss AI agent can show the limitations of DRL and push DRL algorithms a little forward.

  • Dissertation
  • 10.63227/419.772.80
Problem Solving for Industry
  • Oct 20, 2022
  • Jozimar Basilio Ferreira + 3 more

This project seeks to use reinforcement learning to develop AI agents that control NPCs in video game worlds and are capable of mastering decision tasks in their video game environments. Our job will be to develop algorithms and methods that can effectively train the AI agents using reinforcement learning, and that can be used in various gaming environments and scenarios such as racing games and first-person shooters. We then market these agents to video game developers for use in their game worlds. The developer can use our agents as-is in their game without modifications, or they can train them further, using our algorithms, to tune the AI agents with various behaviours and capabilities with minimal or no need to write the code themselves. With reinforcement learning, our AI agents will learn by trial and error, with rewards used to provide feedback to the AI. Over time the AI will master its environment, other AI agents, and possibly even interaction with the human gamer. This will produce AI-controlled NPCs that behave and interact convincingly with their environments and the player, promoting player immersion while reducing developer workload.

  • Conference Article
  • Cite Count: 3
  • 10.1145/3462244.3479932
Toddler-Guidance Learning: Impacts of Critical Period on Multimodal AI Agents
  • Oct 18, 2021
  • Junseok Park + 6 more

Critical periods are phases during which a toddler's brain develops in spurts. To promote children's cognitive development, proper guidance is critical in this stage. However, it is not clear whether such a critical period also exists for the training of AI agents. Similar to human toddlers, well-timed guidance and multimodal interactions might significantly enhance the training efficiency of AI agents as well. To validate this hypothesis, we adapt this notion of critical periods to learning in AI agents and investigate the critical period in the virtual environment for AI agents. We formalize the critical period and Toddler-guidance learning in the reinforcement learning (RL) framework. Then, we build a toddler-like environment with the VECA toolkit to mimic human toddlers' learning characteristics. We study three discrete levels of mutual interaction: weak-mentor guidance (sparse reward), moderate mentor guidance (helper-reward), and mentor demonstration (behavioral cloning). We also introduce the EAVE dataset, consisting of 30,000 real-world images, to fully reflect the toddler's viewpoint. We evaluate the impact of critical periods on AI agents from two perspectives: how and when they are guided best in both uni- and multimodal learning. Our experimental results show that both uni- and multimodal agents with moderate mentor guidance and a critical period at 1 million and 2 million training steps show a noticeable improvement. We validate these results with transfer learning on the EAVE dataset and find a performance improvement for the same critical period and guidance.

  • Research Article
  • 10.32628/cseit2390629
Walking and Survival AI Using Reinforcement Learning - Simulation
  • Mar 14, 2024
  • International Journal of Scientific Research in Computer Science Engineering and Information Technology
  • Bharate Nandan Lahudeo + 3 more

This research paper presents a novel approach to training an AI agent for walking and survival tasks using reinforcement learning (RL) techniques. The primary research question addressed in this study is how to develop an AI system capable of autonomously navigating diverse terrains and environments while ensuring survival through adaptive decision-making. To investigate this question, we employ RL algorithms, specifically deep Q-networks (DQN) and proximal policy optimization (PPO), to train an AI agent in simulated environments that mimic real-world challenges. Our methodology involves designing a virtual environment where the AI agent learns to walk and make survival-related decisions through trial and error. The agent receives rewards or penalties based on its actions, encouraging the development of strategies that optimize both locomotion and survival skills. We evaluate the performance of our approach through extensive experimentation, testing the AI agent's adaptability to various terrains, obstacles, and survival scenarios.
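The trial-and-error loop with rewards and penalties described in this abstract can be illustrated with a much simpler stand-in than the DQN and PPO agents the authors name. The sketch below uses tabular Q-learning with epsilon-greedy exploration; it is not the paper's implementation, and the function names, the two-state table, and the values of `alpha` and `gamma` are assumptions made for illustration.

```python
import random


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    n_actions = len(q[state])
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    return max(range(n_actions), key=lambda a: q[state][a])


def q_learning_update(q, state, action, reward, next_state,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the new value.

    alpha (learning rate) and gamma (discount) are illustrative defaults.
    """
    td_target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


# Hypothetical toy table: 2 states x 2 actions, all values start at zero.
q_table = [[0.0, 0.0], [0.0, 0.0]]
rng = random.Random(0)

# The agent tries action 1 in state 0, receives reward 1.0, lands in state 1.
updated = q_learning_update(q_table, state=0, action=1,
                            reward=1.0, next_state=1)
greedy_action = epsilon_greedy(q_table, state=0, epsilon=0.0, rng=rng)
```

A DQN replaces the table with a neural network that estimates the same action values, while PPO optimizes a policy directly, so this sketch only conveys the reward-driven update idea.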

  • Research Article
  • 10.13031/aea.16327
Masking Actions in Reinforcement Learning: Enhanced PPO for Optimal Harvesting
  • Jan 1, 2025
  • Applied Engineering in Agriculture
  • Jaime Álvarez Urueña + 4 more

Highlights
  • Training of an AI agent able to drive on herbaceous crop fields efficiently with no redundancy (65.2% of area covered with a clipped number of actions).
  • Creation of a framework to train AI agents to drive autonomously on herbaceous crop fields.
  • Development of a DRL policy to mask forbidden actions, easing the training phase of AI agents.

Abstract. Considering the landscape of today’s global agricultural sector, it is essential to align resource optimization and productivity with the development of long-term sustainable practices. Additionally, challenges such as labor shortages and the high costs of developing and maintaining traditional machinery arise. In a context defined by increasing competitiveness and the advancement of new automation technologies, developing deep reinforcement learning (DRL) algorithms emerges as an ideal solution to meet the sector’s demands. A review of the existing literature reveals that previous research on the subject encompasses various applications of DRL models for specific agricultural tasks and regions. This work proposes a trained AI agent capable of driving agricultural tractors autonomously on any kind of 2D field, regardless of its shape or size. A novel framework based on deep reinforcement learning has been developed to train the model. This framework incorporates a fully customizable reward-penalty layer, reinforcement learning policies, field shapes and sizes, tractor configurations, and neural network architectures. A novel DRL policy incorporating action masking to exclude forbidden actions is also proposed, accelerating convergence and enhancing the agent's learning efficiency. A comprehensive statistical test compares distinct agents trained on different policies and approaches. Selecting the best performing agent renders a mean covered area of 65.2% with a clipped number of actions (250).

Keywords: Agriculture, Autonomous driving, Coverage path planning, Deep reinforcement learning, Navigation, PPO.
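The action-masking idea mentioned in this abstract is commonly realized by setting the logits of forbidden actions to negative infinity before the softmax, so those actions receive zero probability. The sketch below shows that general technique only; it is not the authors' policy, and the function name, logits, and mask are hypothetical values chosen for illustration.

```python
import numpy as np


def masked_action_probs(logits, allowed):
    """Softmax over logits, with forbidden actions forced to probability 0."""
    logits = np.asarray(logits, dtype=float)
    allowed = np.asarray(allowed, dtype=bool)
    if not allowed.any():
        raise ValueError("at least one action must be allowed")
    # Forbidden actions get -inf, so exp(-inf) == 0 after the softmax.
    masked = np.where(allowed, logits, -np.inf)
    # Subtract the largest allowed logit for numerical stability.
    shifted = masked - masked[allowed].max()
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical example: four actions, the last two are forbidden
# even though action 3 has the highest raw logit.
probs = masked_action_probs([2.0, 1.0, 0.5, 3.0],
                            [True, True, False, False])
```

Because the masked actions can never be sampled, the agent spends no exploration steps on them, which is one plausible reason masking could speed up convergence as the abstract reports.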

  • Research Article
  • 10.1088/1674-4527/ae2d0c
AI Agent for Source Finding by SoFiA-2 for SKA-SDC2
  • Jan 15, 2026
  • Research in Astronomy and Astrophysics
  • Xingchen Zhou + 9 more

Source extraction is crucial in analyzing data from next-generation, large-scale sky surveys in radio bands, such as the Square Kilometre Array (SKA). Several source extraction programs, including SoFiA and Aegean, have been developed to address this challenge. However, finding optimal parameter configurations when applying these programs to real observations is non-trivial. For example, the outcomes of SoFiA depend strongly on several key parameters across its preconditioning, source-finding, and reliability-filtering modules. To address this issue, we propose a framework to automatically optimize these parameters using an AI agent based on a state-of-the-art reinforcement learning (RL) algorithm, i.e., Soft Actor-Critic (SAC). The SKA Science Data Challenge 2 (SDC2) dataset is utilized to assess the feasibility and reliability of this framework. The AI agent interacts with the environment by adjusting parameters based on the feedback from the SDC2 score defined by the SDC2 Team, progressively learning to select parameter sets that yield improved performance. After sufficient training, the AI agent can automatically identify an optimal parameter configuration that outperforms the benchmark set by Team SoFiA within only 100 evaluation steps and with reduced time consumption. Our approach could address similar problems requiring complex parameter tuning, beyond radio band surveys and source extraction. Yet, high-quality training sets containing representative observations and catalogs of ground truth are essential.

  • Research Article
  • 10.60087/jaigs.v6i1.398
Predictive Modeling for Autonomous Detection and Correction of AI-Agent Hallucinations Using Transformer Networks
  • Oct 7, 2024
  • Journal of Artificial Intelligence General science (JAIGS) ISSN:3006-4023
  • Jegatheeswari Perumalsamy + 1 more

Hallucinations in AI agents, instances where generated outputs deviate from factual or intended information, pose significant risks in high-stakes domains such as autonomous decision-making, medical diagnostics, and legal analysis. This research presents a predictive modeling framework for the autonomous detection and correction of AI-agent hallucinations using transformer-based architectures. The proposed method integrates multi-stage attention mechanisms, semantic consistency scoring, and contextual anomaly detection to identify hallucination patterns in real-time. A corrective submodule, trained via supervised fine-tuning and reinforcement learning from human feedback (RLHF), dynamically adjusts outputs toward verifiable ground truth without requiring human intervention. Experiments conducted on benchmark datasets across open-domain QA, dialogue systems, and multimodal reasoning tasks show a substantial reduction in hallucination rates while preserving fluency and relevance. The findings highlight the potential of transformer-driven predictive models to improve the trustworthiness and reliability of autonomous AI agents in critical applications.

  • Conference Article
  • 10.1109/iccp51029.2020.9266153
Keynote Lecture: Building Knowledge For AI Agents With Reinforcement Learning
  • Sep 3, 2020
  • Doina Precup

Reinforcement learning allows autonomous agents to learn how to act in a stochastic, unknown environment, with which they can interact. Deep reinforcement learning, in particular, has achieved great success in well-defined application domains, such as Go or chess, in which an agent has to learn how to act and there is a clear success criterion. In this talk, I will focus on the potential role of reinforcement learning as a tool for building knowledge representations in AI agents whose goal is to perform continual learning. I will examine a key concept in reinforcement learning, the value function, and discuss its generalization to support various forms of predictive knowledge. I will also discuss the role of temporally extended actions, and their associated predictive models, in learning procedural knowledge. In order to tame the possible complexity of learning knowledge representations, reinforcement learning agents can use the concepts of intents (i.e., intended consequences of courses of actions) and affordances (which capture knowledge about where actions can be applied). Finally, I will discuss the challenge of how to evaluate reinforcement learning agents whose goal is not just to control their environment, but also to build knowledge about their world.

  • Conference Article
  • 10.65109/ugvb1408
Building Knowledge for AI Agents with Reinforcement Learning
  • May 8, 2019
  • Doina Precup

Reinforcement learning allows autonomous agents to learn how to act in a stochastic, unknown environment, with which they can interact. Deep reinforcement learning, in particular, has achieved great success in well-defined application domains, such as Go or chess, in which an agent has to learn how to act and there is a clear success criterion. In this talk, I will focus on the potential role of reinforcement learning as a tool for building knowledge representations in AI agents whose goal is to perform continual learning. I will examine a key concept in reinforcement learning, the value function, and discuss its generalization to support various forms of predictive knowledge. I will also discuss the role of temporally extended actions, and their associated predictive models, in learning procedural knowledge. Finally, I will discuss the challenge of how to evaluate reinforcement learning agents whose goal is not just to control their environment, but also to build knowledge about their world.

  • Research Article
  • Cite Count: 5
  • 10.1002/alz.041034
Scalable diagnostic screening of mild cognitive impairment using AI dialogue agent
  • Dec 1, 2020
  • Alzheimer's & Dementia
  • Fengyi Tang + 4 more

Background: The search for early biomarkers of mild cognitive impairment (MCI) has been central to Alzheimer's Disease (AD) and the dementia research community in recent years. While there exist in-vivo biomarkers (e.g., beta-amyloid and tau) that can serve as indicators of pathological progression toward AD, biomarker screenings are prohibitively expensive to scale if widely used among pre-symptomatic individuals in the outpatient setting. Behavior and social markers such as language, speech, and conversational behaviors reflect cognitive changes that may precede physical changes and offer a much more cost-effective option for preclinical MCI detection, especially if they can be extracted from a non-clinical setting.
Method: We developed a prototype AI conversational agent that conducts screening conversations with participants. Specifically, this AI agent must learn to ask the right sequence of questions to distinguish the conversational characteristics of the participants with MCI from those with normal cognition. Using transcribed data obtained from recorded conversational interactions between participants and trained interviewers generated in a recently completed clinical trial, and applying supervised learning models to these data, we developed a novel reinforcement learning (RL) pipeline and a dialogue simulation environment to train an efficient dialogue agent to explore a range of semi-structured questions. We train and validate our AI dialogue agent based on transcribed data from a randomized controlled behavioral intervention study, where we use the transcribed data from 41 subjects (14 MCI, 27 NL). Each subject has an average of 35 turns of dialogue.
Result: The results show that while using only a few turns of conversation, our framework can significantly outperform state-of-the-art supervised learning approaches used in a past study. An AI agent using 30 turns of dialogue achieves over 0.853 Area Under the Receiver Operating Characteristic Curve (AUC), and 0.809 AUC with 20 turns, as compared to 0.811 AUC with the full dialogue turns.
Conclusion: Our dialogue-based AI agent presents a step toward using AI to extend clinical care beyond the classical hospital and clinical settings, where we find that AI-generated dialogues produce more predictive linguistic markers.
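For readers unfamiliar with the AUC figures quoted in this abstract, AUC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison; the function name, scores, and labels are invented for illustration and are not data from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    labels use 1 for the positive class and 0 for the negative class;
    ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for four subjects (1 = MCI, 0 = normal).
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In this toy case three of the four positive-negative pairs are ranked correctly, giving an AUC of 0.75; a value of 0.5 corresponds to chance-level ranking.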

  • Research Article
  • Cite Count: 8
  • 10.3390/a16050235
Official International Mahjong: A New Playground for AI Research
  • Apr 28, 2023
  • Algorithms
  • Yunlong Lu + 2 more

Games have long been benchmarks and testbeds for AI research. In recent years, with the development of new algorithms and the boost in computational power, many popular games played by humans have been solved by AI systems. Mahjong is one of the most popular games played in China and has spread worldwide. It presents challenges for AI research due to its multi-agent nature, rich hidden information, and complex scoring rules, but it has been somewhat overlooked in the game AI research community. In 2020 and 2022, we held two AI competitions of Official International Mahjong, the standard variant of Mahjong rules, in conjunction with a top-tier AI conference called IJCAI. We are the first to adopt the duplicate format in evaluating Mahjong AI agents to mitigate the high variance in this game. By comparing the algorithms and performance of AI agents in the competitions, we conclude that supervised learning and reinforcement learning are the current state-of-the-art methods in this game and perform much better than heuristic methods based on human knowledge. We also held a human-versus-AI competition and found that the top AI agent still could not beat professional human players. We claim that this game can be a new benchmark for AI research due to its complexity and popularity among people.

  • Conference Article
  • Cite Count: 2
  • 10.1109/icetci55171.2022.9921374
Playing football game using AI agents
  • Aug 25, 2022
  • Koyel Datta Gupta + 3 more

A lot of effort has been put into training AI agents to play games like chess and Connect-4, where they excelled and taught us new ways to approach problems. Furthermore, recent progress in this field has been accelerated by broadening horizons and taking on complex environments like Go. Football is also such a complex setting, which requires the agent to learn intricate concepts like passing, shooting, and dribbling, and to develop tactics to maximize its chances of winning. Hence we take on the game of football using AI agents. We use the Google reinforcement learning environment to train and evaluate our agents. We approach the problem of playing football by training two different agents, namely Deep Q Networks and LightGBM, where Deep Q Network is a self-learning algorithm based on reinforcement learning and LightGBM is a supervised learning algorithm whose dataset is extracted through Kaggle.

  • Book Chapter
  • Cite Count: 7
  • 10.1007/978-3-319-13560-1_6
Integrating Case-Based Reasoning with Reinforcement Learning for Real-Time Strategy Game Micromanagement
  • Jan 1, 2014
  • Stefan Wender + 1 more

This paper describes the conception of a hybrid Reinforcement Learning (RL) and Case-Based Reasoning (CBR) approach to managing combat units in strategy games. Both methods are combined into an AI agent that is evaluated by using the real-time strategy (RTS) computer game StarCraft as a test bed. The eventual aim of this approach is an AI agent that has the same actions and information at its disposal as a human player. As part of an experimental evaluation, the agent is tested in different scenarios using optimized algorithm parameters. The integration of CBR for memory management is shown to improve the speed of convergence to an optimal policy, while also enabling the agent to address a larger variety of problems when compared to simple RL. The agent manages to beat the built-in game AI and also outperforms a simple RL-only agent. An analysis of the evolution of the case-base shows how scenarios and algorithmic parameters influence agent performance and will serve as a foundation for future improvement to the hybrid CBR/RL approach.

  • Research Article
  • Cite Count: 1
  • 10.32628/cseit25112732
Autonomous AI Agents in Online Retail: The Next Leap in Programmatic Media Buying
  • Mar 28, 2025
  • International Journal of Scientific Research in Computer Science, Engineering and Information Technology
  • Ameya Gokhale

The retail industry stands at the brink of transformation driven by autonomous AI agents that will redefine shopping experiences, optimize advertising strategies, and streamline seller onboarding. AI agents will be personalized shopping assistants, intelligent advertising optimizers, and automated seller support systems, creating a seamless and highly efficient retail ecosystem. This technological evolution will personalize consumer interactions, automate advertising campaign management, and lower entry barriers for sellers, making e-commerce more accessible and profitable for all stakeholders. The article examines how AI agents will revolutionize multiple facets of retail, from AI-enhanced augmented reality shopping and group buying experiences to dynamic ad targeting and autonomous ad buying with reinforcement learning, ultimately delivering unprecedented value across the retail value chain through increased efficiency, personalization, and accessibility.

  • Research Article
  • 10.57020/ject.1757814
Human-like Competitive Video Game AI Through Reinforcement Learning
  • Dec 31, 2025
  • Journal of Emerging Computer Technologies
  • Can Çelenay + 1 more

With the rise of competitive and multiplayer video games, the demand for non-player characters that can provide meaningful training and practice experiences has increased. Developers increasingly require AI-controlled opponents that players can play against to learn the game, to practice, or to just play by themselves. These AI bots are commonly built as state machines that are manually written by programmers. Using state machines for AI players is not only labor-intensive but also often results in bots that exhibit predictable and rigid behavior, which can reduce the perception of human-like interaction. In this study, an AI agent was trained using reinforcement learning to play a two-player competitive fighting game, and its behavior was evaluated through gameplay sessions against 17 human participants with varying levels of gaming experience. The results suggest that training AI agents capable of eliciting a perception of human-like gameplay is feasible within the scope of the studied environment, and that the integration of these AI agents is possible through the use of portable technologies.
