Abstract

During the mid-course phase of an air-to-air missile engagement, choosing the optimal Guidance Point (GP) to maximize lock-on success and minimize intercept time is critical. Given the limited computational resources available on board and the tightly constrained maneuvering time frame, GP-selection algorithms must be efficient. We propose an approach that uses Reinforcement Learning (RL) to produce finite state controllers that can be executed efficiently, via table lookup, to meet the strict time limits of a target engagement. Instead of hand-crafting a GP-selection algorithm for every combination of sensor and aircraft configuration, we model a missile-target engagement as a Partially Observable Markov Decision Process (POMDP) and automatically generate a controller for picking the best GP by solving the POMDP model. Using a recently developed offline algorithm, Monte Carlo Value Iteration (MCVI), we constructed continuous-state POMDP models and solved them directly, without discretizing the entire state space.

Invited session "Missile Guidance Navigation & Control" (pm846)
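The execution model described above, a finite state controller run by table lookup, can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the node indices, GP action names, and observation labels are hypothetical placeholders, and a real controller produced by MCVI would have many more nodes and sensor-derived observations.

```python
# Illustrative sketch of executing a finite state controller (policy
# graph) produced offline by a POMDP solver. At runtime, each decision
# step costs only two dictionary lookups -- no planning, no belief
# update -- which is what makes the approach fit tight onboard budgets.
# All names below are hypothetical placeholders.

# Per-node action table: controller node -> action (a candidate GP).
action_of = {0: "GP_lead", 1: "GP_lag", 2: "GP_pure"}

# Node-transition table: (current node, observation) -> next node.
next_node = {
    (0, "closing"): 1, (0, "opening"): 2,
    (1, "closing"): 1, (1, "opening"): 0,
    (2, "closing"): 0, (2, "opening"): 2,
}

def run_controller(observations, start=0):
    """Replay the controller on an observation sequence via table lookup."""
    node, actions = start, []
    for obs in observations:
        actions.append(action_of[node])   # act according to current node
        node = next_node[(node, obs)]     # move along the policy graph
    return actions

print(run_controller(["closing", "opening", "closing"]))
# -> ['GP_lead', 'GP_lag', 'GP_lead']
```

Because the controller is a fixed lookup structure, its per-step cost is constant regardless of the size of the underlying continuous state space, which is the property the abstract emphasizes for meeting engagement time limits.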
