Universal Learning Waveform Selection Strategies for Adaptive Target Tracking

Charles E Thornton, Harpreet S Dhillon, R Michael Buehrer, Anthony F Martone

doi:10.1109/taes.2022.3181554

Open Access

https://doi.org/10.1109/taes.2022.3181554

Abstract

Online selection of optimal waveforms for target tracking with active sensors has long been a problem of interest. Many conventional solutions utilize an estimation-theoretic interpretation, in which a waveform-specific Cramér–Rao lower bound on measurement error is used to select the optimal waveform for each tracking step. However, this approach is only valid in the high SNR regime, and requires a rather restrictive set of assumptions regarding the target motion and measurement models. Furthermore, due to computational concerns, many traditional approaches are limited to near-term, or myopic, optimization, even though radar scenes exhibit strong temporal correlation. More recently, reinforcement learning has been proposed for waveform selection, in which the problem is framed as a Markov decision process, allowing for long-term planning. However, a major limitation of reinforcement learning is that the memory length of the underlying Markov process is often unknown for realistic target and channel dynamics, and a more general framework is desirable. This work develops a universal sequential waveform selection scheme which asymptotically achieves Bellman optimality in any radar scene which can be modeled as a $U$th-order Markov process for a finite, but unknown, integer $U$. Our approach is based on well-established tools from the field of universal source coding, where a stationary source is parsed into variable-length phrases in order to build a context-tree, which is used as a probabilistic model for the scene’s behavior. We show that an algorithm based on a multialphabet version of the context-tree weighting method can be used to optimally solve a broad class of waveform-agile tracking problems while making minimal assumptions about the environment’s behavior.
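
As a rough illustration of what parsing a source into variable-length phrases to build a tree can look like, the sketch below uses LZ78-style incremental parsing. This is a simpler stand-in chosen for brevity; it is not the multialphabet context-tree weighting algorithm the abstract refers to. The names `Node`, `build_phrase_tree`, and `next_symbol_probs`, the integer symbol alphabet (imagined here as quantized scene states), and the additive smoothing constant `alpha` are all assumptions made for this example.

```python
class Node:
    """One tree node: children keyed by symbol, plus a visit count."""
    def __init__(self):
        self.children = {}
        self.count = 0


def build_phrase_tree(symbols):
    """LZ78-style parsing: each phrase is the shortest string not yet in the tree.

    Returns the tree root and the list of parsed phrases (as tuples).
    """
    root = Node()
    phrases = []
    node, phrase = root, []
    for s in symbols:
        phrase.append(s)
        if s in node.children:
            # Known continuation: descend and record the visit.
            node = node.children[s]
            node.count += 1
        else:
            # New continuation: add a leaf, close the phrase, restart at the root.
            child = Node()
            child.count = 1
            node.children[s] = child
            phrases.append(tuple(phrase))
            node, phrase = root, []
    return root, phrases


def next_symbol_probs(node, alphabet, alpha=0.5):
    """Smoothed next-symbol estimate from child visit counts at `node`.

    `alpha` is an assumed additive-smoothing constant, not a value from the paper.
    """
    total = sum(c.count for c in node.children.values())
    denom = total + alpha * len(alphabet)
    return {
        a: ((node.children[a].count if a in node.children else 0) + alpha) / denom
        for a in alphabet
    }


# Hypothetical binary sequence of quantized scene states.
root, phrases = build_phrase_tree([0, 1, 0, 0, 1, 1, 0, 1])
probs = next_symbol_probs(root, alphabet=[0, 1])
```

In this toy run the sequence is split into five phrases of growing length, and the counts stored at each node give a conditional estimate of the next symbol given the current phrase prefix. A selection rule built on such a model would score candidate waveforms against these predictions, a step this sketch leaves out.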

Full Text