Abstract

The axiomatic fully probabilistic design (FPD) of decision strategies strictly extends Bayesian decision making (DM) theory. FPD also models the closed decision loop by a joint probability density (pd) of all inspected random variables, referred to as the behaviour. Unlike usual DM, FPD expresses DM aims via an ideal pd of behaviours. Its optimal strategy minimises the Kullback–Leibler divergence (KLD) of the joint, strategy-dependent pd of behaviours to its ideal twin. A range of FPD results has confirmed its theoretical and practical strength. Curiously, no guide exists on how to select a specific ideal pd for an estimator design. The paper offers one. It advocates the use of the closed-loop state notion and generalises dynamic programming so that FPD is its special case. Primarily, it provides an explorative optimised feedback that “naturally” diminishes exploration (gained in learning) as the learning progresses.
