Abstract
In cognitive radios, wideband sequential sensing plays an important role: by adaptively allocating sensing resources, it can quickly identify temporarily available transmission opportunities. This paper proposes a Markov decision process for modelling the optimal control of sequential sensing, which provides a general formulation that captures various practical features, including sampling cost, sensing requirements, and sensing budget. To solve for the optimal sensing policy, a model‐augmented deep reinforcement learning algorithm is proposed, which offers higher learning stability and efficiency than conventional reinforcement learning algorithms.