Abstract

In the Internet of Things (IoT), where the environment consists of various devices, the traffic condition changes dynamically. Failure to process packets in compliance with the QoS requirements can significantly degrade the reliability and quality of the system. The gateway collecting the data therefore needs to quickly establish a new scheduling policy as the traffic condition changes. Traditional packet scheduling schemes are not effective for IoT because the data transmission pattern is not known in advance. Q-learning is a type of reinforcement learning that can establish a dynamic scheduling policy without any prior knowledge of the network condition. In this paper a novel Q-learning scheme is proposed which updates the Q-table and the reward table based on the condition of the queues in the gateway. The processing time is further reduced by omitting unnecessary computation steps when selecting the action in the iterative Q-learning operations. Computer simulation reveals that the proposed scheme significantly increases the number of packets satisfying the delay requirement while decreasing the processing time compared to an existing scheme based on Q-learning with a stochastic learning automaton.
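The core idea of a Q-learning packet scheduler can be sketched as follows. This is an illustrative sketch only, since the abstract does not specify the state, action, or reward design: here the state is assumed to be a tuple of discretized queue occupancy levels, an action selects which queue the gateway serves next, and the reward is positive when the served packet meets its delay bound. The paper's actual queue-condition-based Q-table and reward-table updates may differ.

```python
import random

# Hyperparameters (illustrative values, not from the paper):
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration
NUM_QUEUES = 3

q_table = {}  # (state, action) -> Q-value, created lazily with default 0.0


def q(state, action):
    return q_table.get((state, action), 0.0)


def choose_action(state):
    """Epsilon-greedy selection of the queue to serve next."""
    if random.random() < EPSILON:
        return random.randrange(NUM_QUEUES)
    return max(range(NUM_QUEUES), key=lambda a: q(state, a))


def update(state, action, reward, next_state):
    """Standard Q-learning update: Q <- Q + alpha*(r + gamma*max_a' Q' - Q)."""
    best_next = max(q(next_state, a) for a in range(NUM_QUEUES))
    q_table[(state, action)] = q(state, action) + ALPHA * (
        reward + GAMMA * best_next - q(state, action)
    )


# Toy interaction loop: serve the queue the policy picks and reward the
# agent when the (simulated) packet met its deadline. The transition model
# below is purely synthetic, standing in for real gateway traffic.
random.seed(0)
state = (2, 0, 1)  # discretized occupancy level of each queue
for _ in range(100):
    action = choose_action(state)
    # Assumed reward signal: serving the longest queue meets the deadline.
    met_deadline = action == max(range(NUM_QUEUES), key=lambda i: state[i])
    reward = 1.0 if met_deadline else -1.0
    next_state = tuple(
        max(0, s - (1 if i == action else 0)) + random.randint(0, 1)
        for i, s in enumerate(state)
    )
    update(state, action, reward, next_state)
    state = next_state
```

The table-based formulation keeps per-decision work small, which matters for the gateway's processing-time constraint; skipping redundant computation in `choose_action` (as the paper proposes) would further reduce the cost of each scheduling decision.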
