Abstract

Network congestion control is an important means of improving network throughput and reducing data transmission delay. To further improve data transmission performance, this research proposes an intelligent TCP congestion control method based on proximal policy optimization (PPO). The TCP congestion control mechanism is abstracted as a partially observable Markov decision process, and an agent is created that interacts with the real-time network environment. Changes in the real-time state of the network are fed back to the agent, which issues actions that adjust the size of the congestion window. Each action produces a new network state, and the agent immediately receives a feedback reward value. The agent's goal is to maximize this reward, which drives it toward the best available actions. The state space of network characteristics is designed so that the agent can observe enough information to make appropriate decisions. The reward function is a weighted combination of terms that lets the agent balance throughput against latency. The agent's model parameters are updated by the proximal policy optimization algorithm, whose clipping (truncation) function limits how far each update can move the policy, reducing the possibility of oscillation during gradient descent and helping the training process converge quickly. Compared with the traditional CUBIC control method, the results show that the proposed TCP-PPO2 policy reduces latency by 11.7–87.5%.
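
The two ingredients named above, a weighted reward and PPO's clipping, can be illustrated with a minimal Python sketch. The abstract does not give the actual reward formula, weights, or hyperparameters, so everything here is an assumption made for illustration: the function names, the normalization constants, the equal weights, and the clip range `epsilon=0.2`. The clipped surrogate follows the standard PPO form, applied to a single sample.

```python
def weighted_reward(throughput_mbps, rtt_ms,
                    w_throughput=0.5, w_latency=0.5,
                    max_throughput_mbps=100.0, max_rtt_ms=500.0):
    """Hypothetical reward: reward throughput, penalize delay.

    The weights and normalization constants are illustrative
    assumptions, not values taken from the paper.
    """
    # Normalize both measurements to [0, 1] so the weights are comparable.
    throughput_term = min(throughput_mbps / max_throughput_mbps, 1.0)
    delay_term = min(rtt_ms / max_rtt_ms, 1.0)
    return w_throughput * throughput_term - w_latency * delay_term


def clipped_surrogate(ratio, advantage, epsilon=0.2):
    """Standard PPO clipped surrogate objective for one sample.

    ratio is pi_new(a|s) / pi_old(a|s); advantage is the estimated
    advantage of the action. Clipping the ratio to
    [1 - epsilon, 1 + epsilon] removes the incentive to move the
    policy far from the old one in a single update.
    """
    clipped_ratio = max(1.0 - epsilon, min(ratio, 1.0 + epsilon))
    # Take the more pessimistic of the unclipped and clipped terms.
    return min(ratio * advantage, clipped_ratio * advantage)
```

With these assumed defaults, a probability ratio of 1.5 and a positive advantage contribute no more than a ratio of 1.2 would, which is the mechanism that damps oscillation during training.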
