In this paper, we propose a deep reinforcement learning based approach to the learning-to-rank task. Reinforcement learning has been applied to ranking with some success, but existing policy gradient approaches suffer from noisy gradients and high variance, resulting in unstable learning. Classic policy gradient methods such as REINFORCE estimate the gradient via Monte Carlo sampling, drawing trajectories at random, which leads to high-variance estimates. Moreover, as the action space grows large, i.e., with a very large number of candidate documents, traditional RL techniques lack the model capacity required to deal with so many items. Our approach addresses both issues. By combining deep learning with the reinforcement learning framework, it can learn a complex ranking function, since deep neural networks are powerful function approximators. We adopt an actor-critic framework in which the critic reduces variance through techniques such as delayed policy updates and clipped double Q-learning. Furthermore, because of the enormous scale of the web, the most relevant results must be returned for each query from within a very large action space; policy gradient algorithms with deep neural networks have been applied effectively to such large action spaces (items), as they do not require estimating a value for every action (item), unlike value-based methods. In the ranking process, our actor network uses a CNN layer to capture sequential patterns among the documents. We train the agent with the TD3 algorithm and a listwise loss function; TD3's delayed policy updates yield value estimates with lower variance. To the best of our knowledge, this is the first deep reinforcement learning method applied to learning to rank for document retrieval. We performed experiments on several LETOR datasets and show that our method outperforms various state-of-the-art baselines.
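To make the CNN-based actor concrete, the following is a minimal sketch of how a convolutional layer can score a candidate list so that each document's score depends on its neighbors. All names, dimensions, and hyperparameters (`CNNActor`, `feature_dim=46`, `kernel_size=3`, etc.) are illustrative assumptions, not the paper's exact architecture.

```python
# Hypothetical sketch of a CNN-based actor that scores a candidate document list.
# Names and sizes are assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn

class CNNActor(nn.Module):
    def __init__(self, feature_dim=46, hidden=64, kernel_size=3):
        super().__init__()
        # Conv1d slides over the document axis, so each document's score can
        # depend on its neighbors, capturing sequential patterns in the list.
        self.conv = nn.Conv1d(feature_dim, hidden, kernel_size,
                              padding=kernel_size // 2)
        self.score = nn.Linear(hidden, 1)

    def forward(self, docs):
        # docs: (batch, n_docs, feature_dim) LETOR-style feature vectors.
        h = torch.relu(self.conv(docs.transpose(1, 2)))   # (batch, hidden, n_docs)
        return self.score(h.transpose(1, 2)).squeeze(-1)  # (batch, n_docs) scores

actor = CNNActor()
docs = torch.randn(2, 10, 46)  # two queries, ten candidate documents each
ranking = actor(docs).argsort(dim=-1, descending=True)  # permutation = ranked list
```

Sorting the per-document scores yields the output ranking, which is what a listwise loss would then evaluate against the relevance labels.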
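The sketch below likewise illustrates the critic-side variance-reduction machinery in a generic TD3-style update: twin critics trained against the minimum of their target estimates (clipped double Q-learning), target policy smoothing, and an actor refreshed only every few critic steps (delayed policy updates). This is a standard TD3 skeleton under assumed names and hyperparameters, not the paper's implementation.

```python
# Minimal TD3-style update sketch (PyTorch). Class and variable names are
# hypothetical; the paper's architecture and hyperparameters may differ.
import copy
import torch
import torch.nn as nn

class MLP(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                 nn.Linear(256, out_dim))
    def forward(self, x):
        return self.net(x)

state_dim, action_dim = 46, 46            # e.g., LETOR feature dimensionality
actor = MLP(state_dim, action_dim)
critic1 = MLP(state_dim + action_dim, 1)  # twin critics for clipped double Q
critic2 = MLP(state_dim + action_dim, 1)
actor_t, critic1_t, critic2_t = map(copy.deepcopy, (actor, critic1, critic2))

actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(list(critic1.parameters()) +
                              list(critic2.parameters()), lr=1e-3)
gamma, tau, policy_delay, noise_std, noise_clip = 0.99, 0.005, 2, 0.2, 0.5

def td3_update(step, state, action, reward, next_state, done):
    with torch.no_grad():
        # Target policy smoothing: perturb the target action with clipped noise.
        noise = (torch.randn_like(action) * noise_std).clamp(-noise_clip, noise_clip)
        next_action = actor_t(next_state) + noise
        # Clipped double Q-learning: take the minimum of the two target critics
        # to curb overestimation bias and stabilize the value targets.
        q1 = critic1_t(torch.cat([next_state, next_action], dim=-1))
        q2 = critic2_t(torch.cat([next_state, next_action], dim=-1))
        target_q = reward + gamma * (1.0 - done) * torch.min(q1, q2)

    sa = torch.cat([state, action], dim=-1)
    critic_loss = ((critic1(sa) - target_q) ** 2).mean() + \
                  ((critic2(sa) - target_q) ** 2).mean()
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Delayed policy updates: refresh the actor and targets only every few steps.
    if step % policy_delay == 0:
        actor_loss = -critic1(torch.cat([state, actor(state)], dim=-1)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
        for net, tgt in ((actor, actor_t), (critic1, critic1_t), (critic2, critic2_t)):
            for p, tp in zip(net.parameters(), tgt.parameters()):
                tp.data.mul_(1 - tau).add_(tau * p.data)
```

Because the actor is updated less frequently than the critics, the policy gradient is computed against value estimates that have had time to settle, which is the source of the lower-variance updates described above.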