A Ternary Content-Addressable Memory (TCAM) is a type of memory used in the flow tables of Software-Defined Networking (SDN) nodes. TCAMs provide fast parallel lookups, but their high energy consumption and cost keep them small, which limits the number of rules they can store; poor rule management can therefore degrade the network's quality of service. Although several flow table management techniques have been proposed in recent years, such as eviction and idle/hard timeout mechanisms, this paper proposes a Deep Reinforcement Learning (DRL) solution, named DRL-Idle, that maximizes the service time of the flows in an SDN network without making any assumptions about how their status evolves over time. Through continuous learning, DRL-Idle also minimizes the number of rule installations required to serve the target flows. Built on the idea of dynamically adapting each flow's idle timeout to its needs, DRL-Idle outperforms existing approaches to the Rule Placement Problem, improving performance by 30% on average.
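The paragraph above does not specify the agent's internals, so the following is only a minimal sketch of the core idea: a learning agent that selects a per-flow idle timeout and is rewarded for keeping flows served while avoiding rule re-installations. It substitutes tabular Q-learning for the deep network used by DRL-Idle, and the timeout candidates, state features, and reward weights are all hypothetical.

```python
import random
from collections import defaultdict

# Candidate idle timeouts in seconds (assumed values, not from the paper).
IDLE_TIMEOUTS = [1, 2, 5, 10, 30]

class IdleTimeoutAgent:
    """Toy Q-learning stand-in for the DRL agent that picks idle timeouts."""

    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)  # Q[(state, action)] -> estimated value
        self.epsilon = epsilon       # exploration rate
        self.alpha = alpha           # learning rate
        self.gamma = gamma           # discount factor

    def choose_timeout(self, state):
        """Epsilon-greedy selection of an idle timeout for a flow."""
        if random.random() < self.epsilon:
            return random.choice(IDLE_TIMEOUTS)
        return max(IDLE_TIMEOUTS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update after observing the outcome."""
        best_next = max(self.q[(next_state, a)] for a in IDLE_TIMEOUTS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

def flow_reward(packets_served, rules_reinstalled):
    """Assumed reward shape: favor served traffic, penalize re-installations."""
    return packets_served - 0.5 * rules_reinstalled

# Hypothetical usage: the state is a hashable summary of recent flow activity.
agent = IdleTimeoutAgent()
state = ("low_rate", "table_half_full")
timeout = agent.choose_timeout(state)
agent.update(state, timeout, flow_reward(packets_served=10, rules_reinstalled=1),
             next_state=("low_rate", "table_half_full"))
```

In this simplified view, the continuous learning mentioned above corresponds to repeating the choose/update loop for every flow event, so the timeout policy keeps adapting as traffic changes.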