The drastically increasing demand for radio resources poses a challenge to the wireless community. The limited radio spectrum and the fixed spectrum allocation strategy have become a bottleneck for various forms of wireless communication. Cognitive Radio (CR) technology, combined with the potential benefits of machine learning, has attracted substantial research interest, especially in the context of spectrum management. However, spectrum management must address a variety of performance objectives, such as higher spectral efficiency, lower latency, higher network capacity, and better energy efficiency, and these objectives often conflict with each other. Hence, this paper addresses the spectrum allocation problem with network capacity and spectrum efficiency as conflicting objectives and models the scenario as a multi-objective optimization problem in CR networks. An improved version of the Non-dominated Sorting Genetic Algorithm-II (NSGA-II), called the Non-dominated Sorting Genetic Algorithm based on Reinforcement Learning (NSGA-RL), is proposed; it combines features of evolutionary algorithms and machine learning and incorporates a self-tuning parameter approach to handle multiple conflicting objectives. The numerical findings validate the effectiveness of the proposed algorithm through the Pareto optimal set and show that it efficiently obtains optimal solutions that satisfy the various requirements of spectrum allocation in CR networks.