There has been tremendous interest in the development of DC microgrid systems, which consist of interconnected DC renewable energy sources. However, operating a DC microgrid optimally, by minimizing operational cost while ensuring stability, remains a problem when the system's model is not available. This paper proposes a novel model-free approach to the operation control of DC microgrids based on reinforcement learning algorithms, specifically Q-learning and Q-network. This approach circumvents the need for an accurate model of the DC grid by interacting with the DC microgrid to learn the best policy, which leads to more optimal operation. The proposed approach is compared with mixed-integer quadratic programming (MIQP), a deterministic baseline that requires an accurate system model. The results show that, in a three-node system, both Q-learning (74.2707) and Q-network (74.4254) learn control decisions close to the MIQP solution (75.0489). When both model uncertainty and noisy sensor measurements are introduced, the Q-network performs better (72.3714) than MIQP (72.1596), whereas Q-learning fails to learn.
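
Since the model-free property hinges on the Q-learning update, a minimal sketch may help make the idea concrete. Everything below is an illustrative placeholder, not the paper's microgrid environment: the state/action discretization, the `step` function, and the reward signal are all hypothetical stand-ins for the DC microgrid interaction described above.

```python
import numpy as np

# Minimal sketch of tabular Q-learning; states, actions, and reward
# here are placeholders, not the paper's microgrid formulation.
n_states, n_actions = 10, 4        # hypothetical discretization
alpha, gamma, epsilon = 0.1, 0.95, 0.1

Q = np.zeros((n_states, n_actions))
rng = np.random.default_rng(0)

def step(state, action):
    """Hypothetical environment transition returning (next_state, reward).
    In the paper's setting this would be the DC microgrid itself."""
    next_state = int(rng.integers(n_states))
    reward = -abs(next_state - n_states // 2)  # placeholder cost signal
    return next_state, reward

state = 0
for _ in range(10_000):
    # Epsilon-greedy action selection.
    if rng.random() < epsilon:
        action = int(rng.integers(n_actions))
    else:
        action = int(np.argmax(Q[state]))
    next_state, reward = step(state, action)
    # Model-free Bellman update: no system model is required, only the
    # observed transition (state, action, reward, next_state).
    Q[state, action] += alpha * (
        reward + gamma * Q[next_state].max() - Q[state, action]
    )
    state = next_state
```

The same update drives the Q-network variant, except that the table `Q` is replaced by a function approximator trained on the same observed transitions, which is what allows it to cope with continuous or noisy measurements.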