The convergence properties for reinforcement learning approaches, such as temporal differences and Q-learning, have been established under moderate assumptions for discrete state and action spaces. In practice, however, many systems have either continuous action spaces or a large number of discrete elements. This paper presents an approximate dynamic programming approach to reinforcement learning for continuous action set-point regulator problems, which learns near-optimal control policies based on scalar performance measures. The continuous-action space (CAS) algorithm uses derivative-free line search methods to obtain the optimal action in the continuous space. The theoretical convergence properties of the algorithm are presented. Several heuristic stopping criteria are investigated and practical application is illustrated by two example problems—the inverted pendulum balancing problem and the power system stabilization problem.