This paper proposes an ultra-low-power fast-convergence CORDIC processor for power-constrained applications. Existing fast-convergence CORDIC methods are reviewed, and design techniques are proposed to maximize the energy efficiency of the processor at the algorithm, architecture, and circuit levels. At the algorithm level, an efficient Pipeline Angle Recoding method is proposed that not only reduces redundant iterations by more than 50%, but also balances the computation load between optimum angle selection and elementary angle rotation. At the architecture level, efficient pipeline operation achieves near-100% hardware utilization and performs scaling during rotations, by scheduling and balancing the pipeline operations between the optimum angle selection unit and the main CORDIC rotation unit with in-rotation scaling logic. The optimum angle selection enables fast-convergence computation and significantly reduces dynamic energy, while the balanced pipeline operation maximizes hardware utilization and minimizes leakage energy, especially at ultra-low voltages in the sub/near-threshold regions. Silicon measurement results show that the proposed CORDIC processor, fabricated in 0.18-μm CMOS technology, can operate from 1 V down to 0.25 V and achieves ultra-low energy consumption below 100 pJ when operating in the near/sub-threshold region from 0.57 V to 0.25 V, which is suitable for medium- and low-data-rate power-constrained applications. Compared with the conventional design at 1.8 V, the proposed ultra-low-voltage CORDIC processor achieves a 30× reduction in energy consumption and consumes only 35 pJ per CORDIC operation at the minimum energy point of 30 kHz and 0.29 V, making it well suited to stringently power-constrained applications such as wearable healthcare and remote environmental monitoring.
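
To illustrate how selecting angles can skip redundant iterations, the following minimal sketch uses classic greedy angle recoding in floating point. It is not the Pipeline Angle Recoding method proposed here, whose details are not given in this abstract. The function name `angle_recoded_rotate`, the parameter `n_angles`, and the choice of 16 elementary angles are assumptions made for illustration only. At each step the sketch picks the elementary angle arctan(2^-i) closest to the remaining residual and stops once no angle reduces the residual further, so small or "easy" target angles need far fewer micro-rotations than a conventional CORDIC that always runs all iterations.

```python
import math


def angle_recoded_rotate(theta, n_angles=16):
    """Rotate the unit vector (1, 0) by theta (radians) with greedy angle recoding.

    Illustrative sketch only: floating-point arithmetic stands in for the
    shift-and-add datapath, and the names used here are hypothetical.
    Returns (cos_approx, sin_approx, number_of_micro_rotations).
    """
    # Elementary CORDIC angles arctan(2^-i), i = 0 .. n_angles - 1.
    angles = [math.atan(2.0 ** -i) for i in range(n_angles)]

    x, y, z = 1.0, 0.0, theta  # z holds the residual angle
    gain = 1.0                 # accumulated scale factor of the steps actually used
    steps = 0

    while True:
        # Greedy selection: elementary angle closest to the residual magnitude.
        i = min(range(n_angles), key=lambda k: abs(abs(z) - angles[k]))
        # Stop when even the best angle would not shrink the residual.
        if abs(abs(z) - angles[i]) >= abs(z):
            break
        d = 1.0 if z >= 0 else -1.0   # rotation direction
        t = 2.0 ** -i                 # a right shift by i bits in hardware
        x, y = x - d * y * t, y + d * x * t
        z -= d * angles[i]
        gain *= math.sqrt(1.0 + t * t)  # only executed steps contribute to the gain
        steps += 1

    # Scaling is applied once at the end in this sketch; the abstract instead
    # describes scaling performed during the rotations.
    return x / gain, y / gain, steps


if __name__ == "__main__":
    c, s, steps = angle_recoded_rotate(0.6)
    print(c, s, steps)
```

Because the scale factor depends on which elementary angles are actually executed, it can no longer be treated as a fixed constant, which is one reason scaling logic needs dedicated attention in designs of this kind.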