This paper proposes model-based and model-free policy gradient methods (PGMs) for designing dynamic output feedback controllers for discrete-time, partially observable, deterministic (noise-free) systems. To this end, we first show that designing any dynamic output feedback controller is equivalent to designing a state-feedback controller for a newly introduced system whose internal state is a finite-length input–output history (IOH). Building on this equivalence, we propose a model-based PGM and establish its global linear convergence by proving that the Polyak–Łojasiewicz inequality holds for a reachability-based lossless projection of the IOH dynamics. We then propose a model-free implementation of the PGM and provide a sample complexity analysis. Finally, we investigate the effectiveness of the model-based and model-free PGMs through numerical simulations.
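
For orientation, the standard argument linking a Polyak–Łojasiewicz inequality to global linear convergence can be sketched as follows. This is the textbook derivation, not the paper's specific result: the symbols $J$ (cost), $K$ (controller parameters), $J^\star$ (optimal cost), $\mu$ (PL constant), $L$ (smoothness constant), and the step size $1/L$ are notation assumed here for illustration, and $J$ is assumed to be $L$-smooth on the sublevel set visited by the iterates.

\[
\begin{aligned}
&\text{PL inequality:} && \tfrac{1}{2}\,\|\nabla J(K)\|^2 \;\ge\; \mu\,\bigl(J(K) - J^\star\bigr), \\
&\text{gradient step:} && K_{t+1} \;=\; K_t - \tfrac{1}{L}\,\nabla J(K_t), \\
&\text{descent lemma:} && J(K_{t+1}) \;\le\; J(K_t) - \tfrac{1}{2L}\,\|\nabla J(K_t)\|^2, \\
&\text{combined:} && J(K_{t+1}) - J^\star \;\le\; \Bigl(1 - \tfrac{\mu}{L}\Bigr)\bigl(J(K_t) - J^\star\bigr).
\end{aligned}
\]

Iterating the last line gives $J(K_t) - J^\star \le (1 - \mu/L)^t \,\bigl(J(K_0) - J^\star\bigr)$, that is, a geometric (linear) rate that needs no convexity of $J$.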