Density functional theory is the workhorse of chemistry and materials science, and novel density functional approximations (DFAs) are published every year. To become available in program packages, new DFAs need to be (re)implemented. However, in our experience as developers of Libxc [Lehtola et al., SoftwareX 7, 1 (2018)], a constant problem in this task is verification, because reliable reference data are lacking. As we discuss in this work, this lack has led to several non-equivalent implementations of functionals such as Becke-Perdew 1986, Perdew-Wang 1991, Perdew-Burke-Ernzerhof, and Becke's three-parameter hybrid functional with Lee-Yang-Parr correlation across various program packages, and these implementations yield different total energies. Through careful verification, we have also found many issues with incorrect functional forms in recent DFAs. The goal of this work is to ensure the reproducibility of DFAs: they must be verifiable in order to prevent the reappearance of the above-mentioned errors and incompatibilities. A common framework for verification and testing is therefore needed. We suggest several ways in which reference energies can be produced with free and open-source software, either with non-self-consistent calculations on tabulated atomic densities or with self-consistent calculations in various program packages. The employed numerical parameters, especially the quadrature grid, need to be converged to guarantee a precision of ≲0.1 μEh in the total energy, which is nowadays routinely achievable in fully numerical calculations. Moreover, as such sub-μEh agreement can only be achieved when fully equivalent implementations of the DFA are used, the source code of the reference implementation should also be made available in any publication describing a new DFA.
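As a minimal illustration of a non-self-consistent reference energy converged with respect to the quadrature grid, the following sketch evaluates a DFA on a fixed atomic density and refines the grid until the energy changes by less than 0.1 μEh. It rests on several assumptions that are not taken from this work: the density is the analytic hydrogen 1s density rather than a tabulated one, the functional is spin-unpolarized LDA exchange, the quadrature is a trapezoidal rule on a logarithmic radial grid, and all function names and grid limits are hypothetical choices made for the example.

```python
import math

# LDA (Dirac/Slater) exchange constant for the spin-unpolarized case.
C_X = 0.75 * (3.0 / math.pi) ** (1.0 / 3.0)


def hydrogen_density(r):
    """Exact 1s electron density of the hydrogen atom (atomic units)."""
    return math.exp(-2.0 * r) / math.pi


def lda_exchange_energy(step, t_min=-15.0, t_max=4.0):
    """E_x = -C_X * int n(r)^(4/3) d^3r on a logarithmic grid r = exp(t).

    The radial integral is evaluated with the trapezoidal rule in t;
    t_min and t_max are illustrative limits chosen so that the
    neglected tails are far below the target precision.
    """
    n_points = int(round((t_max - t_min) / step)) + 1
    total = 0.0
    for i in range(n_points):
        t = t_min + i * step
        r = math.exp(t)
        weight = step * r  # dr = r dt
        if i == 0 or i == n_points - 1:
            weight *= 0.5  # trapezoidal end-point weights
        total += weight * 4.0 * math.pi * r * r * hydrogen_density(r) ** (4.0 / 3.0)
    return -C_X * total


def converge(tol=1e-7, step=1.0):
    """Halve the grid spacing until the energy changes by less than tol (in Eh)."""
    previous = lda_exchange_energy(step)
    while True:
        step *= 0.5
        current = lda_exchange_energy(step)
        if abs(current - previous) < tol:
            return current, step
        previous = current


if __name__ == "__main__":
    energy, step = converge()
    print(f"E_x(LDA) = {energy:.10f} Eh at grid spacing {step}")
```

For this particular density the integral is also known in closed form, E_x = -(81/256)(3/π²)^(1/3) Eh, approximately -0.2127 Eh, which gives an independent check on the converged value. For a real DFA no such closed form exists, so agreement between independent, fully converged implementations plays the same role.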