This paper proposes a coding method for compressing a phase-only hologram video (PoHV), which can be displayed directly on a commercial phase-only spatial light modulator. Recently, there has been active research on using a standard codec as an anchor when developing new video coding methods for 3D data, as in MPEG point cloud compression. The main merit of this approach is that when a new video codec is developed, the performance of the coding methods built on it improves as well. Furthermore, the ability to use various anchor codecs increases compatibility, and development time is reduced. This paper uses a video codec in current use as the anchor and develops a coding method that combines progressive scaling with a deep neural network to overcome the low temporal correlation between frames of a PoHV. Because temporal prediction between frames of a PoHV is difficult, the scaling function and the neural network are applied in the encoding and decoding process, rather than adding complexity to the anchor itself for temporal prediction. The proposed method achieves an average coding gain of 22% over the anchor across all coding conditions. In both numerical and optical reconstructions, the images produced by the proposed method show clearer objects and less judder than those produced by the anchor.