Fine-grained urban flow inference (FUFI) aims to infer a fine-grained (FG) urban flow map from its coarse-grained (CG) counterpart, and plays an important role in efficient traffic monitoring and management in smart cities. Because the CG map can be obtained with only a small number of monitoring devices, FUFI greatly reduces the overhead of deploying devices and the costs of maintenance, labor, and electricity. Existing FUFI methods are mainly built on image super-resolution (SR) models, which cannot fully account for the influence of external factors and inherit the ill-posed nature of SR: many FG maps are consistent with the same CG map. In this paper, we propose UFI-Flow, a novel approach that addresses the FUFI problem by learning the conditional distribution of FG maps given their CG maps from paired data. Given a CG map and latent variables, the corresponding FG map is inferred through invertible transformations. In addition, we propose an augmented distribution fusion mechanism that constrains the inferred urban flow distribution according to external factors. We provide a new large-scale real-world FUFI dataset and show that UFI-Flow significantly outperforms strong baselines.
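To make the idea of inferring an FG map from a CG map and latent variables through invertible transformations concrete, the following is a minimal sketch of a single conditional affine coupling layer, a common building block of flow-based models. It is an illustration under stated assumptions, not the UFI-Flow architecture: the abstract does not specify the layers used, and the names `conditioner`, `forward`, `inverse`, the toy dimensions, and the random weights `w` are all hypothetical.

```python
import numpy as np

def conditioner(x_a, cg, w):
    """Hypothetical toy conditioner: maps the untouched half of the input and
    the CG features to a log-scale and a shift for the other half."""
    h = np.tanh(np.concatenate([x_a, cg]) @ w)  # shape (2 * len(x_b),)
    log_s, t = np.split(h, 2)
    return log_s, t

def forward(x, cg, w):
    """Flattened FG map -> latent z, plus log|det J| of the transformation."""
    x_a, x_b = np.split(x, 2)
    log_s, t = conditioner(x_a, cg, w)
    z_b = x_b * np.exp(log_s) + t  # element-wise affine map, trivially invertible
    return np.concatenate([x_a, z_b]), log_s.sum()

def inverse(z, cg, w):
    """Latent z and CG features -> flattened FG map (exact inverse of forward)."""
    z_a, z_b = np.split(z, 2)
    log_s, t = conditioner(z_a, cg, w)  # z_a equals x_a, so same scale and shift
    x_b = (z_b - t) * np.exp(-log_s)
    return np.concatenate([z_a, x_b])

# Toy sizes (assumptions): an 8-value flattened FG patch, 4 CG features.
rng = np.random.default_rng(0)
d, c = 8, 4
w = rng.normal(size=(d // 2 + c, d)) * 0.1  # untrained, random weights
x = rng.normal(size=d)                      # stand-in for an FG map
cg = rng.normal(size=c)                     # stand-in for CG features

z, log_det = forward(x, cg, w)
x_rec = inverse(z, cg, w)
```

Because the same layer runs in both directions, a model of this kind can in principle be trained by maximizing the conditional likelihood of FG maps via `forward`, and can then produce FG maps at inference time by drawing latent variables and applying `inverse` conditioned on the CG input.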