Reference-based super-resolution (RefSR) aims to recover the lost details in a low-resolution image and generate a high-resolution result, guided by a high-resolution reference image with similar content or textures. In contrast to traditional single-image super-resolution, which focuses on the intrinsic properties of a single low-resolution image, the challenge of RefSR lies in matching and aggregating highly related but misaligned reference images with low-resolution images. Several effective designs have been proposed to address this challenge, but their complexity makes RefSR difficult to implement in real-world applications. To better understand the working mechanism of RefSR and to inform the design of more efficient and lightweight architectures, we review the essential components of existing deep learning-based RefSR methods. We decompose the common pipeline into four submodules according to their functionality. We then summarize and describe the implementation details of the commonly adopted approaches in each submodule. Finally, we discuss open challenges and promising research directions.