Millimeter-wave (mmWave) communication systems usually rely on analog or hybrid analog/digital architectures and thus need a predefined codebook to perform beamforming. Conventional codebooks are designed for generic environments, although in practice a given base station (BS) serves only one particular environment. In this paper, we propose novel site-specific codebook design methods that exploit visual information captured by cameras. Unlike other site-specific codebook design methods, which require a large amount of measured channel state information (CSI), the proposed methods need only a simple snapshot of the environment followed by efficient computer vision (CV) techniques. The proposed CV-aided codebook design therefore reduces the overhead of the communication system, including the cost of time, human resources, and hardware installation and calibration. Specifically, we propose a CV-based approach that detects the line-of-sight (LOS) area around the BS and reconstructs the LOS channel vector set (CVS). With this knowledge, we build a vision-based beam codebook using the Lloyd algorithm. Further, we design a FusionNet to generate a codebook that can also serve non-line-of-sight (NLOS) users. Simulation results demonstrate the effectiveness of the proposed CV-aided codebook design methods and their superiority over conventional methods.
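As a rough illustration of the Lloyd step mentioned above, the following minimal sketch clusters a set of LOS channel vectors into a beam codebook. It is not the paper's implementation: the half-wavelength uniform linear array model, the function names `ula_steering` and `lloyd_codebook`, the unit-norm (rather than constant-modulus) beam constraint, and the dominant-eigenvector update are all assumptions made for illustration.

```python
import numpy as np

def ula_steering(n_antennas, angles_rad):
    """Unit-norm steering vectors of a half-wavelength ULA, one row per angle.
    (Assumed channel model; the paper reconstructs the CVS from images.)"""
    n = np.arange(n_antennas)
    phase = np.pi * np.outer(np.sin(angles_rad), n)
    return np.exp(1j * phase) / np.sqrt(n_antennas)

def lloyd_codebook(channels, n_beams, n_iters=20, seed=0):
    """Lloyd-style clustering of channel vectors (rows) into unit-norm beams.
    Returns the codebook and the average best-beam gain per iteration."""
    rng = np.random.default_rng(seed)
    # Initialise beams from randomly chosen, normalised channel vectors.
    idx = rng.choice(len(channels), size=n_beams, replace=False)
    codebook = channels[idx] / np.linalg.norm(channels[idx], axis=1, keepdims=True)
    history = []
    for _ in range(n_iters):
        # Assignment step: each channel picks the beam with the largest |w^H h|^2.
        gains = np.abs(channels @ codebook.conj().T) ** 2
        labels = gains.argmax(axis=1)
        history.append(gains.max(axis=1).mean())
        # Update step: each beam becomes the dominant eigenvector of its
        # cluster's sample covariance sum_i h_i h_i^H.
        for k in range(n_beams):
            members = channels[labels == k]
            if len(members) == 0:
                continue  # keep the old beam if its cluster is empty
            cov = members.T @ members.conj()
            _, vecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
            codebook[k] = vecs[:, -1]
    gains = np.abs(channels @ codebook.conj().T) ** 2
    history.append(gains.max(axis=1).mean())
    return codebook, history
```

For example, calling `lloyd_codebook(ula_steering(16, angles), n_beams=4)` on angles drawn from the detected LOS sector yields four beams concentrated on that sector. An analog architecture would additionally require projecting each beam onto constant-modulus phase-shifter weights, which this sketch omits.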