Over the years, ultrasonic imaging, in both medical and industrial applications, has seen the development of many advanced techniques, some of which rely on the phase of the measured signal rather than its amplitude. These techniques use so-called coherence factors to enhance image quality. Conventional ultrasonic imaging renders specular reflectors strongly but small scatterers weakly; weighting the output of an amplitude-based beamformer by a coherence factor is an effective remedy. In the field of nondestructive testing, this generally boils down to multiplying the well-known Total Focusing Method (TFM) image by a coherence factor map, which is also obtained through delay-and-sum (DAS). It has also been demonstrated that coherence factor maps, especially the vector coherence factor (VCF), can be used as interpretable images in their own right, which results in an amplitude-free imaging process with many potential advantages. Additionally, the fusion of multi-view (or multi-mode) TFM images remains of particular interest, since the choice of one or several views for an inspection scenario depends strongly on the geometry and position of the defect to be identified. Although sensitivity maps have been introduced to help select relevant views, no similar tool currently exists for views generated directly from coherence factor maps. In this study, two merging approaches for multi-view VCF images are explored and evaluated: pixel-by-pixel summation and maximum pixel value selection. A probabilistic threshold, derived from the statistics of the VCF background, is also defined to enhance the contrast of the merged view. The resulting images were evaluated both qualitatively and quantitatively using the Contrast-to-Noise Ratio (CNR). Experiments were conducted on two sets of samples. 
The first was a 20 mm thick steel plate with Flat Bottom Holes (FBH) machined on its edge, mimicking the reflectors used to calibrate a circumferential weld inspection apparatus. The second consisted of two 19.05 mm (3/4″) thick steel plates with EDM notches at several angles, either centered at mid-wall or surface-breaking on the backwall. Since the experimental setup, a 60-element probe with a center frequency of 7.5 MHz mounted on a 36° Rexolite wedge, excited mostly shear waves, only views composed of transverse wave paths were considered for merging. Results show that with either of the proposed merging methods, all artificial defects are reconstructed with a satisfactory CNR (above 30 dB), irrespective of their shape or location. Overall, the pixel-by-pixel summation and maximum pixel value methods gave similar results. Applying the proposed threshold to the merged view improved the CNR by up to 15.8 dB. Finally, these simple yet generic and robust multi-view merging methods and their associated thresholds could be used with a simplified hardware setup, as demonstrated previously in the literature.
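
As an illustration only, the two merging rules named above (pixel-by-pixel summation and maximum pixel value selection) can be sketched in a few lines of NumPy. Everything beyond those two rules is an assumption made for this sketch: the views are synthetic random arrays rather than VCF images, the function names are hypothetical, the threshold is a simple empirical quantile of a background region standing in for the probabilistic threshold of the study (whose derivation is not given here), and the CNR formula is one common definition that may differ from the one used in the study.

```python
import numpy as np


def merge_sum(views):
    """Pixel-by-pixel summation over a stack of views, shape (n_views, nz, nx)."""
    return np.sum(views, axis=0)


def merge_max(views):
    """Maximum pixel value selection over a stack of views."""
    return np.max(views, axis=0)


def background_threshold(image, noise_mask, prob=0.99):
    """ASSUMPTION: empirical quantile of the background pixels, used here as a
    stand-in for the probabilistic threshold described in the abstract."""
    return np.quantile(image[noise_mask], prob)


def apply_threshold(image, threshold):
    """Set every pixel below the threshold to zero."""
    return np.where(image >= threshold, image, 0.0)


def cnr_db(image, signal_mask, noise_mask):
    """ASSUMPTION: CNR = 20*log10(|mu_signal - mu_noise| / sigma_noise)."""
    mu_s = image[signal_mask].mean()
    mu_n = image[noise_mask].mean()
    sigma_n = image[noise_mask].std()
    return 20.0 * np.log10(abs(mu_s - mu_n) / sigma_n)


# Synthetic data: three 32x32 "views" of low-level background in [0, 0.2),
# each of the first two containing one bright pixel that the other views miss.
rng = np.random.default_rng(0)
views = rng.uniform(0.0, 0.2, size=(3, 32, 32))
views[0, 16, 16] = 0.9  # defect visible only in view 0
views[1, 8, 8] = 0.8    # defect visible only in view 1

signal_mask = np.zeros((32, 32), dtype=bool)
signal_mask[16, 16] = True
signal_mask[8, 8] = True
noise_mask = ~signal_mask

merged_sum = merge_sum(views)
merged_max = merge_max(views)

# Threshold the max-merged view using its own background statistics.
threshold = background_threshold(merged_max, noise_mask)
merged_max_thresholded = apply_threshold(merged_max, threshold)

print("CNR (sum) [dB]:", cnr_db(merged_sum, signal_mask, noise_mask))
print("CNR (max) [dB]:", cnr_db(merged_max, signal_mask, noise_mask))
```

In this toy case both merged images retain the two defects even though no single view shows both, which is the motivation for merging. The numbers it prints describe the synthetic data only and say nothing about the experimental results reported above.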