Infrared and visible image fusion can enrich the semantic content of scene backgrounds and thereby improve scene interpretability. However, most existing fusion algorithms concentrate on the fusion process itself and neglect preprocessing the source images to improve their visual fidelity. They also frequently overlook the distinct characteristics of the infrared and visible modalities, producing weight allocations that do not align with human perception. To address these issues, this paper proposes a fusion algorithm that emphasizes visual fidelity and the rectification of salient information. Specifically, we design an adaptive enhancement method based on Taylor approximation and visual compensation, which is particularly effective in complex environments. Our multi-scale decomposition approach extracts salient information from the transmission map, enriching the fusion results with finer details that accentuate target features. Motivated by the distinctive attributes of the infrared and visible modalities, we further devise a fusion weight calculation method grounded in similarity measurements to effectively convey the significant information of the source images. We evaluate the proposed method on publicly available datasets; the experimental results demonstrate a clear advantage over fifteen state-of-the-art fusion algorithms in both subjective and objective assessments. Our code is publicly available at: https://github.com/VCMHE/FVC_SIR.
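To illustrate the general idea of similarity-driven weight allocation only, the sketch below assigns each source a per-pixel weight according to how strongly its local structure correlates with a shared base layer. This is a minimal, hypothetical sketch: the function names, the choice of base layer, and the correlation-based scoring rule are assumptions for illustration, not the paper's actual weighting rule; the released implementation is in the repository linked above.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def similarity_weights(ir, vis, win=7, eps=1e-6):
    """Illustrative similarity-based fusion weights (assumed scheme, not the paper's exact rule).

    Each source image is scored by the local correlation of its structure with a
    shared base layer; the scores are normalized into per-pixel fusion weights.
    """
    base = 0.5 * (ir + vis)  # shared base layer (an assumption for this sketch)
    scores = []
    for src in (ir, vis):
        mu_s = uniform_filter(src, win)
        mu_b = uniform_filter(base, win)
        cov = uniform_filter(src * base, win) - mu_s * mu_b
        var_s = uniform_filter(src * src, win) - mu_s ** 2
        var_b = uniform_filter(base * base, win) - mu_b ** 2
        # Local correlation coefficient, clipped to [0, 1] so it acts as a saliency score.
        corr = np.clip(cov / np.sqrt(np.maximum(var_s * var_b, eps)), 0.0, 1.0)
        scores.append(corr + eps)
    w_ir, w_vis = scores
    total = w_ir + w_vis
    return w_ir / total, w_vis / total

def fuse(ir, vis):
    """Weighted fusion of the (already enhanced) infrared and visible inputs."""
    w_ir, w_vis = similarity_weights(ir, vis)
    return w_ir * ir + w_vis * vis
```

In this sketch the two weight maps sum to one at every pixel, so regions where one modality carries more locally consistent structure contribute proportionally more to the fused result.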