Apple fruits captured by a red-green-blue (RGB) color camera cannot be accurately identified and localized in natural orchard scenes because of lighting variation and shading. We propose a pulse-coupled neural network (PCNN) heterogeneous orchard image fusion model for semantic segmentation of salient regions. The model aims to improve both the accuracy of image segmentation and the quality of heterogeneous source image fusion in complex orchard scenes. It fuses time-of-flight (ToF) and RGB images acquired from orchard scenes and consists of three components: an orchard image semantic segmentation module, a parameter optimization module, and a dual-PCNN image fusion module. The semantic segmentation module extracts the salient regions of fruit targets in orchard images. The parameter optimization module then introduces tent chaos and ranking strategies into the human mental search algorithm to optimize the PCNN parameters. Finally, the dual-PCNN module completes the fusion. Six fusion algorithms were used to evaluate fusion quality on public and self-built datasets. The results show that the chaotic hierarchical mental search algorithm better resolves the difficulty of determining PCNN parameters, and that the improved DeepLabV3+ segments clearer and more complete apple regions. Compared with the six other fusion algorithms, the proposed model produces fused images with clearer targets and more prominent textures. The model reduces the influence of illumination and other disturbing factors in fruit recognition, and the fused images contain rich, accurate image information that better matches human visual perception.