Many detection methods based on convolutional neural networks (CNNs) have been proposed for image splicing forgery detection, most of which focus on validating local patches or local objects. We regard image splicing forgery detection as a binary classification task that distinguishes tampered from non-tampered regions by forensic fingerprints rather than semantic features. As a network deepens, its representational ability strengthens; however, non-semantic forensic fingerprints can hardly be retained in the deep layers of normal CNNs. To resolve these issues, we propose a novel dual-encoder network (D-Net) for image splicing forgery detection, which employs an unfixed encoder and a fixed encoder. The unfixed encoder autonomously learns the image fingerprints that differentiate the tampered and non-tampered regions, whereas the fixed encoder intentionally provides structural information that assists the learning and detection of the forgeries. The dual encoder is followed by a spatial pyramid global-feature extraction module that expands the global insight of D-Net, allowing it to classify tampered and non-tampered regions more accurately. In an experimental comparison with state-of-the-art methods, D-Net, without pre-training or training on a large number of forgery images, outperformed the other methods in pixel-level forgery detection. Moreover, it remains stably robust against different anti-forensic attacks.
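As a rough illustration of the dual-encoder idea (not the authors' implementation), the sketch below pairs a stand-in for a learnable convolution with a fixed high-pass filter, a common choice in image forensics for exposing the local residuals that splicing tends to disturb, and concatenates the two feature maps along a channel axis. The specific Laplacian-style kernel, the random stand-in for learnable weights, and all shapes are assumptions for illustration only.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D convolution of a single-channel image with a small kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Fixed encoder: a high-pass (Laplacian-style) filter kept frozen during
# training; it surfaces edge/residual structure rather than semantics.
# (Illustrative choice, not the kernel used in the paper.)
FIXED_KERNEL = np.array([[ 0., -1.,  0.],
                         [-1.,  4., -1.],
                         [ 0., -1.,  0.]])

# Unfixed encoder: a randomly initialised kernel standing in for weights
# that would be learned end-to-end in the real network.
rng = np.random.default_rng(0)
learnable_kernel = rng.standard_normal((3, 3))

def dual_encode(img):
    """Concatenate fixed and learnable feature maps into a 2-channel output."""
    f_fixed = conv2d(img, FIXED_KERNEL)
    f_learn = conv2d(img, learnable_kernel)
    return np.stack([f_fixed, f_learn], axis=0)  # shape: (2, H-2, W-2)

img = rng.random((8, 8))
features = dual_encode(img)
print(features.shape)  # (2, 6, 6)
```

Note that the high-pass branch responds only to local intensity changes: on a constant (untextured) region its output is zero, which is exactly the kind of non-semantic signal a deep semantic CNN tends to discard.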