Recent advances in deep neural networks, especially generative adversarial networks (GANs), have enabled the creation of increasingly realistic deepfake media. This technology can swap the source person's face or alter facial expressions in an image or video; media manipulated in this way is termed a deepfake. Such manipulated media poses risks to journalism, politics, court proceedings, and other social domains. While existing approaches use deep neural networks to directly extract facial artefacts for deepfake detection, they do not examine subtle inconsistencies within or across frames. Moreover, state-of-the-art deepfake detection networks tend to be complex and to overfit to specific artefacts, which limits their generalizability to unseen data. This paper proposes a novel technique that tackles manipulated-face detection in videos and images by exploiting the noise-pattern inconsistency between the face region and the rest of the frame. To compare the noise patterns of these two regions, we propose a two-stream Siamese-like network called SiamNet. This network extracts the noise patterns of the face region and a background patch through separate streams without increasing the number of parameters, improving both efficiency and effectiveness. Each branch uses a pretrained Inception-v3 architecture for camera-noise extraction, and Siamese training is employed to compare the noise patterns computed by the two base models. The proposed two-branch network, SiamNet, proves effective on several large-scale deepfake datasets, namely FF++, Celeb-DF, DFD, and DFDC, achieving accuracy rates of 99.7%, 98.3%, 96.08%, and 89.2%, respectively. Furthermore, the proposed technique exhibits greater generalizability and outperforms state-of-the-art deepfake detection methods. The performance of the proposed model is also evaluated against other approaches on the FaceForensics benchmark dataset.
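The intuition behind the face-versus-frame comparison can be sketched with a toy example. Note that this is NOT the paper's method: SiamNet learns noise features with two Inception-v3 streams trained in a Siamese fashion, whereas the hand-crafted high-pass residual and threshold-free comparison below (function names `noise_residual`, `noise_level`, and `inconsistency` are all illustrative inventions) only illustrate the underlying idea that a spliced face tends to carry a different sensor-noise signature than the surrounding frame.

```python
import numpy as np

def noise_residual(patch, k=3):
    """High-pass residual: patch minus a k-by-k local box-filter mean.
    A crude stand-in for the learned noise extractor in each SiamNet stream."""
    patch = patch.astype(float)
    h, w = patch.shape
    pad = k // 2
    padded = np.pad(patch, pad, mode="edge")
    smoothed = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            smoothed += padded[dy:dy + h, dx:dx + w]
    smoothed /= k * k
    return patch - smoothed

def noise_level(patch):
    """Estimate noise strength as the standard deviation of the residual."""
    return float(noise_residual(patch).std())

def inconsistency(face_patch, background_patch):
    """Gap between the two regions' noise levels; a large gap suggests the
    face region came from a different source than the rest of the frame."""
    return abs(noise_level(face_patch) - noise_level(background_patch))

# Toy data: a smooth gradient "scene" with added Gaussian sensor noise.
# A pristine face shares the frame's noise level; a spliced/generated face
# is simulated here by a different noise level.
rng = np.random.default_rng(0)
base = np.tile(np.linspace(0, 255, 64), (64, 1))
frame_bg = base + rng.normal(0, 2.0, (64, 64))
real_face = base + rng.normal(0, 2.0, (64, 64))
fake_face = base + rng.normal(0, 8.0, (64, 64))
```

In this toy setting, `inconsistency(frame_bg, real_face)` is near zero while `inconsistency(frame_bg, fake_face)` is large; SiamNet replaces the hand-crafted residual with learned Inception-v3 features and the scalar gap with a Siamese comparison.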