Detecting system anomalies is an important problem in many fields such as security, fault management, and industrial optimization. Recently, invariant network has shown to be powerful in characterizing complex system behaviours. In the invariant network, a node represents a system component and an edge indicates a stable, significant interaction between two components. Structures and evolutions of the invariance network, in particular the vanishing correlations, can shed important light on locating causal anomalies and performing diagnosis. However, existing approaches to detect causal anomalies with the invariant network often use the percentage of vanishing correlations to rank possible casual components, which have several limitations: (1) fault propagation in the network is ignored, (2) the root casual anomalies may not always be the nodes with a high percentage of vanishing correlations, (3) temporal patterns of vanishing correlations are not exploited for robust detection, and (4) prior knowledge on anomalous nodes are not exploited for (semi-)supervised detection. To address these limitations, in this article we propose a network diffusion based framework to identify significant causal anomalies and rank them. Our approach can effectively model fault propagation over the entire invariant network and can perform joint inference on both the structural and the time-evolving broken invariance patterns. As a result, it can locate high-confidence anomalies that are truly responsible for the vanishing correlations and can compensate for unstructured measurement noise in the system. Moreover, when the prior knowledge on the anomalous status of some nodes are available at certain time points, our approach is able to leverage them to further enhance the anomaly inference accuracy. When the prior knowledge is noisy, our approach also automatically learns reliable information and reduces impacts from noises. By performing extensive experiments on synthetic datasets, bank information system datasets, and coal plant cyber-physical system datasets, we demonstrate the effectiveness of our approach.