Abstract
Under a distributed information system, the scale of various operational components such as applications, operating systems, databases, servers, and networks is immense, with intricate access relationships. The silo effect of each professional is prominent, and the linkage mechanism is insufficient, making it difficult to locate the infrastructure components that cause exceptions under a particular application. Current research only plays a role in local scenarios, and its accuracy and generalization are still very limited. This paper proposes a novel fault location method based on dynamic operation maps and alarm common point analysis. During the fault period, various alarm entities are associated with dynamic operation maps, and alarm common points are obtained based on graph search addressing methods, covering deployment relationship common points, connection common points (physical and logical), and access flow common points. This method, compared with knowledge graph approaches, eliminates the complex process of knowledge graph construction, making it more concise and efficient. Furthermore, in contrast to indicator correlation analysis methods, this approach supplements with configuration correlation information, resulting in more precise positioning. Through practical validation, its fault hit rate exceeds 82%, which is significantly better than the existing main methods.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.