Abstract

Traces are widely used to detect anomalies in distributed microservice systems because they can precisely reconstruct user request paths. However, most existing trace-based anomaly detection approaches treat a trace as a sequence of microservice invocations annotated with response times, which ignores both the graph structure of the trace and abnormal resource consumption in the complex distributed environment where the microservices are deployed. In this paper, we propose TraceGra, an unsupervised encoder–decoder anomaly detection approach. TraceGra first builds a unified graph representation that combines traces with container performance metrics. It then uses a graph neural network (GNN) and a long short-term memory network (LSTM) to extract topological and temporal features, respectively. Finally, it combines the two parts of the loss, using two hyperparameters, into the anomaly score. Evaluation on an open-source dataset and on a local dataset collected from an ARM server cluster shows that TraceGra achieves high precision (0.97) and recall (0.93), outperforming some state-of-the-art approaches by an average of 0.1 in F1-score.
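
The scoring step described above can be illustrated with a minimal sketch. The abstract does not say what the two loss parts are or how the hyperparameters enter the sum, so this assumes a plain weighted sum; the names `loss_a`, `loss_b`, `alpha`, `beta`, and the fixed `threshold` are hypothetical and are not taken from TraceGra itself.

```python
def anomaly_score(loss_a: float, loss_b: float,
                  alpha: float = 0.5, beta: float = 0.5) -> float:
    """Combine two loss terms into one score.

    Assumption: a weighted sum with weights alpha and beta. The abstract
    only states that two loss parts and two hyperparameters are involved.
    """
    return alpha * loss_a + beta * loss_b


def is_anomalous(score: float, threshold: float) -> bool:
    """Flag a trace whose score exceeds a threshold (hypothetical rule)."""
    return score > threshold


if __name__ == "__main__":
    # Illustrative values only, not results from the paper.
    score = anomaly_score(loss_a=2.0, loss_b=4.0, alpha=0.5, beta=0.5)
    print(score, is_anomalous(score, threshold=1.0))
```

In a sketch like this, the two weights let one loss term dominate when it separates normal from abnormal traces more reliably on a given dataset.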

