Abstract

Networking equipment is in ever wider use. These devices log their operations, but errors can occur that cause a device to restart, and different errors may be preceded by different patterns in the log. Our main goal is to predict an upcoming error from the log lines of the current file. To achieve this, we use document similarity, a key concept of information retrieval that indicates how alike (or different) two documents are. In this paper, we study the effectiveness of prediction based on the cosine similarity, Jaccard similarity, and Euclidean distance of the log lines preceding restarts. We combine these measures with different features such as TF-IDF, Doc2Vec, LSH, and others. Since networking devices produce large numbers of log files, we use Spark for big-data computing.
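To make the three measures concrete, the following is a minimal sketch of how they could be computed for two log lines. It assumes whitespace tokenization and raw term counts as features, which is a simplification and not the TF-IDF, Doc2Vec, or LSH features named in the abstract; the function names and the sample log lines are hypothetical, and it runs in plain Python rather than on Spark.

```python
import math
from collections import Counter


def tokenize(line):
    """Split a log line into lowercase whitespace-separated tokens (assumed scheme)."""
    return line.lower().split()


def cosine_similarity(a, b):
    """Cosine of the angle between the term-count vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm_a = math.sqrt(sum(v * v for v in ca.values()))
    norm_b = math.sqrt(sum(v * v for v in cb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for empty lines
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Size of the shared token set divided by the size of the combined token set."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0  # convention chosen here: two empty lines are identical
    return len(sa & sb) / len(sa | sb)


def euclidean_distance(a, b):
    """Straight-line distance between the term-count vectors of two token lists."""
    ca, cb = Counter(a), Counter(b)
    return math.sqrt(sum((ca[t] - cb[t]) ** 2 for t in ca.keys() | cb.keys()))


# Hypothetical log lines, invented for illustration only.
line_a = tokenize("link down on port 3")
line_b = tokenize("link down on port 7")

print(cosine_similarity(line_a, line_b))
print(jaccard_similarity(line_a, line_b))
print(euclidean_distance(line_a, line_b))
```

Note that cosine and Jaccard are similarities (higher means more alike), while Euclidean distance is a distance (lower means more alike), so a prediction step that ranks candidates would need to sort them in opposite directions.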
