Abstract

Critical systems that produce big data streams can require human operators to monitor these event streams for changes of interest. Automated systems that oversee many tasks may still need a ‘human-in-the-loop’ operator to evaluate whether an intervention is required, because the training data initially offered to the system may not have been sufficient for it to take the correct course of action. For an operator to react to real-time events, the visual depiction of the event data must capture essential associations and be readily understood by visual inspection. A similar requirement arises when inspecting activity protocols in a large organization, where a code of correct conduct is prescribed and activity traces must be checked against expectations with minimal delay. The methodology presented here addresses these concerns by providing an adaptive window sizing measurement for subsetting the data, and then producing a set of network diagrams based upon event label co-occurrence networks. With an intuitive method of network construction, the time operators need to learn how to monitor complex event streams of big datasets can be reduced.

Highlights

  • With the growing amount of information being gathered and stored, the associations between the fields become more ambiguous and challenging to track, as queries on the data can utilize different subsets depending upon the application

  • The edge widths shown are the numbers of walks of length 2, computed from the aggregate of the co-occurrence networks built from the event rows described in the “Methodology” section

  • The goal is to produce a meaningful representation of the data variable labels based upon event label co-occurrences
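
The co-occurrence construction and the walks-of-length-2 edge weights named in the highlights can be illustrated with a minimal sketch. It is not the authors' implementation: the example labels, the event rows, and the helper names `cooccurrence_matrix` and `walks_of_length_two` are hypothetical, and it assumes that the walk counts are obtained by squaring the aggregated co-occurrence matrix. Because the aggregated matrix holds counts rather than 0/1 entries, each walk is weighted by the product of the counts along it.

```python
from itertools import combinations

import numpy as np


def cooccurrence_matrix(event_rows, labels):
    """Aggregate label co-occurrences over all event rows (hypothetical helper)."""
    index = {label: i for i, label in enumerate(labels)}
    A = np.zeros((len(labels), len(labels)), dtype=int)
    for row in event_rows:
        # A set ensures a label repeated within one row is counted once.
        for a, b in combinations(sorted(set(row)), 2):
            A[index[a], index[b]] += 1
            A[index[b], index[a]] += 1  # keep the matrix symmetric
    return A


def walks_of_length_two(A):
    """Entry (i, j) of A @ A sums, over every intermediate label k, A[i, k] * A[k, j]."""
    return A @ A


# Assumed toy data: each row lists the labels observed together in one event.
labels = ["alarm", "valve", "pump", "sensor"]
rows = [
    ["alarm", "valve"],
    ["valve", "pump"],
    ["alarm", "valve", "sensor"],
]

A = cooccurrence_matrix(rows, labels)
W = walks_of_length_two(A)
```

In this toy data "alarm" and "pump" never co-occur directly, yet `W` gives them a nonzero weight because both co-occur with "valve". Binarizing `A` before squaring would count unweighted walks instead.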


Summary

Introduction

With the growing amount of information being gathered and stored, the associations between the fields become more ambiguous and challenging to track, as queries on the data can utilize different subsets depending upon the application. The main difference in the terminology is the assumption that the stored event logs hold correctly temporally aligned observation tuples of the information. This challenge has been highlighted for more than 20 years as part of the general task of multisensor data fusion [8], which also notes applications for defense. With a large search space, it is not enough to make it feasible to derive the correct data and insight; the approach must also account for the fact that there can be too many features to explore under constrained time limits. With this in mind, the proposed methodology handles big data streams in a way that allows users to monitor event label data streams through the production of network diagrams. An overview of the challenge of reasoning from streaming data is provided in [10], and [11] describes the challenges posed by huge data volumes, primarily in the effort to understand city dynamics, which demands real-time processing.

