Abstract

While much recent attention has been paid to big data, in which data is written to massive repositories for later analysis, there is also a rapidly increasing amount of data available in the form of data streams or events. Data streams typically represent very recent measurements or current system states. Events represent things that happen, often in the context of computer processing. When processing data streams or events, we often need to make decisions in real time. Complex event processing (CEP) is an important area of computer science that provides powerful tools for processing events and analyzing data streams. CEP deals with events that can be composed of other events and can model complex phenomena such as a user's interactions with a website or a stock market crash. In the current literature, CEP is almost entirely deterministic; that is, it does not account for randomness or rely on statistical methods. However, statistics and machine learning have a critical role to play in the use of data streams and events, and understanding how CEP works is critical to analyzing data based on complex events. When processing data streams, a distinction must be made between analysis, the human activity in which we try to gain understanding of an underlying process, and decision making, in which we apply knowledge to data to decide what action to take. Useful statistical techniques for data streams include smoothing, generalized additive models, change point detection, and classification methods. WIREs Comput Stat 2016, 8:5–26. doi: 10.1002/wics.1372
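
The abstract names change point detection among the techniques surveyed; the article itself presents no code here. Purely as an illustrative sketch of what streaming change point detection can look like, the Python below implements a basic two-sided CUSUM detector. The function name cusum_stream and the drift and threshold parameters are hypothetical choices for this example, and the target mean mu0 is assumed known from a calibration period; none of this is taken from the article.

    # Illustrative sketch (not from the article): two-sided CUSUM
    # change-point detection over a data stream, one sample at a time.
    def cusum_stream(samples, mu0, drift=0.5, threshold=5.0):
        """Yield (index, value, alarm) for each incoming sample."""
        g_pos = g_neg = 0.0
        for i, x in enumerate(samples):
            # Accumulate evidence of upward and downward mean shifts,
            # discounting small fluctuations by `drift`.
            g_pos = max(0.0, g_pos + (x - mu0) - drift)
            g_neg = max(0.0, g_neg - (x - mu0) - drift)
            alarm = g_pos > threshold or g_neg > threshold
            if alarm:
                g_pos = g_neg = 0.0  # reset so monitoring can continue
            yield i, x, alarm

    if __name__ == "__main__":
        import random
        random.seed(1)
        # Synthetic stream whose mean shifts from 0 to 3 at t = 50.
        stream = [random.gauss(0, 1) for _ in range(50)] + \
                 [random.gauss(3, 1) for _ in range(50)]
        alarms = [i for i, _, a in cusum_stream(stream, mu0=0.0) if a]
        print("first change signalled at index:", alarms[0])

Resetting the statistics after each alarm lets the detector keep watching for further changes, which fits the real-time decision-making setting the abstract emphasizes.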
