Stream mining systems have received a great deal of attention in recent years. These systems process incoming data streams from different sources and extract high-level semantic features from them by passing the streams through an ensemble of classifiers. Because the characteristics of the data streams change dynamically, these classifiers must be configured dynamically to maximize system performance. A key challenge is that data streams from different sources have different specifications. Processing all incoming data streams identically under a common topology configuration is therefore not appropriate for optimal stream mining. Hence, an approach is required that allows each data stream to be processed according to its own specifications. In this paper, by implementing a buffer for each source and using a time-sharing solution, we propose a distributed approach that solves this problem for cascaded classifier topologies. We first formally define a utility metric that captures both the performance and the delay of a binary filtering classifier system. We then propose our solution for a base case and extend it step by step until reaching the most general case for cascaded topologies. Finally, we test our approach and compare it with the state-of-the-art solution on a scenario of detecting text in video streams arriving at the system.