In complex environments, decision-making increasingly depends on gathering, processing, and analyzing huge amounts of data, often produced at different velocities and in different formats by distributed sensors (human or automatic). Such data streams also suffer from imprecision and uncertainty. Three-Way Decision is considered a suitable approach for analyzing such data: it is based on tri-partitioning the universe of discourse, i.e., on exploiting the notions of acceptance, rejection, and non-commitment, much as the human brain does when solving numerous problems. When the application scenario requires processing data streams, the analysis task can be accomplished with the stream computing paradigm, one of the most important paradigms in Big Data. Under this paradigm, data arrives, is processed, and departs in real time without needing to be temporarily serialized into a storage system. This work analyzes the implementation of the Three-Way Decision approach, based on Rough Set Theory, on a real-time data processing platform that supports stream computing, namely Apache Spark.
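As a minimal illustration of the tri-partitioning mentioned above, the following Python sketch applies the classical (Pawlak) rough set construction to a single small batch of records: records that are indiscernible on the chosen attributes form an equivalence class, and each class is assigned to the positive region (acceptance), the negative region (rejection), or the boundary region (non-commitment). The function name `three_way_regions`, the sample sensor readings, and the attribute names are hypothetical and chosen only for illustration; this is plain Python and does not reproduce the Apache Spark implementation analyzed in this work.

```python
from collections import defaultdict


def three_way_regions(records, attributes, in_target):
    """Split record ids into (positive, negative, boundary) regions.

    records    -- dict mapping a record id to a dict of attribute values
    attributes -- condition attributes defining indiscernibility
    in_target  -- predicate telling whether a record belongs to the target concept
    """
    # Group records that are indiscernible on the chosen attributes.
    classes = defaultdict(list)
    for rec_id, rec in records.items():
        key = tuple(rec[a] for a in attributes)
        classes[key].append(rec_id)

    positive, negative, boundary = set(), set(), set()
    for members in classes.values():
        hits = sum(1 for m in members if in_target(records[m]))
        if hits == len(members):
            # Whole class lies inside the target: accept (lower approximation).
            positive.update(members)
        elif hits == 0:
            # Whole class lies outside the target: reject.
            negative.update(members)
        else:
            # Class is split: defer the decision (non-commitment).
            boundary.update(members)
    return positive, negative, boundary


# Hypothetical batch of sensor readings (illustrative data only).
batch = {
    "r1": {"temp": "high", "smoke": "yes", "alarm": True},
    "r2": {"temp": "high", "smoke": "yes", "alarm": True},
    "r3": {"temp": "high", "smoke": "no", "alarm": True},
    "r4": {"temp": "high", "smoke": "no", "alarm": False},
    "r5": {"temp": "low", "smoke": "no", "alarm": False},
}

pos, neg, bnd = three_way_regions(batch, ["temp", "smoke"], lambda r: r["alarm"])
print(sorted(pos), sorted(neg), sorted(bnd))
```

In a streaming setting, one plausible design is to apply such a function to each incoming micro-batch or window, although how the regions are maintained across batches is a choice this sketch does not address.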