Abstract

On-line statistical and machine learning analytic tasks over large-scale contextual data streams coming from e.g., wireless sensor networks, Internet of Things environments, have gained high popularity nowadays due to their significance in knowledge extraction, regression and classification tasks, and, more generally, in making sense from large-scale streaming data. The quality of the received contextual information, however, impacts predictive analytics tasks especially when dealing with uncertain data, outliers data, and data containing missing values. Low quality of received contextual data significantly spoils the progressive inference and on-line statistical reasoning tasks, thus, bias is introduced in the induced knowledge, e.g., classification and decision making. To alleviate such situation, which is not so rare in real time contextual information processing systems, we propose a progressive time-optimized data quality-aware mechanism, which attempts to deliver contextual information of high quality to predictive analytics engines by progressively introducing a certain controlled delay. Such a mechanism progressively delivers high quality data as much as possible, thus eliminating possible biases in knowledge extraction and predictive analysis tasks. We propose an analytical model for this mechanism and show the benefits stem from this approach through comprehensive experimental evaluation and comparative assessment with quality-unaware methods over real sensory multivariate contextual data.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.