Predictive intelligence of reliable analytics in distributed computing environments

Yiannis Kathidjiotis, Christos Anagnostopoulos, Kostas Kolomvatsos

doi:10.1007/s10489-020-01712-5


Open Access

https://doi.org/10.1007/s10489-020-01712-5


Journal: Applied Intelligence	Publication Date: May 14, 2020
Citations: 6	License type: open-access

Affiliation: University of Glasgow

Abstract

Lack of knowledge about the underlying data distribution in distributed large-scale data can be an obstacle when issuing analytics and predictive modelling queries. Analysts often struggle to find analytics or exploration queries that satisfy their needs. In this paper, we study how exploration query results can be predicted in order to avoid executing ‘bad’ (non-informative) queries that waste network, storage, and financial resources, as well as time, in a distributed computing environment. The proposed methodology clusters a training set of exploration queries together with the cardinality of the results they retrieved (the score) and then uses the query-centroid representatives to make predictions. After the training phase, we propose a novel refinement process that increases the reliability of predicting the score of new, unseen queries from the refined query representatives. Comprehensive experimentation with real datasets shows that predictions are more reliable after the proposed refinement method, which increases the reliability of the closest centroid and improves predictability under the right circumstances.
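The clustering-then-prediction idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the query encoding (a plain numeric vector), the use of standard k-means, and the names `kmeans` and `predict_score` are all assumptions made for illustration, and the refinement process mentioned in the abstract is not covered here. Each training point is a query vector with its score appended; a new query is assigned the score stored in the centroid whose query part is closest.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain k-means over (query..., score) vectors (illustrative assumption)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        # Recompute each centroid as the mean of its members;
        # an empty cluster keeps its previous centroid.
        updated = []
        for j, members in enumerate(clusters):
            if members:
                updated.append([sum(col) / len(members) for col in zip(*members)])
            else:
                updated.append(centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids


def predict_score(query, centroids):
    """Return the score component of the centroid closest in query space."""
    # The last coordinate of a centroid is its mean score, so it is
    # excluded from the distance computation.
    nearest = min(centroids, key=lambda c: math.dist(query, c[:-1]))
    return nearest[-1]


# Hypothetical training set: (x, y, score), where score stands in for the
# cardinality of the result a past query retrieved.
training = [
    (0, 0, 10), (1, 0, 12), (0, 1, 8),
    (10, 10, 100), (11, 10, 110), (10, 11, 90),
]
centroids = kmeans(training, k=2)

low = predict_score([0.5, 0.5], centroids)
high = predict_score([10.2, 10.1], centroids)
print(low, high)
```

In this toy data the two groups of queries are well separated, so an unseen query near the first group is predicted to return few results and one near the second group many. An analyst could compare such a prediction with their own criteria before deciding whether to execute the query.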

Highlights

Because of the importance and relevance of data in distributed computing environments, large-scale data analytics, predictive modelling, and exploration tasks have found their place in almost all of today’s industries
Beyond the frustration of finding the correct query, executing exploration queries can waste the network and storage resources used to transfer and store query results among computing nodes in a distributed computing environment
We investigate whether query-driven mechanisms, using score prediction and user criteria, can determine if a query is worth executing in a distributed computing environment

Summary

Introduction

Because of the importance and relevance of data in distributed computing environments, large-scale data analytics, predictive modelling, and exploration tasks have found their place in almost all of today’s industries. Exploration querying offers a way to access distributed data, but in most cases little is known about the underlying data distributions and their impact on the results. Beyond the frustration of finding the correct query, executing such queries can waste the network and storage resources used to transfer and store query results (including processed data or even raw data for analytics tasks) among computing nodes in a distributed computing environment.



