Trident: Distributed Storage, Analysis, and Exploration of Multidimensional Phenomena

Matthew Malensek, Sangmi Lee Pallickara, Ryan Stern, Shrideep Pallickara, Walid Budgaga

doi:10.1109/tbdata.2018.2817505

Abstract

Rising storage and computational capacities have led to the accumulation of voluminous datasets. These datasets contain insights that describe natural phenomena, usage patterns, trends, and other aspects of complex, real-world systems. Statistical and machine learning models are often employed to identify these patterns or attributes of interest. However, a wide array of potentially relevant models and parameterizations exist, and may provide the best performance only after preprocessing steps have been carried out. Our distributed analytics platform, Trident, facilitates the modeling process by providing high-level data exploration functionality as well as guidance for creation of effective models. Trident handles (1) data partitioning and storage, (2) metadata extraction and indexing, and (3) selective retrievals or transformations to prepare and generate training data. In this study, we evaluate Trident in the context of a 1.1 petabyte epidemiology dataset generated by a disease spread simulation; such datasets are often used in planning for national-scale outbreaks in animal populations.

Journal: IEEE Transactions on Big Data
Publication Date: Jun 1, 2019
Citations: 2
