Ptype: probabilistic type inference

Taha Ceritli, James Geddes, Christopher K I Williams

doi:10.1007/s10618-020-00680-1

Open Access

https://doi.org/10.1007/s10618-020-00680-1

Abstract

Type inference refers to the task of inferring the data type of a given column of data. Current approaches often fail when a column contains missing data and anomalies, both of which are common in real-world data sets. In this paper, we propose ptype, a robust probabilistic type inference method that detects such entries and infers data types. We further show that the proposed method outperforms existing methods.
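As a rough illustration of what robust column type inference involves, the toy sketch below scores a column under each candidate type while letting every entry also be explained as missing or anomalous. It is not the ptype model: the regular expressions, the uniform character likelihood, the list of missing-data tokens, and the mixture weights are all assumptions chosen for illustration.

```python
import math
import re

# Hypothetical candidate types: (pattern, assumed alphabet size).
TYPES = {
    "integer": (re.compile(r"[+-]?\d+"), 12),
    "float": (re.compile(r"[+-]?\d+\.\d+"), 13),
    "alpha": (re.compile(r"[A-Za-z]+"), 52),
}
# Assumed encodings of missing data and assumed mixture weights.
MISSING = ("", "NA", "NULL", "n/a", "-")
W_TYPE, W_MISSING, W_ANOMALY = 0.90, 0.05, 0.05
PRINTABLE = 95  # printable ASCII characters, used by the anomaly model


def char_model(value, alphabet_size):
    """Crude likelihood: uniform over the alphabet plus one stop symbol."""
    return (1.0 / (alphabet_size + 1)) ** (len(value) + 1)


def components(value, type_name):
    """Weighted likelihood of one entry under each of the three explanations."""
    pattern, size = TYPES[type_name]
    p_type = char_model(value, size) if pattern.fullmatch(value) else 0.0
    p_missing = 1.0 / len(MISSING) if value in MISSING else 0.0
    p_anomaly = char_model(value, PRINTABLE)  # always positive
    return {
        "type": W_TYPE * p_type,
        "missing": W_MISSING * p_missing,
        "anomaly": W_ANOMALY * p_anomaly,
    }


def infer_column(values):
    """Return the best-scoring type and a per-entry label."""
    scores = {
        name: sum(math.log(sum(components(v, name).values())) for v in values)
        for name in TYPES
    }
    best = max(scores, key=scores.get)
    labels = []
    for v in values:
        parts = components(v, best)
        labels.append(max(parts, key=parts.get))
    return best, labels


column = ["12", "7", "NA", "405", "abc", "31"]
best_type, entry_labels = infer_column(column)
```

Because every entry has a fallback explanation, a single stray string such as "abc" lowers the score of the integer hypothesis without ruling it out, which is the behaviour a purely rule-based check lacks.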

Highlights

Data analytics can be defined as the process of generating useful insights from raw data sets
Constructing Probabilistic Finite-State Machines (PFSMs) for complex types might require more human engineering than the other types. We reduce this need by building such PFSMs automatically from corresponding regular expressions
To measure the performance on type/non-type inference, we report Area Under Curve (AUC) of Receiver Operating Characteristic (ROC) curves, as well as the percentages of TPs, FPs and FNs
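The evaluation measures named above can be sketched in a few lines. The example below computes the AUC of the ROC curve through its rank interpretation (the probability that a positive entry receives a higher score than a negative one) and the percentages of TPs, FPs and FNs at a fixed threshold. The function names, the 0.5 threshold, and the choice to express percentages over all entries are assumptions, not details taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_percentages(labels, scores, threshold=0.5):
    """Percentages of TPs, FPs and FNs over all entries (assumed denominator)."""
    tp = fp = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 1:
            fn += 1
    total = len(labels)
    return 100.0 * tp / total, 100.0 * fp / total, 100.0 * fn / total


# Illustrative data: 1 marks an entry of the column's type, 0 a non-type entry.
y_true = [1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]
auc = roc_auc(y_true, y_score)
tp_pct, fp_pct, fn_pct = confusion_percentages(y_true, y_score)
```

The pairwise form is quadratic in the number of entries, which is fine for a sketch; a sort-based implementation would be preferable for large columns.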

Summary

Introduction

Data analytics can be defined as the process of generating useful insights from raw data sets. Central to the entire process is the concept of data wrangling, which refers to the task of understanding, interpreting, and preparing a raw data set and turning it into a usable format. This task can be frustrating and time-consuming for large data sets, and possibly even for some small ones (Dasu and Johnson 2003). Raw data often do not contain any well-documented prior information such as meta-data

Journal: Data Mining and Knowledge Discovery
Publication Date: Mar 16, 2020
Citations: 5
License type: open-access
