Supporting Uncertain Predicates in DBMS Using Approximate String Matching and Probabilistic Databases

Amol S. Jumde, Ravindra B. Keskar

doi:10.1109/access.2020.3021945

Open Access

https://doi.org/10.1109/access.2020.3021945

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 2
License type: CC BY 4.0

Affiliation: Visvesvaraya National Institute of Technology

Abstract

Current relational database systems are deterministic in nature and lack support for approximate matching. The result of approximate matching would be tuples annotated with a percentage of similarity, but existing relational database systems cannot process these similarity scores further. In this paper, we propose a system to support approximate matching in the DBMS field. We introduce ‘≈’ (the uncertain predicate operator) for approximate matching and devise a novel formula to calculate the similarity scores. Instead of returning an empty answer set when nothing matches, our system gives ranked results, thereby providing a glance at the existing tuples that most closely match the queried literals. Two variants of the ‘≈’ operator are also introduced for numeric data: ‘≈+’ for higher-the-better cases and ‘≈-’ for lower-the-better cases. Efficient approximate string matching methods are proposed for string-type data, whereas numeric closeness is used for other types of data (date, time, and number). We also provide results of our system over several sample queries that illustrate its significance. All experiments are performed using the MySQL database, with the IMDb movie database and the European Football database as sample datasets.
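To make the idea of ranked approximate results concrete, the following is a minimal Python sketch. It is not the similarity formula proposed in the paper, which this abstract does not reproduce. As a stand-in it uses the ratio from Python's standard `difflib.SequenceMatcher`, and the names `similarity` and `approx_match` are hypothetical helpers introduced only for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Stand-in similarity score in [0, 1] (assumption: not the paper's formula)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def approx_match(values, literal, top_k=3):
    """Rank stored values by similarity to the queried literal.

    Returns up to top_k (value, score) pairs, best match first, so a
    misspelled literal still yields candidates instead of an empty result.
    """
    scored = [(value, similarity(value, literal)) for value in values]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    titles = ["The Godfather", "The Godfather Part II", "Goodfellas"]
    # The literal is misspelled on purpose; an exact match would return nothing.
    for title, score in approx_match(titles, "The Godfater"):
        print(f"{score:.2f}  {title}")
```

Each returned tuple carries its score, which is the kind of annotation a deterministic DBMS has no built-in way to process further.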

Highlights

In traditional databases, select, from, and where are the fundamental clauses of any SQL query
We present the queries to calculate the above-mentioned distance by utilizing the in-built features provided by the DBMS (Section IV)
Probability Calculation Module combines the probabilities obtained from the uncertain predicates to calculate the final probability of filtered tuples (Fig. 4(f)) and we get a probabilistic database as an output (Fig. 4(g))

Summary

Introduction

Select, from, and where are the fundamental clauses of any SQL query. A general SQL query takes the relations specified in the from clause as input, removes the tuples that do not satisfy the predicates in the where clause, and selects the attributes specified in the select clause. The query returns a filtered relation as output.

A. PATTERN MATCHING IN DETERMINISTIC DATABASE

Pattern matching in a deterministic database is performed using the like operator. Patterns are described using two special characters, ‘%’ and ‘_’. The ‘%’ character matches any string, i.e., any number of characters, and ‘_’ matches a single character. For example, a like query with the pattern ‘%Computer%’ returns the titles of books that contain ‘Computer’ as a substring.
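The behaviour of the like operator described above can be tried out with a short, self-contained sketch. The paper's experiments use MySQL; this example instead assumes Python's built-in `sqlite3` module with an in-memory database so that it runs without a server. The `book` table, its sample titles, and the helpers `make_demo_db` and `titles_like` are hypothetical and serve only as illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a small, hypothetical book table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (title TEXT)")
    conn.executemany(
        "INSERT INTO book (title) VALUES (?)",
        [
            ("Computer Networks",),
            ("Modern Computer Architecture",),
            ("Database Systems",),
            ("Compiler Design",),
        ],
    )
    return conn


def titles_like(conn: sqlite3.Connection, pattern: str) -> list:
    """Return the titles matching a SQL LIKE pattern, sorted by title."""
    rows = conn.execute(
        "SELECT title FROM book WHERE title LIKE ? ORDER BY title",
        (pattern,),
    ).fetchall()
    return [title for (title,) in rows]


if __name__ == "__main__":
    conn = make_demo_db()
    # '%' matches any number of characters on either side of 'Computer'.
    print(titles_like(conn, "%Computer%"))
    # '_' matches exactly one character (here the 'i' in 'Compiler').
    print(titles_like(conn, "Comp_ler%"))
    # A misspelled literal matches nothing: the deterministic answer is empty.
    print(titles_like(conn, "%Computor%"))
```

The last call shows the limitation that motivates approximate matching: a single wrong character in the literal yields an empty answer set rather than the closest titles.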


