InferDB: In-Database Machine Learning Inference Using Indexes

Ricardo Salazar-Díaz, Tilmann Rabl, Boris Glavic

doi:10.14778/3659437.3659441

Abstract

The performance of inference with machine learning (ML) models and its integration with analytical query processing have become critical bottlenecks for data analysis in many organizations. An ML inference pipeline typically consists of a preprocessing workflow followed by prediction with an ML model. Current approaches for in-database inference implement preprocessing operators and ML algorithms in the database either natively, by transpiling code to SQL, or by executing user-defined functions in guest languages such as Python. In this work, we present a radically different approach that approximates an end-to-end inference pipeline (preprocessing plus prediction) using a light-weight embedding that discretizes a carefully selected subset of the input features and an index that maps data points in the embedding space to aggregated predictions of an ML model. We replace a complex preprocessing workflow and model-based inference with a simple feature transformation and an index lookup. Our framework improves inference latency by several orders of magnitude while maintaining similar prediction accuracy compared to the pipeline it approximates.
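The idea of replacing a pipeline with a feature discretization plus an index lookup can be illustrated with a small sketch. This is not the authors' implementation: the feature names, the bin edges, the use of the mean as the aggregate, the fallback value for unseen cells, and the `pipeline` function standing in for an expensive preprocessing-plus-model workflow are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean


def embed(row, selected, edges):
    """Map a raw row to a tuple of bin ids over the selected features only."""
    return tuple(bisect_right(edges[f], row[f]) for f in selected)


def build_index(rows, pipeline, selected, edges):
    """Run the full pipeline once per row and aggregate predictions per cell."""
    cells = defaultdict(list)
    for row in rows:
        cells[embed(row, selected, edges)].append(pipeline(row))
    # Assumption: the aggregate is the mean prediction of each cell.
    return {key: mean(preds) for key, preds in cells.items()}


def lookup(index, row, selected, edges, default):
    """Inference is one discretization and one dictionary lookup."""
    return index.get(embed(row, selected, edges), default)


def pipeline(row):
    """Hypothetical stand-in for a costly preprocessing + model pipeline."""
    return 1.0 if 2.0 * row["age"] + row["income"] / 1000.0 > 120.0 else 0.0


# Hypothetical feature subset and bin edges; "noise" is deliberately ignored.
selected = ["age", "income"]
edges = {"age": [30, 50], "income": [40000, 80000]}

training_rows = [
    {"age": 25, "income": 30000, "noise": 7},
    {"age": 28, "income": 35000, "noise": 1},
    {"age": 45, "income": 60000, "noise": 3},
    {"age": 60, "income": 90000, "noise": 9},
]

index = build_index(training_rows, pipeline, selected, edges)
new_row = {"age": 27, "income": 32000, "noise": 4}
prediction = lookup(index, new_row, selected, edges, default=0.5)
```

In a database setting, the dictionary would presumably correspond to an indexed table keyed on the bin ids, so that prediction becomes an ordinary index lookup inside a query. How the feature subset and bin boundaries are chosen is the part the abstract describes as "carefully selected" and is not modeled here.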

Journal: Proceedings of the VLDB Endowment
Publication Date: Apr 1, 2024
Citations: 2

