Incremental algorithms are the heart and soul of stream processing. Low-latency results depend on the ability to react to the subset of changes in a dataset over time rather than reprocessing the entirety of a dataset as it evolves. But while the SQL language is well suited for representing streams of changes (via tables) and their application to tables over time (via DML), it entirely lacks a method to query the changes to a table or view in the first place. In this paper, we present CHANGES queries and STREAM objects, Snowflake's primitives for querying and consuming incremental changes to table objects over time. CHANGES queries and STREAMs have been in use within Snowflake for three years and see broad adoption across our customers. We describe the semantics of these primitives, discuss the implementation challenges, present an analysis of their usage at Snowflake, and contrast them with other offerings.