Abstract
Regular expression search is widely used, including in databases. For example, the LIKE operator was included in the SQL standard about thirty years ago. However, the index types commonly used in DBMSs do little to speed up regular expression search: most such queries require a full scan of all data. One of the most interesting approaches to developing a specialized index is described in [1]. Its authors suggested using a certain subset of variable-length substrings of the input data, called multigrams, as index keys. In this article, we propose changes to the structure of such an index and to the algorithm for constructing it, which achieve two goals. First, the index becomes applicable to a broader class of queries. Second, the proposed changes make it possible to update the index. We also developed and tested an algorithm for updating the index when new records are inserted into the database. This algorithm reduces the index update time by two orders of magnitude compared with a complete rebuild of the index.
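To make the general idea concrete, the following is a minimal sketch of a substring-keyed inverted index, not the algorithm of [1] or the one proposed here. It assumes a simplification: every substring with a length in a fixed range is indexed, whereas a multigram index keeps only a selected subset. The class name `GramIndex`, its parameters, and the caller-supplied `required_literal` argument are all hypothetical.

```python
import re
from collections import defaultdict


class GramIndex:
    """Toy inverted index from substrings (grams) to record ids."""

    def __init__(self, min_len=2, max_len=3):
        self.min_len = min_len
        self.max_len = max_len
        self.postings = defaultdict(set)  # gram -> ids of records containing it
        self.records = []                 # record id -> text

    def _grams(self, text):
        # Assumption: index every substring whose length is in [min_len, max_len].
        for n in range(self.min_len, self.max_len + 1):
            for i in range(len(text) - n + 1):
                yield text[i:i + n]

    def insert(self, text):
        # Incremental update: only the grams of the new record are touched.
        rid = len(self.records)
        self.records.append(text)
        for gram in self._grams(text):
            self.postings[gram].add(rid)
        return rid

    def search(self, pattern, required_literal):
        # required_literal is a string that every match must contain;
        # in this sketch the caller supplies it instead of a regex analyzer.
        candidates = None
        for gram in self._grams(required_literal):
            ids = self.postings.get(gram, set())
            candidates = ids if candidates is None else candidates & ids
        if candidates is None:
            # Literal too short to produce any gram: fall back to a full scan.
            candidates = set(range(len(self.records)))
        rx = re.compile(pattern)
        # The index only narrows the candidates; the regex confirms each one.
        return sorted(rid for rid in candidates if rx.search(self.records[rid]))
```

In this sketch, `insert` adds a record without rebuilding the existing posting lists, and `search` runs the regular expression only on records that contain every gram of the required literal. Choosing which grams to keep, and keeping that choice valid as data changes, is the part a real multigram index has to solve.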