Simulation Research on Fast Matching of Big Data Based on Spark

Guojian Xu, Mingyang Song, Zhenggang Leng, Zhenhong Jia

doi:10.1109/access.2023.3262989

Abstract

To address the low efficiency of real-time processing and matching of CNAME records in massive DNS log data, this paper proposes an enhanced parallel Aho-Corasick (AC) automaton method based on Spark. The method runs on the Spark distributed cluster computing engine within the Hadoop ecosystem, which provides stable, highly fault-tolerant storage of massive DNS log data and around-the-clock real-time processing. The Spark cluster combines multi-threaded parallel computation with the improved AC automaton algorithm, which both reduces the memory occupied by trie construction and improves the efficiency of matching CNAME records in massive DNS logs. Simulation results show that the proposed method can quickly match CNAME records in massive DNS log data. Compared with the original AC algorithm, efficiency is significantly improved, and both time complexity and storage space are reduced.
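
For readers unfamiliar with the baseline, the following is a minimal sketch of a textbook Aho-Corasick automaton in Python. It is not the improved algorithm from the paper, whose details the abstract does not give. The class name `AhoCorasick`, its method names, and the sample patterns and log line are all hypothetical and chosen only for illustration.

```python
from collections import deque


class AhoCorasick:
    """Textbook Aho-Corasick automaton (NOT the paper's improved variant)."""

    def __init__(self, patterns):
        # Parallel lists indexed by node id: edges, failure link, matched patterns.
        self.goto = [{}]
        self.fail = [0]
        self.out = [[]]
        for pattern in patterns:
            self._insert(pattern)
        self._build_failure_links()

    def _insert(self, pattern):
        # Walk the trie from the root, creating nodes as needed.
        node = 0
        for ch in pattern:
            if ch not in self.goto[node]:
                self.goto.append({})
                self.fail.append(0)
                self.out.append([])
                self.goto[node][ch] = len(self.goto) - 1
            node = self.goto[node][ch]
        self.out[node].append(pattern)

    def _build_failure_links(self):
        # Breadth-first order guarantees a parent's link is final before its children's.
        queue = deque(self.goto[0].values())
        while queue:
            node = queue.popleft()
            for ch, child in self.goto[node].items():
                queue.append(child)
                f = self.fail[node]
                while f and ch not in self.goto[f]:
                    f = self.fail[f]
                self.fail[child] = self.goto[f].get(ch, 0)
                # Inherit patterns that end at the failure target.
                self.out[child] = self.out[child] + self.out[self.fail[child]]

    def search(self, text):
        """Return sorted (start_index, pattern) pairs for every match in text."""
        node, hits = 0, []
        for i, ch in enumerate(text):
            while node and ch not in self.goto[node]:
                node = self.fail[node]
            node = self.goto[node].get(ch, 0)
            for pattern in self.out[node]:
                hits.append((i - len(pattern) + 1, pattern))
        return sorted(hits)


# Hypothetical CNAME targets and a made-up log line, for illustration only.
matcher = AhoCorasick(["cdn.example.net", "example.net"])
print(matcher.search("www.shop.com CNAME cdn.example.net"))
```

The automaton scans each log line once regardless of how many patterns it holds, which is why it suits bulk matching. In a Spark job, one plausible arrangement (an assumption, not something the abstract states) would be to build the automaton once on the driver, share it with executors as a broadcast variable, and call `search` on each record within a partition.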

Journal: IEEE Access
Publication Date: Jan 1, 2023
License type: CC BY-NC-ND 4.0
