Abstract
Network stream analysis is an essential application of industrial research in the era of big data. However, the input formats of Hadoop, the major platform for massive-data applications, do not adequately support network streams. This paper proposes a feasible optimization design. First, the HDFS block-storage structure and the particular libpcap file format of network streams are considered. Input files are then pre-processed into pieces as large as the HDFS block size, and a new data input format called blockPcapInputFormat is implemented by extending the fileInputFormat of Hadoop. Experiments are then performed to verify the effectiveness of the proposed design. The results show that the optimization scheme not only accelerates the processing of libpcap files effectively but is also suitable for applications in which Hadoop parses network streams.
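The pre-processing step can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes the classic libpcap layout (a 24-byte global header followed by packet records, each with a 16-byte record header), and the function name `split_pcap` and its parameters are hypothetical. Each output piece starts with a copy of the global header and holds only whole packet records, so a piece that fits in one HDFS block can be parsed on its own.

```python
import struct
from typing import List

GLOBAL_HEADER_LEN = 24   # classic libpcap global header size in bytes
RECORD_HEADER_LEN = 16   # ts_sec, ts_usec, incl_len, orig_len (4 bytes each)


def split_pcap(data: bytes, block_size: int) -> List[bytes]:
    """Split a classic libpcap capture into self-contained pieces.

    Hypothetical helper: each piece is at most block_size bytes (unless a
    single record alone exceeds that limit), begins with the original global
    header, and never cuts a packet record in two.
    """
    if len(data) < GLOBAL_HEADER_LEN:
        raise ValueError("input is shorter than a libpcap global header")

    global_header = data[:GLOBAL_HEADER_LEN]
    magic = global_header[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"          # file written in little-endian byte order
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"          # file written in big-endian byte order
    else:
        # Nanosecond-resolution and pcapng files are not handled here.
        raise ValueError("not a classic libpcap file")

    pieces = []
    current = bytearray(global_header)
    offset = GLOBAL_HEADER_LEN

    while offset + RECORD_HEADER_LEN <= len(data):
        # incl_len is the number of captured bytes stored after the header.
        _, _, incl_len, _ = struct.unpack_from(endian + "IIII", data, offset)
        end = offset + RECORD_HEADER_LEN + incl_len
        if end > len(data):
            break             # truncated final record: drop it
        record = data[offset:end]

        # Start a new piece when this record would overflow the block,
        # provided the current piece already holds at least one record.
        if (len(current) + len(record) > block_size
                and len(current) > GLOBAL_HEADER_LEN):
            pieces.append(bytes(current))
            current = bytearray(global_header)

        current += record
        offset = end

    if len(current) > GLOBAL_HEADER_LEN:
        pieces.append(bytes(current))
    return pieces
```

In practice `block_size` would be set to the configured HDFS block size, and each piece would be written out as a separate file before the job runs. Whether the paper pads or aligns pieces differently is not stated in the abstract.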