The great advance and affordability of technologies, communications and sensor technology has led to the generation of large amounts of data in the field of the Internet of Things and smart environments, as well as a great demand for smart applications and services adapted to the specific needs of each individual. This has entailed the need for systems capable of receiving, routing and processing large amounts of data to detect situations of interest with low latency, but despite the many existing works in recent years, studying highly scalable and low latency data processing systems is still necessary. In this area, the efficiency of complex event processing (CEP) technology is of particular significance and has been used in a variety of application scenarios. However, in most of these scenarios there is no performance evaluation to show how the system performs under various loads and therefore the developer is challenged to develop such CEP-based systems in new scenarios without knowing how the system will be able to handle different input data rates and address scalability and fault tolerance. This article aims to fill this gap by providing an evaluation of the various versions of one of the most reputable CEP engines-Esper CEP, as well as its integration with two renowned messaging brokers for data ingestion-RabbitMQ and Apache Kafka. For this purpose, we defined a benchmark with a series of event patterns with some of the most representative operators of the Esper CEP engine and we performed a series of tests with an increasing rate of input data to the system. We did this for three alternative software architectures: integrating open-source Esper and RabbitMQ, integrating one instance of Esper enterprise edition with Apache Kafka, and integrating two distributed instances of Esper enterprise edition with Apache Kafka. We measured the usage of CPU, RAM memory, latency and throughput time, looking for the data input rate with which the system overloads for each event pattern and we compared the results of the three proposed architectures. The results have shown a very low CPU consumption for all implementation options and input data rates; a balanced memory usage, quite similar among the three architectures, up to an input rate of 10,000 or 15,000 events per second, depending on the architecture and event pattern, and a quite efficient response time up to 10,000 or 15,000 events per second, depending on the architecture and event pattern. Based on a more exhaustive analysis of results, we have concluded that the different options offered by Esper for CEP provide very efficient solutions for real-time data processing, although each with its limitations in terms of brokers to be used for data integration, scalability, and fault tolerance; a number of suggestions have been drawn out for the developer to take as a basis for choosing which CEP engine and which messaging broker to use for the implementation depending on the of the system in question.
Read full abstract