Abstract
With the Internet of Multimedia Things (IoMT) becoming a reality, new approaches are needed to process real-time multimodal event streams. Existing approaches to event processing give limited consideration to the challenges of multimodal events, including the need for complex content extraction and increased computational and memory costs. This article explores event processing as a basis for processing real-time IoMT data. It introduces the multimodal event processing (MEP) paradigm, which provides a formal basis for combining native approaches to neural multimodal content analysis (i.e., computer vision, linguistics, and audio) with symbolic event processing rules, supporting real-time queries over multimodal data streams. The multimodal event processing language expresses single, primitive multimodal, and complex multimodal event patterns, while multimodal event knowledge graphs capture the semantic, spatial, and temporal content of the multimodal streams. The approach is implemented and evaluated within a MEP engine using single and multimodal queries, achieving near real-time performance with a throughput of ~30 frames processed per second (fps) and subsecond latency of 0.075–0.30 s for video streams with a 30 fps input rate. Support for higher input stream rates (45 fps) is achieved through content-aware load-shedding techniques, yielding a ~127X latency improvement with only a minor decrease in accuracy.
Highlights
With the rise of the Internet of Multimedia Things and Smart Environments, there has been a significant shift in the nature of data streams.
The Internet of Multimedia Things (IoMT) has recently been coined to represent multimedia communications using the Internet of Things (IoT).
In Multimodal Event Processing (MEP), we have extended the event processing concept of the event to define the multimodal event.
Summary
With the rise of the Internet of Multimedia Things and Smart Environments, there has been a significant shift in the nature of data streams. The paper explores the use of the event processing paradigm for real-time IoMT data and introduces the Multimodal Event Processing (MEP) paradigm to meet the critical challenges of multimodal streams. The work builds on previous single-modal, image-only event processing [14] to detect patterns over multimodal data streams such as video and audio. The structure of the paper is as follows: Section II details the motivation for new forms of processing multimodal data within intelligent environments and identifies the key challenges.
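To make the idea of a complex multimodal event pattern concrete, the sketch below shows a minimal windowed matcher that pairs detections from two modalities (e.g., a "dog" seen in video and a "bark" heard in audio) occurring within a few seconds of each other. This is an illustrative assumption, not the paper's actual MEP language or engine: the event dictionary layout, the `match_multimodal` function, and the labels are all hypothetical.

```python
from collections import deque

def match_multimodal(events, video_label, audio_label, window_s=2.0):
    """Return (ts, ts) pairs where a video detection of `video_label` and an
    audio detection of `audio_label` occur within `window_s` seconds."""
    matches = []
    recent = deque()  # events still inside the sliding time window
    for ev in sorted(events, key=lambda e: e["ts"]):
        # Evict events that have fallen out of the window.
        while recent and ev["ts"] - recent[0]["ts"] > window_s:
            recent.popleft()
        # Pair the incoming event with buffered events of the other modality.
        for prev in recent:
            cross_modal = {prev["modality"], ev["modality"]} == {"video", "audio"}
            labels_ok = {prev["label"], ev["label"]} == {video_label, audio_label}
            if cross_modal and labels_ok:
                matches.append(tuple(sorted((prev["ts"], ev["ts"]))))
        recent.append(ev)
    return matches

stream = [
    {"modality": "video", "label": "dog",  "ts": 0.5},
    {"modality": "audio", "label": "bark", "ts": 1.2},  # within 2 s of the dog
    {"modality": "video", "label": "car",  "ts": 3.0},
    {"modality": "audio", "label": "bark", "ts": 9.0},  # no nearby dog frame
]
print(match_multimodal(stream, "dog", "bark"))  # [(0.5, 1.2)]
```

In a real MEP engine the neural detectors would emit these primitive events continuously, and the symbolic rule layer would evaluate such cross-modal joins over the stream rather than over an in-memory list.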