Abstract

The availability of wearable cameras in the consumer market has motivated users to record their daily activities and post them on social media. The exponential growth of egocentric video demands automated techniques that can effectively summarize first-person video data. Egocentric videos are now commonly used to record lifelogs owing to the availability of low-cost wearable cameras. However, they are challenging to process because the placement of the camera produces video with a great deal of variation in object appearance, illumination conditions, and movement. This paper presents an egocentric video summarization framework based on detecting important people in the video. The proposed method generates a compact summary of an egocentric video that contains information about the people with whom the camera wearer interacts, focusing on identifying the interactions of the camera wearer with important people. We use the AlexNet convolutional neural network to filter the key-frames (frames in which the camera wearer interacts closely with people). The network comprises five convolutional layers, two fully connected hidden layers, and an output layer. Dropout regularization is used to reduce overfitting in the fully connected layers. The performance of the proposed method is evaluated on the standard UT Ego dataset. Experimental results demonstrate the effectiveness of the proposed method in summarizing egocentric videos.
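The abstract describes the network only at a high level: five convolutional layers, two fully connected hidden layers, an output layer, and dropout in the fully connected layers. A minimal sketch of such an AlexNet-style key-frame classifier might look like the following (assuming a PyTorch implementation, the standard AlexNet filter sizes for 227x227x3 inputs, and a two-class interaction/no-interaction output head, none of which the abstract specifies):

```python
import torch
import torch.nn as nn

class KeyFrameAlexNet(nn.Module):
    """AlexNet-style classifier: five convolutional layers, two fully
    connected hidden layers with dropout, and an output layer.
    Input: 227x227 RGB frames; output: interaction / no interaction."""

    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4),     # conv1: 227 -> 55
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # 55 -> 27
            nn.Conv2d(96, 256, kernel_size=5, padding=2),   # conv2
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # 27 -> 13
            nn.Conv2d(256, 384, kernel_size=3, padding=1),  # conv3
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 384, kernel_size=3, padding=1),  # conv4
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),  # conv5
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),          # 13 -> 6
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),               # dropout in the FC layers
            nn.Linear(256 * 6 * 6, 4096),    # fully connected hidden layer 1
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),           # fully connected hidden layer 2
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),    # output layer
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        x = torch.flatten(x, 1)
        return self.classifier(x)

model = KeyFrameAlexNet()
frames = torch.randn(4, 3, 227, 227)  # a batch of 227x227 RGB frames
logits = model(frames)                # shape: (4, 2)
```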

Highlights

  • The introduction of wearable cameras in the 1990s by Steve Mann revolutionized the IT industry and has had a deep impact on our daily lives

  • UT Ego [24, 30, 48] is designed to measure the performance of egocentric video summarization approaches

  • The UT Ego dataset comprises four egocentric videos captured in uncontrolled environments


Summary

Introduction

The introduction of wearable cameras by Steve Mann in the 1990s revolutionized the IT industry and has had a deep impact on our daily lives. The generation and transmission of vast amounts of egocentric video content in cyberspace have motivated researchers to propose effective video summarization techniques for wearable camera data. To address the challenges associated with egocentric videos, effective and efficient methods are needed to generate summaries of full-length lifelogging videos. Existing methods have used supervised learning techniques for summarization based on activity detection [13,14,15,16,17], object detection [18], and significant-event detection [19]. Hwang et al. [22] proposed a summarization technique based on identifying important objects and individuals the camera wearer interacted with. We propose an effective egocentric video summarization method based on identifying the interactions of the camera wearer with important people.

Materials and Methods

[Figure: AlexNet architecture with fully connected layers; input 227x227x3]
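A plausible way to use such a classifier for summarization is sketched below: sample frames from the video, score each with the network, and keep the frames predicted as close interactions. This is our assumption of the pipeline, not the paper's published code; `KeyFrameAlexNet` refers to the sketch above, and the video path, sampling rate, and threshold are hypothetical parameters.

```python
import cv2
import torch

def extract_key_frames(video_path: str, model: torch.nn.Module,
                       sample_every: int = 30, threshold: float = 0.5):
    """Score uniformly sampled frames and keep those the network
    predicts as close interaction with people (hypothetical pipeline)."""
    model.eval()
    cap = cv2.VideoCapture(video_path)
    key_frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % sample_every == 0:
            # BGR -> RGB, resize to the 227x227 network input, scale to [0, 1]
            rgb = cv2.cvtColor(cv2.resize(frame, (227, 227)), cv2.COLOR_BGR2RGB)
            x = torch.from_numpy(rgb).permute(2, 0, 1).float().unsqueeze(0) / 255.0
            with torch.no_grad():
                prob = torch.softmax(model(x), dim=1)[0, 1].item()
            if prob >= threshold:      # frame shows a close interaction
                key_frames.append((index, frame))
        index += 1
    cap.release()
    return key_frames

# Hypothetical usage with the model sketched earlier:
summary = extract_key_frames("ut_ego_video.avi", KeyFrameAlexNet())
```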
Results and Discussion

[Table: performance comparison of the proposed method with KNN and ELM]