Abstract

The increasing use of social media networks on handheld devices, especially smartphones with powerful built-in cameras, and the widespread availability of fast, high-bandwidth broadband connections, together with the popularity of cloud storage, are enabling the generation and distribution of massive volumes of digital media, including images and videos. Such media is full of visual information and holds immense value in today’s world. The volume of data involved calls for automated visual content analysis systems that meet practical demands for efficiency and effectiveness. Deep learning (DL) has recently emerged as a prominent technique for visual content analysis. It is data-driven in nature and provides automatic end-to-end learning solutions without relying explicitly on predefined handcrafted feature extractors. Another appealing characteristic of DL solutions is the performance they can achieve, once the network is trained, under practical constraints. This paper identifies eight problem domains that require analysis of visual artifacts in multimedia. It surveys recent, authoritative, and best-performing DL solutions and lists the datasets used in the development of these deep methods for the identified types of visual analysis problems. This paper also discusses the challenges DL solutions face that can compromise their reliability, robustness, and accuracy for visual content analysis.

Highlights

  • In recent years, the availability of handheld devices with high storage capacity and with integrated cameras has caused a boom in the generation of digital media by individuals

  • This paper identified eight problems related to visual content analysis

  • For each class of problem, we reviewed the state-of-the-art and acknowledged the best performing Deep Learning (DL) methods proposed in the literature

Summary

INTRODUCTION

The availability of handheld devices with high storage capacity (complemented by the cloud) and with integrated cameras has caused a boom in the generation of digital media (images and videos) by individuals. Once trained, deep methods can process data in seconds. These advantages of DL make it an attractive option for visual content analysis. There are many high-quality in-depth surveys for specific problems in visual content analysis (e.g., [8]–[10]). They present deep architectures and solutions focusing on a particular visual task.

BACKGROUND
SHORTCOMINGS OF DEEP LEARNING
Findings
CONCLUSION