Abstract

Analysts and journalists must deal with very large, heterogeneous, and multilingual data volumes that need to be analyzed, understood, and aggregated. An automated and simplified editorial and authoring process could significantly reduce time, labor, and costs. There is therefore a need for unified access to multilingual and multicultural news story material, beyond the level of a single nation, that ensures context-aware, spatiotemporal, and semantic interpretation and that correlates and summarizes the interpreted material into a coherent gist. In this paper, we present a platform that integrates multimodal analytics techniques to support journalists in handling large streams of real-time and diverse information. Specifically, the platform automatically crawls and indexes multilingual and multimedia information from heterogeneous resources. Textual information is automatically summarized and can be translated on demand into the language of the journalist. High-level information is extracted from both textual and multimedia content for fast inspection using concept clouds. The textual and multimedia content is semantically integrated and indexed using a common representation, so that it is accessible through a web-based search engine. The platform was evaluated by several groups of journalists, whose feedback indicated user satisfaction.

Highlights

  • The rapid expansion of information technologies and the low cost of recording media have made available large amounts of multilingual and multimedia content

  • We propose a unique platform for journalists that integrates multimodal analytics techniques: crawling and indexing of multilingual and multimedia information from heterogeneous resources; text summarization and translation into the language of the journalist; named entity and concept extraction from textual and multimedia content for fast inspection using word clouds; and semantic analysis of the textual content

  • When a user clicks on an entity, it is added to the search query, so that entities can be used to refine the current search
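
The entity-click behaviour in the last highlight can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the `SearchQuery` class, its fields, and its methods are hypothetical names, and combining clicked entities with the free-text terms using `AND` is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SearchQuery:
    """Hypothetical query state: free-text terms plus clicked entities."""
    text: str = ""
    entities: List[str] = field(default_factory=list)

    def add_entity(self, entity: str) -> None:
        """Add a clicked entity once, ignoring case-insensitive duplicates."""
        normalized = entity.strip()
        known = {e.lower() for e in self.entities}
        if normalized and normalized.lower() not in known:
            self.entities.append(normalized)

    def to_query_string(self) -> str:
        """Render the query; the AND combination is an assumption."""
        parts = [self.text.strip()] if self.text.strip() else []
        parts += [f'"{e}"' for e in self.entities]
        return " AND ".join(parts)
```

For example, starting from `SearchQuery(text="elections")` and calling `add_entity("Barcelona")` would narrow the rendered query to articles that match both the free text and the clicked entity.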


Summary

INTRODUCTION

The rapid expansion of information technologies and the low cost of recording media have made available large amounts of multilingual and multimedia content. The plethora of sources, combined with language barriers, restricts journalists to limited media resources, leaving the population of each enclosed area in its own “filtered bubble”, without a realistic opportunity to understand the perspective developed in another area or country. To break this isolation, new technologies are required with innovative and effective functionalities that provide integrated access to multilingual and multicultural news articles across nations, ensure their context-aware, spatiotemporal, and semantic interpretation, and correlate and summarize the collected content into a coherent whole. We propose a unique platform for journalists that integrates multimodal analytics techniques: crawling and indexing of multilingual and multimedia information from heterogeneous resources; text summarization and translation into the language of the journalist; named entity and concept extraction from textual and multimedia content for fast inspection using word clouds; and semantic analysis of the textual content. In section Infrastructure, we present a system-centered and a user-centered evaluation of the platform.
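
To make the word-cloud inspection step concrete, the sketch below turns per-document concept lists into relative display weights. It is an illustration only: the function name `cloud_weights`, the document-frequency weighting, and the scaling to the range 0 to 1 are assumptions of this sketch, and the source does not state how the platform weights its clouds.

```python
from collections import Counter
from typing import Dict, Iterable, List


def cloud_weights(docs: Iterable[List[str]], top_n: int = 10) -> Dict[str, float]:
    """Weight each concept by the number of documents it occurs in.

    Weights are scaled so the most frequent concept gets 1.0.
    Document-frequency weighting is an assumption of this sketch.
    """
    counts: Counter = Counter()
    for concepts in docs:
        # Lower-case and de-duplicate so a concept counts once per document.
        counts.update({c.lower() for c in concepts})
    if not counts:
        return {}
    most_common = counts.most_common(top_n)
    max_count = most_common[0][1]
    return {concept: n / max_count for concept, n in most_common}
```

A front end could then map each weight to a font size, so that concepts shared by many retrieved articles stand out at a glance.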

RELATED WORK
EVALUATION
Results
CONCLUSION
