Abstract

Multidimensional projection techniques can be employed to project datasets from a higher-dimensional to a lower-dimensional space (e.g., 2D space). These techniques present the relationships among dataset instances based on distance, grouping similar instances into clusters and separating dissimilar ones in the projected space. Several works have used multidimensional projections to aid in the exploration of document collections. Even though projection techniques can organize a dataset, the user still needs to read each document to understand why the clusters were formed. Alternatively, techniques such as topic extraction or tag clouds can be employed to present a summary of the document contents. To minimize the exploratory work and to aid in cluster analysis, this work proposes a new hybrid visualization that shows both document relationships and content in a single view, employing multidimensional projections to relate documents and tag clouds. We show the effectiveness of the proposed approach in the exploration of two document collections composed of world news.
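To make the idea of a distance-based projection concrete, the following is a minimal sketch, not the technique used in the paper. It assumes bag-of-words vectors, cosine distance, and a simple force scheme standing in for whichever projection technique is actually employed. The function names (`bag_of_words`, `cosine_distance`, `project_2d`) and the toy documents are hypothetical.

```python
import math
import random
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def project_2d(dist, iterations=200, step=0.1, seed=0):
    """Place points in 2D so pairwise distances approximate `dist`.

    Assumed stand-in for a real projection technique: a simple force
    scheme that nudges each point toward or away from every other point
    in proportion to the gap between target and current 2D distance.
    """
    rng = random.Random(seed)
    n = len(dist)
    pos = [[rng.random(), rng.random()] for _ in range(n)]
    for _ in range(iterations):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pos[j][0] - pos[i][0]
                dy = pos[j][1] - pos[i][1]
                current = math.hypot(dx, dy) or 1e-9
                delta = (dist[i][j] - current) * step
                # Move j along the i->j direction to reduce the error.
                pos[j][0] += dx / current * delta
                pos[j][1] += dy / current * delta
    return pos


# Toy collection (hypothetical): two politics and two sports documents.
docs = [
    "election vote parliament government",
    "government election minister vote",
    "football match goal league",
    "league football goal striker",
]
vectors = [bag_of_words(d) for d in docs]
dist = [[cosine_distance(a, b) for b in vectors] for a in vectors]
points = project_2d(dist)
```

In this sketch, documents that share vocabulary should end up closer together in `points` than documents that do not, which is the grouping behaviour the abstract describes.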

Highlights

  • Large amounts of textual data are now produced from distinct sources, which makes organizing and exploring this data very difficult

  • To improve the exploration and analysis based on multidimensional projection techniques, Silva and Eler [4] proposed a hybrid visualization approach to map the instances’ similarities in

  • This paper presents an extension of the previous approach: a hybrid visualization that maps the document similarities in 2D space and shows a tag cloud for each document, presenting the key terms of the textual data
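As an illustration of how key terms for a per-document tag cloud might be chosen, here is a minimal sketch that ranks terms by TF-IDF weight. The weighting scheme is an assumption, since this summary does not state how the paper selects terms. The function `key_terms` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def key_terms(docs, top_k=3):
    """Return the top_k (term, weight) pairs per document, ranked by TF-IDF.

    Assumes every document contains at least one token.
    """
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    result = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by descending weight, then alphabetically for stable output.
        ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
        result.append(ranked[:top_k])
    return result


# Toy collection (hypothetical).
docs = [
    "election vote election parliament",
    "football goal football league",
    "weather news today",
]
clouds = key_terms(docs, top_k=2)
```

Each weight in `clouds` could then be mapped to a font size when the tag cloud is drawn next to the document's position in the 2D projection.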

Summary

Introduction

Large amounts of textual data are produced from distinct sources, and organizing and exploring this data is very difficult. To improve the exploration and analysis based on multidimensional projection techniques, Silva and Eler [4] proposed a hybrid visualization approach to map the instances’ similarities in. The main contribution of this paper is to aid the exploration of textual datasets based on the proposed hybrid visualization approach, which maps the similarities and text content in a single visualization. This approach uses all dataset attributes to generate the visual representation and

Information 2018, 9, 129; doi:10.3390/info9060129; www.mdpi.com/journal/information

Background
Proposed Approach
Applications
Conclusions and Future Works