Abstract
Multidimensional projection techniques can be employed to project datasets from a higher- to a lower-dimensional space (e.g., 2D space). These techniques can be used to present the relationships of dataset instances based on distance by grouping or separating clusters of instances in the projected space. Several works have used multidimensional projections to aid in the exploration of document collections. Even though the projection techniques can organize a dataset, the user still needs to read each document to understand how the clusters were formed. Alternatively, techniques such as topic extraction or tag clouds can be employed to present a summary of the document contents. To minimize the exploratory work and to aid in cluster analysis, this work proposes a new hybrid visualization that shows both document relationships and content in a single view, employing multidimensional projections to relate documents and tag clouds to summarize them. We show the effectiveness of the proposed approach in the exploration of two document collections composed of world news.
Highlights
Nowadays, a large amount of textual data is produced from distinct sources, and organizing and exploring this amount of data is very difficult
To improve the exploration and analysis based on multidimensional projection techniques, Silva and Eler [4] proposed a hybrid visualization approach to map the instances’ similarities in
This paper presents an extension for the previous approach, in which we propose a hybrid visualization approach to map the document similarities in 2D space and to show tag clouds for each document, presenting the key terms of the textual data
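The two ingredients named above, document similarities and per-document key terms, can be illustrated with a minimal sketch. This is not the authors' implementation: the TF-IDF weighting, the cosine distance, the toy documents, and the helper names (`tokenize`, `tfidf_vectors`, `cosine_distance`, `tag_cloud`) are all assumptions made for illustration. The projection step itself is omitted; in a full pipeline the pairwise distances would be handed to a multidimensional projection technique to obtain 2D positions.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase, replace non-letters with spaces, drop tokens of 1-2 characters.
    cleaned = "".join(c if c.isalpha() else " " for c in text.lower())
    return [t for t in cleaned.split() if len(t) > 2]


def tfidf_vectors(docs):
    # Build one sparse TF-IDF vector (dict: term -> weight) per document.
    token_lists = [tokenize(d) for d in docs]
    n = len(docs)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))  # document frequency of each term
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_distance(a, b):
    # 1 - cosine similarity; a projection technique would consume these distances.
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def tag_cloud(vector, k=3):
    # The k highest-weighted terms of one document; weights could drive font size.
    return sorted(vector.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical toy collection (not from the paper's datasets).
docs = [
    "Election results: the president wins the election vote",
    "The president faces a new election campaign",
    "Football team wins the championship match, football fans celebrate",
]
vectors = tfidf_vectors(docs)
d01 = cosine_distance(vectors[0], vectors[1])
d02 = cosine_distance(vectors[0], vectors[2])

if __name__ == "__main__":
    print("distance(doc0, doc1):", round(d01, 3))
    print("distance(doc0, doc2):", round(d02, 3))
    for i, vec in enumerate(vectors):
        print("doc", i, "tags:", [term for term, _ in tag_cloud(vec)])
```

In this sketch the two election-related documents end up closer to each other than to the sports document, and each document's tag list surfaces its most distinctive terms, which is the kind of combined similarity-plus-content reading the hybrid view is meant to support.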
Summary
A large amount of textual data is produced from distinct sources, and organizing and exploring this amount of data is very difficult. To improve the exploration and analysis based on multidimensional projection techniques, Silva and Eler [4] proposed a hybrid visualization approach to map the instances’ similarities. The main contribution of this paper is to aid the exploration of textual datasets based on the proposed hybrid visualization approach, which maps the similarities and text content in a single visualization. This approach uses all dataset attributes to generate the visual representation.
Information 2018, 9, 129; doi:10.3390/info9060129; www.mdpi.com/journal/information