Abstract

The high-dimensional nature of the textual data complicates the design of visualization tools to support exploration of large document corpora. In this article, we first argue that the Parallel Coordinates (PC) technique, which can map multidimensional vectors onto a 2D space in such a way that elements with similar values are represented as similar poly-lines or curves in the visualization space, can be used to help users discern patterns in document collections. The inherent reduction in dimensionality during the mapping from multidimensional points to 2D lines, however, may result in visual complications. For instance, the lines that correspond to clusters of objects that are separate in the multidimensional space may overlap each other in the 2D space; the resulting increase in the number of crossings would make it hard to distinguish the individual document clusters. Such crossings of lines and overly dense regions are significant sources of visual clutter, thus avoiding them may help interpret the visualization. In this article, we note that visual clutter can be significantly reduced by adjusting the resolution of the individual term coordinates by clustering the corresponding values. Such reductions in the resolution of the individual term-coordinates, however, will lead to a certain degree of information loss and thus the appropriate resolution for the term-coordinates has to be selected carefully. Thus, in this article we propose a controlled clutter reduction approach, called Parallel hierarchical Coordinates (or PhC ), for reducing the visual clutter in PC-based visualizations of text corpora. We define visual clutter and information loss measures and provide extensive evaluations that show that the proposed PhC provides significant visual gains (i.e., multiple orders of reductions in visual clutter) with small information loss during visualization and exploration of document collections.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.