Exploratory Visual Analysis and Interactive Pattern Extraction from Semi-Structured Data

Axel J Soto, Evangelos Milios, Ryan Kiros, Vlado Kešelj

doi:10.1145/2812115

Abstract

Semi-structured documents are a common type of data containing free text in natural language (unstructured data) as well as additional information about the document, or metadata, typically following a schema or controlled vocabulary (structured data). Simultaneous analysis of unstructured and structured data enables the discovery of hidden relationships that cannot be identified when either source is analyzed on its own. In this work, we present a visual text analytics tool for semi-structured documents (ViTA-SSD) that supports the user in exploring a semi-structured document collection and finding insightful patterns in a visual and interactive manner. It does so by presenting a set of coordinated visualizations that link the metadata with interactively generated clusters of documents so that relevant patterns can be easily spotted. The system's back end contains two novel approaches: a feature-learning method that learns a compact representation of the corpus and a fast clustering approach that has been redesigned to allow user supervision. These contributions make it possible for the user to interact with a large and dynamic document collection and to perform several text analytics tasks more efficiently. Finally, we present two use cases that illustrate the suitability of the system for in-depth interactive exploration of semi-structured document collections, two user studies, and the results of several evaluations of our text-mining components.
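To make the idea of linking metadata with document clusters concrete, the following is a minimal sketch, not the ViTA-SSD implementation. It assumes each document already has a cluster label and a single categorical metadata value, and it reports how over-represented each value is within each cluster. The function names `crosstab` and `lift`, the metadata values, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def crosstab(cluster_ids, metadata_values):
    """Count how often each metadata value occurs within each cluster."""
    table = defaultdict(Counter)
    for cluster, value in zip(cluster_ids, metadata_values):
        table[cluster][value] += 1
    return table


def lift(table):
    """Share of a value inside a cluster divided by its share in the corpus.

    A score above 1 means the value is over-represented in that cluster.
    """
    overall = Counter()
    for counts in table.values():
        overall.update(counts)
    total = sum(overall.values())

    scores = {}
    for cluster, counts in table.items():
        size = sum(counts.values())
        scores[cluster] = {
            value: (n / size) / (overall[value] / total)
            for value, n in counts.items()
        }
    return scores


# Hypothetical toy data: six documents, two clusters, one metadata field.
clusters = [0, 0, 0, 1, 1, 1]
category = ["hardware", "hardware", "software",
            "software", "software", "software"]

table = crosstab(clusters, category)
scores = lift(table)

for cluster in sorted(scores):
    for value, score in sorted(scores[cluster].items()):
        print(f"cluster {cluster}: {value} lift = {score:.2f}")
```

In this toy example, "hardware" makes up two thirds of cluster 0 but only one third of the whole collection, so its score in that cluster is well above 1. A coordinated visualization could surface this kind of contrast for the user to inspect.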

Journal: ACM Transactions on Interactive Intelligent Systems
Publication Date: Sep 8, 2015
Citations: 13
