Abstract

Recent years have seen a dramatic growth of natural language text data (e.g., web pages, news articles, scientific literature, emails, enterprise documents, blog articles, forum posts, product reviews, and tweets). Text data contain all kinds of knowledge about the world and human opinions and preferences, thus offering great opportunities for analyzing and mining vast amounts of text data (“big text data”) to support user tasks and optimize decision making in all application domains. However, computers cannot yet accurately understand unrestricted natural language; as such, involving humans in the loop of interactive text mining is essential. In this talk, I will present the vision of TextScope, an interactive software tool to enable users to perform intelligent information retrieval and text analysis in a unified task-support framework. Just as a microscope allows us to see things in the “micro world,” and a telescope allows us to see things far away, the envisioned TextScope would allow us to “see” useful hidden knowledge buried in large amounts of text data that would otherwise be unknown to us. As examples of techniques that can be used to build a TextScope, I will present some general statistical text mining algorithms that we have recently developed for joint analysis of text and non-text data to discover interesting patterns and knowledge. I will conclude the talk with a discussion of the major challenges in developing a TextScope and some important directions for future research in text data mining.

